The term “Renaissance skepticism” refers to a diverse range of approaches to the problem of knowledge inspired by the revitalization of ancient Greek skepticism in fifteenth- and sixteenth-century Europe. Much like its ancient counterpart, Renaissance skepticism refers to a wide array of epistemological positions rather than a single doctrine or unified school of thought. These various positions are unified to the extent that they share an emphasis on the epistemic limitations of human beings and offer the suspension of judgment as a response to those limits.
The defining feature of Renaissance skepticism (as opposed to its ancient counterpart) is that many of its representative figures deployed skeptical strategies in response to religious questions, especially dilemmas concerning the criterion of religious truth. Whereas some Renaissance thinkers viewed skepticism as a threat to religious orthodoxy, others viewed skepticism as a powerful strategy to be adopted in Christian apologetics.
Philosophers who are typically associated with Renaissance skepticism include Gianfrancesco Pico della Mirandola, Michel de Montaigne, Pierre Charron, and Francisco Sanches. Beyond philosophy, the revitalization of skepticism in Renaissance Europe can also be seen in the writings of religious thinkers such as Martin Luther, Sebastian Castellio, and Desiderius Erasmus; pedagogical reformers such as Omer Talon and Petrus Ramus; and philologists such as Henri Estienne and Gentian Hervet. This article provides an overview of the revitalization of skepticism in Renaissance philosophy through the principal figures and themes associated with this movement.
1. Ancient Greek Skepticism
Ancient Greek skepticism is traditionally divided into two distinct strains: “Academic skepticism” and “Pyrrhonian skepticism.” Both types of skepticism had a considerable influence on Renaissance philosophy, albeit at different times and places. The term “Academic skepticism” refers to the various positions adopted by different members of Plato’s Academy in its “middle” and “late” periods. Figures such as Arcesilaus (c. 318-243 B.C.E.), Carneades (c. 213-129 B.C.E.), Clitomachus (187-110 B.C.E.), Antiochus (c. 130-c. 68 B.C.E.), Philo of Larissa (c. 159/8-c. 84/3 B.C.E.), and Cicero (106-43 B.C.E.) are associated with Academic skepticism. The term “Pyrrhonian skepticism” refers to an approach adopted by a later group of philosophers who sought to revive a more radical form of skepticism that they associated with Pyrrho (c. 365-270 B.C.E.). Figures associated with Pyrrhonian skepticism include Aenesidemus in the first century B.C.E. and Sextus Empiricus in the second century C.E.
Both strains of ancient skepticism share an emphasis on the epistemic limitations of human beings and recommend the suspension of assent in the absence of knowledge. Both advance their arguments in response to dogmatism, although they differ in their specific opponents. The Academic skeptics direct their arguments primarily against Stoic epistemology, particularly the theory of cognitive impressions. The Pyrrhonian skeptics, in contrast, direct their arguments against Academic skepticism as well as against other ancient schools of thought.
One key distinction between the two strains of ancient skepticism can be found in their differing stances on the nature and scope of the suspension of assent. Arcesilaus, for example, maintains the radical view that nothing can be known with certainty. In response to the absence of knowledge, he recommends the suspension of assent. In response to the Stoic objection that the suspension of judgment impedes all rational and moral activity, Arcesilaus offers a practical criterion, “the reasonable” (to eulogon), as a guide to conduct in the absence of knowledge. Carneades, a later head of the Academy, presents yet another kind of practical criterion, “the persuasive” (to pithanon), as a guide for life in the absence of knowledge. In response to the inactivity charge of the Stoics, he maintains that in the absence of knowledge, we can still be guided by convincing or plausible impressions.
Philo offers a “mitigated” interpretation of Academic skepticism. His mitigated skepticism consists in the view that an inquirer can offer tentative approval to “convincing” or “plausible” impressions that survive skeptical scrutiny. Cicero discusses Philo’s mitigated interpretation of Academic skepticism in his Academica, translating Carneades’ practical criterion as “probabile” and “veri simile.” Cicero gives this practical criterion a “constructive” interpretation. In other words, he proposes that probability and verisimilitude can bring the inquirer to ever closer approximations of the truth. Admittedly, the question of whether Academic skeptics such as Cicero, Carneades, and Arcesilaus are “fallibilists” who put forth minimally positive views, or “dialectical” skeptics who only advance their arguments to draw out the unacceptable positions of their opponents, is a subject of considerable scholarly debate. For further discussion of this issue, see Ancient Greek Skepticism.
The Pyrrhonian skeptics introduce a more radical approach to the suspension of assent in the absence of knowledge. They offer this approach in response to what they perceive to be dogmatic in the Academic position. Aenesidemus, for example, interprets the Academic view that “nothing can be known” as a form of “negative dogmatism.” That is, he views this position as a positive and therefore insufficiently skeptical claim about the impossibility of knowledge. As an alternative, Aenesidemus endeavors to “determine nothing.” In other words, he seeks to neither assert nor deny anything unconditionally. Sextus Empiricus, another representative figure of Pyrrhonian skepticism, offers a further alternative to the allegedly incomplete skepticism of the Academics. In the absence of an adequate criterion of knowledge, Sextus practices the suspension of assent (epoché). Although the Academics also practice the suspension of assent, Sextus extends its scope. He recommends the suspension of assent not only regarding positive knowledge claims, but also regarding the skeptical thesis that nothing can be known.
2. The Transmission of Ancient Skepticism into the Renaissance
From the fourteenth to the mid-sixteenth century, the writings of Augustine, Cicero, Diogenes Laertius, Galen, and Plutarch served as Europe’s primary sources on ancient skepticism. The writings of Sextus Empiricus were not widely available until 1562, when they were published in Latin by Henri Estienne. Due to the limited availability of Sextus Empiricus in the first half of the sixteenth century, philosophical discussions of skepticism were largely confined to the Academic skeptical tradition, with very few exceptions. It was only in the latter half of the sixteenth century that Renaissance discussions of skepticism began to center on Pyrrhonism. For this reason, this article divides Renaissance skepticism into two distinct periods: pre-1562 and post-1562.
Throughout the Renaissance, the distinction between Academic and Pyrrhonian skepticism was neither clearly nor consistently delineated. Before the publication of Sextus Empiricus in the 1560s, many authors were unaware of Pyrrhonian skepticism, often treating “skepticism” and “Academic skepticism” as synonyms. Those who were aware of the difference did not always consistently distinguish between the two strains. Following the publication of Sextus Empiricus, many thinkers began to use the terms “Pyrrhonism” and “skepticism” interchangeably. For some, this was in apparent acceptance of Sextus’ view that the Academic skeptics are negative dogmatists rather than genuine skeptics. For others, this was due to a more syncretic interpretation of the skeptical tradition, according to which there is common ground between the various strains.
3. Popkin’s Narrative of the History of Renaissance Skepticism
Scholarly debate surrounding the revitalization of ancient skepticism in the Renaissance has been largely shaped by Richard Popkin’s History of Scepticism, first published in 1960 and expanded and revised in 1979 and 2003. This section presents Popkin’s influential account of the history of skepticism, addressing both its merits and limitations.
The central thesis of Popkin’s History of Scepticism is that the revitalization of Pyrrhonian skepticism in Renaissance Europe instigated a crisis of doubt concerning the human capacity for knowledge. According to Popkin, this skeptical crisis had a significant impact on the development of early modern philosophy. On Popkin’s account, the battles over theological authority in the wake of the Protestant Reformation set the initial scene for this skeptical crisis of doubt. This crisis of uncertainty was brought into full force following the popularization of Pyrrhonian skepticism among figures such as Michel de Montaigne.
While influential, Popkin’s narrative of the history of skepticism in early modernity has drawn criticism from many angles. One common charge is that Popkin exaggerated the impact of Pyrrhonian skepticism at the expense of Academic skepticism and other sources and testimonia such as Augustine, Plutarch, Plato, and Galen. Another common criticism is that he overstated the extent to which skepticism was forgotten throughout late Antiquity and the Middle Ages and only recovered in the Renaissance. This section provides an overview of these two main criticisms.
Charles Schmitt’s 1972 study of the reception of Cicero’s Academica in Renaissance Europe demonstrates that the impact of Academic skepticism on Renaissance thought was considerable. Schmitt argues that although the Academica was one of Cicero’s more obscure works throughout the Latin Middle Ages, it witnessed increased visibility and popularity throughout the fifteenth and sixteenth centuries. By the sixteenth century, Cicero’s Academica had become the topic of numerous commentaries, such as those by Johannes Rosa (1571) and Pedro de Valencia (1596) (for an analysis of these commentaries, see Schmitt 1972). The Academica also became an object of critique among scholars such as Giulio Castellani (Schmitt 1972). Although Schmitt ultimately concedes that the impact of Academic skepticism on Renaissance thought was small in comparison to that of Pyrrhonism, he maintains that it was not as marginal as Popkin had initially suggested (Schmitt 1972; 1983). Over the past few decades, scholars such as José Raimundo Maia Neto have studied the impact of Academic skepticism further, arguing that its influence on early modern philosophy was substantial (Maia Neto 1997; 2013; 2017; see also Smith and Charles eds. 2017 for further discussion of the impact of Academic skepticism on early modern philosophy).
Popkin’s “rediscovery” narrative has also been challenged, specifically the idea that Pyrrhonism was largely forgotten throughout Late Antiquity and the Middle Ages only to be rediscovered in the Renaissance. One notable example is Luciano Floridi’s study of the transmission of Sextus Empiricus, which documents the availability of manuscripts throughout Late Antiquity and the Middle Ages. Floridi shows that although Sextus was admittedly obscure during these periods, he was not quite as unknown as Popkin had initially supposed (Floridi 2002).
Increased scholarly attention to medieval discussions of skepticism has shown further limitations of Popkin’s rediscovery narrative (Perler 2006; Lagerlund ed. 2010; Lagerlund 2020). Although neither strain of ancient Greek skepticism was particularly influential in the Latin Middle Ages, discussions of skeptical challenges and rehearsals of skeptical arguments occurred in entirely new contexts, such as debates regarding God’s power, the contingency of creation, and the limits of human knowledge in relation to the divine (Funkenstein 1987). Most medieval discussions of skepticism were critical, such as that of Henry of Ghent, who drew on Augustine’s Contra Academicos in his attack on skeptical epistemology; some, however, were sympathetic, such as that of John of Salisbury, who discussed the New Academy in a favorable light and adopted elements of Cicero’s and Philo of Larissa’s probabilism in his own epistemology (Schmitt 1972; see also Grellard 2013 for a discussion of John of Salisbury’s probabilism). The following section provides an overview of medieval treatments of skepticism.
4. Medieval Skepticism and Anti-Skepticism
Philosophers typically associated with medieval skepticism and anti-skepticism include John of Salisbury, Henry of Ghent, John Duns Scotus, Nicholas of Autrecourt, and John Buridan, among others. Although not all of these thinkers engaged directly with ancient Greek skepticism, they still responded to epistemological challenges that can be called “skeptical” in a broader sense.
John of Salisbury (1115-1180) was one of the first philosophers of the Latin Middle Ages to discuss Academic skepticism in any significant detail and to openly embrace certain views associated with the New Academy. In the Prologue to the Metalogicon, for example, John associates his own methodology with Academic probabilism. He writes, “[b]eing an Academician in matters that are doubtful to a wise man, I cannot swear to the truth of what I say. Whether such propositions be true or false, I am content with probable certitude” (ML 6). Similarly, in the Prologue to the Policraticus, John writes that “[i]n philosophy, I am a devotee of Academic dispute, which measures by reason that which presents itself as more probable. I am not ashamed of the declarations of the Academics, so that I do not recede from their footprints in matters about which wise men have doubts” (PC 7). John associates Academic methodology with epistemic modesty toward claims that have not been conclusively demonstrated and combines this humility with an openness toward the possibility of truth.
Although John associates his own methodology with Academic probabilism, he stipulates very clear limits to his skepticism. He restricts his skeptical doubt to the inferences derived from ordinary experience, maintaining that these inferences should be affirmed as probable rather than necessary. Although John believes that it is reasonable to doubt the inferences derived from ordinary experience, he maintains that we can still affirm the truth of what can be known rationally. He argues, for example, that we cannot doubt the certainty of God’s existence, the principle of non-contradiction, or the certainty of mathematical and logical inferences (PC 153-156).
In the thirteenth century, both Henry of Ghent (c. 1217-1293) and John Duns Scotus (1265/1266-1308) were concerned with establishing the possibility of knowledge in opposition to skeptical challenges (for a discussion of their positions in relation to skepticism, see Lagerlund 2020). Henry’s Summa begins by posing the question of whether we can know anything at all. Henry attempts to guarantee the possibility of knowledge through a theory of divine illumination that he attributes to Augustine. John Duns Scotus discusses and rejects Henry’s divine illumination theory of knowledge, arguing that the natural intellect is indeed capable of achieving certainty through its own powers. Like Henry, Scotus develops his theory of knowledge in response to a skeptical challenge to the possibility of knowledge (Lagerlund 2020). In contrast to Henry, Scotus maintains that the natural intellect can achieve certitude regarding certain kinds of knowledge, such as analytic truths and conclusions derived from them, and thus requires no assistance from divine illumination.
In early fourteenth-century Latin philosophy, a new type of skeptical argument, namely the “divine deception” argument, began to emerge (Lagerlund 2020). The divine deception argument, made famous much later by Descartes, explores the possibility that God is deceiving us, thus threatening the very possibility of knowledge. Philosophers such as Nicholas of Autrecourt and John Buridan developed epistemologies that could respond to the threat posed by this type of skeptical argument (Lagerlund 2020). Nicholas offers an infallibilist and foundationalist epistemology, whereas Buridan offers a fallibilist one (Lagerlund 2020).
Nicholas of Autrecourt (c. 1300-c. 1350) entertains and engages with skeptical challenges in his Letters to Bernard of Arezzo. In these letters, Nicholas draws out what he takes to be unacceptable implications of Bernard’s epistemology. Nicholas takes Bernard’s position to entail an extreme and untenable form of skepticism about the external world and even about one’s own mental acts (Lagerlund 2020). In response to this hyperbolic skepticism, Nicholas develops a positive account of knowledge, offering what Lagerlund calls a “defense of infallible knowledge” (Lagerlund 2020). Nicholas’ epistemology is “infallibilist” insofar as he maintains that the principle of noncontradiction, and everything that can be resolved into this principle, is immune to skeptical doubt. This infallibilist epistemology is tailored to respond to the skeptical challenge of divine deception.
Nicholas’ approach to skepticism sets a very high bar for the possibility of knowledge. This exceedingly high standard of knowledge is challenged by John Buridan (c. 1295-1361). Like Nicholas, Buridan also develops an epistemology that can withstand the skeptical challenge of divine deception. Unlike Nicholas, the epistemology he develops is a “fallibilist” one (Lagerlund 2020). As Jack Zupko argues, Buridan’s strategy against the skeptical challenges entertained by Nicholas is to show that it is unreasonable to accept the excessively high criterion of knowledge presupposed by the hypothesis of divine deception (Zupko 1993). Buridan’s response to the divine deception argument, as Zupko puts it, is to “acknowledge it, and then to ignore it” (Zupko 1993). Buridan instead develops a fallibilist epistemology in which knowledge admits of degrees corresponding to three distinct levels of “evidentness” (Lagerlund 2020).
Throughout the Latin Middle Ages, skepticism did not disappear to the extent that Popkin suggests. Nevertheless, although many medieval philosophers deal with skeptical challenges to the possibility of knowledge and develop epistemologies tailored to withstand skeptical attack, their approaches to these issues are not always shaped by ancient Greek skepticism. In the Renaissance, this began to change due to the increasing availability of classical texts. The next section discusses Renaissance treatments of skepticism both before and after the publication of Sextus Empiricus.
5. Renaissance Skepticism Pre-1562: Skepticism before the Publication of Sextus Empiricus
In Renaissance Europe, philosophical treatments of skepticism began to change as Cicero’s Academica witnessed increased popularity and Sextus Empiricus’ works were translated into Latin. This section discusses how Renaissance thinkers approached the issue of skepticism (both directly and indirectly) from the early sixteenth century up until the 1562 publication of Sextus Empiricus by Henri Estienne. Due to the limited availability of Sextus Empiricus in Renaissance Europe, most discussions of skepticism prior to 1562 draw primarily on the Academic skeptical tradition. One notable exception is Gianfrancesco Pico della Mirandola.
a. Gianfrancesco Pico della Mirandola’s Use of Pyrrhonism
Gianfrancesco Pico della Mirandola (1469-1533) is the earliest Renaissance thinker associated with Pyrrhonian skepticism. His Examination of the Vanity of the Doctrines of the Gentiles and of the Truth of the Christian Teaching (1520) is often acknowledged as the first use of Pyrrhonism in Christian apologetics (Popkin 2003). Although Sextus Empiricus was not widely available in his time, Pico had access to a manuscript housed in Florence (Popkin 2003).
In the Examination of the Vanity of the Doctrines of the Gentiles and of the Truth of the Christian Teaching, Pico deploys skeptical strategies toward both positive and negative ends. His negative aim is to undermine the authority of Aristotle among Christian theologians and to discredit the syncretic appropriation of ancient pagan authors among humanists such as his uncle, Giovanni Pico della Mirandola. Pico’s more positive aim is to support the doctrines of Christianity by demonstrating that revelation is the only genuine source of certitude (Schmitt 1967; Popkin 2003). Pico subjects the various schools of ancient philosophy to skeptical scrutiny in order to demonstrate their fundamental incertitude (Schmitt 1967; Popkin 2003). In so doing, he seeks to reveal the special character of divinely revealed knowledge.
Although Pico uses skeptical strategies to attack the knowledge claims advanced by ancient pagan philosophers, he maintains that the truths revealed in Scripture are immune to skeptical attack. One reason for this is that he understands the principles of faith to be drawn directly from God rather than through any natural capacity such as reason or the senses. Since Pyrrhonian arguments target reason and the senses as criteria of knowledge, they do not apply to the truths revealed in Scripture. Not only does Pico maintain that Pyrrhonian arguments are incapable of threatening the certainty of revelation, he also suggests that this Pyrrhonian attack on natural knowledge has the positive potential to assist Christian faith.
Pico’s use of Pyrrhonism presents a case of what would later become common throughout the Reformation and Counter-Reformation, namely the deployment of Pyrrhonian skeptical strategies towards non-skeptical Christian ends. Pico did not subject the doctrines of Christianity to skeptical attack. Instead, he deployed Pyrrhonism in a highly circumscribed context, namely as an instrument for detaching Christianity from ancient pagan philosophy and defending the certitude of Christian Revelation (see Copenhaver 1992 for a discussion of Pico’s detachment of Christianity from ancient philosophy).
b. Skepticism and Anti-Skepticism in the Context of the Reformation
As Popkin argues, skeptical dilemmas such as the Problem of the Criterion appear both directly and indirectly in Reformation-era debates concerning the standard of religious truth. The “problem of the criterion” is the issue of how to justify a standard of truth and settle disputes over this standard without engaging in circular reasoning. According to Popkin, this skeptical problem of the criterion entered into religious debates when Reformers challenged the authority of the Pope on matters of religious truth and endeavored to replace this criterion with individual conscience and personal interpretation of Scripture (Popkin 2003).
Popkin draws on the controversy between Martin Luther (1483-1546) and Desiderius Erasmus (1466-1536) on the freedom of the will as one example of how the skeptical problem of the criterion figured indirectly into debates concerning the criterion of religious truth (Popkin 2003). In On Free Will (1524), Erasmus attacks Luther’s treatment of free will and predestination on the grounds that it treats obscure questions that exceed the scope of human understanding (Popkin 2003; Maia Neto 2017). Erasmus offers a loosely skeptical response, proposing that we accept our epistemic limitations and rely on the authority of the Catholic Church to settle questions such as those posed by Luther (Popkin 2003; Maia Neto 2017). Luther’s response to Erasmus, entitled The Bondage of the Will (1525), argues against Erasmus’ skeptical emphasis on the epistemic limitations of human beings and his acquiescence to tradition in response to those limits. Luther argues instead that a true Christian must have inner certainty regarding religious knowledge (Popkin 2003).
Sebastian Castellio (1515-1563), another reformer, takes a more moderate approach to the compatibility between faith and epistemic modesty in his Concerning Heretics (1554) and On the Art of Doubting (c. 1563). In Castellio’s stance, Popkin identifies yet another approach to the skeptical problem of the criterion (Popkin 2003). Like Erasmus, Castellio emphasizes the epistemic limitations of human beings and the resultant difficulty of settling obscure theological disputes. In contrast to Erasmus, Castellio does not take these epistemic limitations to require submission to the authority of the Catholic Church. In contrast to Luther, Castellio does not stipulate inner certainty as a requirement for genuine Christian faith. Instead, Castellio argues that human beings can still draw “reasonable” conclusions based on judgment and experience rather than either the authority of tradition or the authority of inner certitude (Popkin 2003; Maia Neto 2017).
c. Skepticism and Anti-Skepticism in Pedagogical Reforms
Academic skepticism had a major impact in mid-sixteenth century France through the pedagogical reforms proposed by Petrus Ramus (1515-1572) and his student Omer Talon (c. 1510-1562) (Schmitt 1972). Ramus developed a Ciceronian and anti-Scholastic model of education that sought to bring together dialectic with rhetoric. Although Ramus expressed enthusiastic admiration for Cicero, he never explicitly identified his pedagogical reforms with Academic skepticism and his association with it was always indirect.
Omer Talon had a direct and explicit connection to Academic skepticism, publishing an edition of the Academica Posteriora in 1547, and an expanded and revised version which included the Lucullus in 1550. Talon included a detailed introductory essay and commentary that Schmitt has called the “first serious study of the Academica to appear in print” (Schmitt 1972). Talon’s introductory essay explicitly aligns Ramus’ pedagogical reforms with the philosophical methodology of Academic skepticism. He presents Academic methodology as an alternative to Scholastic models of education, defending its potential to cultivate intellectual freedom.
Talon adopts the Academic method of argument in utramque partem, or the examination of both sides of the issue, as his preferred pedagogical model. Cicero also claims this as his preferred method, maintaining that it is the best way to establish probable views in the absence of certain knowledge through necessary causes (Tusculan Disputations II.9). Although this method is typically associated with Cicero and Academic skepticism, Talon attributes it to Aristotle as well, who discusses this method in Topics I-II, 100a-101b (Maia Neto 2017). Despite the Aristotelian origins of the method of argument in utramque partem, it was not popular among the Scholastic philosophers of Talon’s time (Maia Neto 2017).
Talon’s use of skepticism is constructive rather than dialectical, insofar as he interprets the Academic model of argument in utramque partem as a positive tool for the pursuit of probable beliefs, rather than as a negative strategy for the elimination of beliefs. Specifically, he presents it as a method for the acquisition of probable knowledge in the absence of certain cognition through necessary causes. Following Cicero, Talon maintains that in a scenario where such knowledge is impossible, the inquirer can still establish the most probable view and attain by degrees a closer and closer approximation of the truth.
Talon’s main defense of Academic skepticism hinges on the idea of intellectual freedom (Schmitt 1972). He follows Cicero’s view that the Academic skeptics are “freer and less hampered” than the other ancient schools of thought because they explore all views without offering unqualified assent to any one of them (see Cicero’s Acad. II.3, 8). Like Cicero, Talon maintains that probable views can be found in all philosophical positions, including Platonism, Aristotelianism, Stoicism, and Epicureanism. To establish the most probable view, the inquirer should freely examine all positions without offering unqualified assent to any one of them.
Talon’s syncretism is another distinctive feature of his appropriation of Ciceronian skepticism. His syncretism consists in his presentation of Academic skepticism as harmonious with Socratic, Platonic, and even, at times, Aristotelian philosophy. Throughout his introductory essay, Talon makes a point of demonstrating that Academic skepticism has ancient precedents in Socrates, Plato, Aristotle, and even some earlier pre-Socratic philosophers. He takes great care to clear Academic skepticism of common charges such as negative dogmatism, presenting it in a more positive light that emphasizes its common ground with other philosophical schools. Talon places particular emphasis on Socratic learned ignorance and the Socratic commitment to inquiry as central to the skeptical tradition.
The impact of Academic skepticism in mid-sixteenth-century France can also be seen through the emergence of several anti-skeptical works. Ramus’ and Talon’s proposed pedagogical reforms were controversial for many reasons, one of which was the problem of skepticism. Pierre Galland (1510-1559), one of Ramus’ colleagues at the Collège de France, launched a fierce attack on the role of skepticism in these proposed reforms (for a detailed discussion of Galland’s critique, see Schmitt 1972). Galland’s main concern was that Ramus’ and Talon’s pedagogical reforms threatened to undermine philosophy and Christianity alike (Schmitt 1972). He argues that a skeptical attack on the authority of reason would eventually lead to an attack on all authority, including theological authority (Schmitt 1972).
Another example of anti-skepticism can be seen in a work by Guy de Brués entitled Dialogues contre les nouveaux académiciens (1557) (for a discussion of de Brués, see Schmitt 1972; see also Morphos’ commentary to his 1953 translation). In this work, Brués advances an extended attack on Academic skepticism through a dialogue between four figures associated with the Pléiade circle: Pierre de Ronsard, Jean-Antoine de Baïf, Guillaume Aubert, and Jean Nicot. In his dedicatory epistle to the Cardinal of Lorraine, Brués states that the goal of his anti-skeptical dialogue is to prevent the youth from being corrupted by the idea that “all things are a matter of opinion,” an idea he attributes to the New Academy. He argues that skepticism will lead the youth to disdain the authority of religion, God, their superiors, justice, and the sciences. Much like Galland’s, Brués’ critique of skepticism centers on the threat of relativism and the rejection of universal standards (Schmitt 1972).
6. Renaissance Skepticism Post-1562: The Publication of Sextus Empiricus
In 1562, the Calvinist printer and classical philologist Henri Estienne (Henricus Stephanus) (c. 1528-1598) published the first Latin translation of Sextus Empiricus’ Outlines of Skepticism, together with a commentary. This publication reshaped Renaissance discussions of skepticism. In 1569, Estienne printed an expanded edition of Sextus’ works that included a translation of and introductory essay on Adversus Mathematicos by the Catholic counter-reformer Gentian Hervet (1499-1584). This edition also included a translation of Diogenes Laertius’ Life of Pyrrho and Erasmus’ translation of Galen’s anti-skeptical work, The Best Method of Teaching. In contrast to the numerous editions of and commentaries on the Academica available throughout the sixteenth century, Estienne’s editions were the only widely available versions of Sextus’ works in that century. A Greek edition was not printed until 1621.
Estienne and Hervet both include substantial prefaces with their translations (for a discussion of these prefaces, see Popkin 2003; for a translation and discussion of Estienne’s preface, see Naya 2001). In each preface, the translator comments on the philosophical value of Sextus and states his goals in making Pyrrhonism available to a wider audience. Both prefaces treat the question of whether and how Pyrrhonism can be used in Christian apologetics, and both respond to the common objection that Pyrrhonism poses a threat to Christianity. Although Estienne was a Calvinist, and Hervet was an ardent Counter-Reformer, both offer a similar position on the compatibility of Christianity with Pyrrhonism. Both agree that Pyrrhonism is a powerful resource for undermining confidence in natural reason and affirming the special character of revelation. Although Estienne and Hervet were not philosophers, their framing of skepticism and its significance for religious debates had an impact on how philosophers took up these issues, especially given that these were the only editions of Sextus that were widely available in the sixteenth century.
a. Henri Estienne’s Preface to Sextus Empiricus’ Outlines of Skepticism
Henri Estienne’s preface to Sextus’ Outlines combines the loosely skeptical “praise of folly” genre popularized by Erasmus with a fideistic agenda resembling that of Gianfrancesco Pico della Mirandola. Estienne opens with a series of jokes, playfully presenting the Outlines as a kind of joke book. The preface takes the form of a dialogue between the translator and his friend Henri de Mesmes, in which the one Henri inquires into the nature and value of skepticism, and the other offers responses that parody the traditional Pyrrhonian formulae.
When asked about the nature and value of skepticism, Estienne recounts the “tragicomic” story of his own “divine and miraculous metamorphosis” into a skeptic. Drawing on conventional Renaissance representations of melancholy, Estienne recounts a time when he suffered from quartan fever, a disease associated with an excess of black bile. This melancholy prevented him from pursuing his translation work. One day, Estienne wandered into his library with his eyes closed out of fear that the mere sight of books would sicken him, and fortuitously came across his old notes for a translation of Sextus Empiricus. While reading the Outlines, Estienne began to laugh at Sextus’ skeptical attack on the pretensions of reason. Estienne’s laughter counterbalanced his melancholy, allowing him to return to his translation work afresh.
Estienne discusses the “sympathy” between his illness and its skeptical cure, describing an “antiperistasis” in which his excess of learning was counterbalanced by its opposite (namely skepticism). Much to his surprise, this skeptical cure had the fortuitous result of reconciling him with his scholarly work, albeit on new terms. Estienne’s encounter with skepticism allowed him to return to the study of classical texts by reframing his understanding of the proper relationship between philosophy and religion.
In the second half of his preface, Estienne turns to the question of whether skepticism poses a threat to Christianity. Anticipating the common objection that skepticism leads to impiety and atheism, he replies that it is the dogmatist rather than the skeptic who poses a genuine threat to Christianity and is at greater risk of falling into atheism. Whereas skeptics are content to follow local customs and tradition, dogmatists endeavor to measure the world according to their reason and natural faculties.
In the final paragraphs of the preface, Estienne addresses his reasons for publishing the first Latin translation of Sextus Empiricus. Returning to the themes of illness discussed at the beginning of the preface, he remarks that his goal is a therapeutic one. That is, his aim is to cure the learned of the “impiety they have contracted by contact with ancient dogmatic philosophers” and to relieve those with an excessive reverence for philosophy. Here, Estienne presents skepticism as a cure for the pride of the learned, playing on the ancient medical idea that health is a humoral balance that can be restored by counterbalancing one excess with another.
Finally, Estienne responds to the common objection that skepticism is an anti-philosophical method that will destroy the possibility of establishing any kind of truth. Estienne argues that this skeptical critique does not pertain to religious truths revealed in Scripture. He suggests instead that a skeptical attack on natural knowledge will only serve to reaffirm the prerogative of faith. Much like his predecessor, Gianfrancesco Pico della Mirandola, Estienne envisions Pyrrhonism as a tool to be used toward non-skeptical religious ends. He presents Pyrrhonian skepticism both as a therapy to disabuse the learned of their overconfidence in natural reason, and as a tool for affirming the special character of the truths revealed in Scripture.
b. Gentian Hervet’s Preface to Sextus Empiricus’ Adversus Mathematicos
Gentian Hervet’s 1569 preface to Sextus’ Adversus Mathematicos is more somber in tone than Estienne’s and places a more transparent emphasis on the use of Sextus in Christian apologetics. Hervet frames his interest in skepticism in terms of his desire to uphold the doctrines of Christianity, voicing explicit approval for the project of Gianfrancesco Pico della Mirandola. Hervet adds a new dimension to his appropriation of Pyrrhonism, presenting it as a tool for combatting the Reformation, and not only as a means of loosening the grip of ancient philosophy on Christianity.
Much like Estienne’s preface, Hervet’s preface begins with a brief history of his encounter with Sextus. He reports that he fortuitously stumbled across the manuscript in the Cardinal of Lorraine’s library when in need of a diversion. He recounts the great pleasure he took in reading Adversus Mathematicos, noting its particular success at demonstrating that no human knowledge is immune to attack. Like Estienne, Hervet argues that a skeptical critique of natural reason can help reinforce the special character of the truths revealed in Scripture. In contrast to Estienne, Hervet emphasizes the potential of Pyrrhonian arguments for undermining the Reformation.
Within his preface, Hervet discusses the value of Pyrrhonism for resolving religious controversies concerning the rule of faith. He raises the problem of the criterion in the context of religious authority, condemning the Reformers for taking their natural faculties as the criterion of religious truth, and for rejecting the authority of tradition (that is, the Catholic Church). He suggests that there is a fundamental incommensurability between our natural faculties and the nature of the divine, and thus that the effort to measure the divine based on one’s own natural faculties is fundamentally misguided. Hervet expresses hope that Pyrrhonism might persuade the Reformers to return to Catholicism, presumably due to the Pyrrhonian emphasis on acquiescence to tradition in the absence of certainty.
Hervet’s preface also discusses the potential utility of Pyrrhonism in Christian pedagogy. Anticipating the common objection that skepticism will corrupt the morals of the youth and lead them to challenge the authority of Christianity, Hervet argues that the skeptical method of argument in utramque partem—a method he mistakenly attributes to the Pyrrhonians rather than to the Academics—will eventually lead an inquirer closer and closer to the truth of Christianity. Far from undercutting faith, Hervet proposes that skeptical inquiry will ultimately support it. Specifically, he argues that the method of argument in utramque partem can help the student distinguish the ‘verisimile,’ or truth-like, from the truth itself.
Hervet borrows this vocabulary of verisimilitude from Cicero, who translates Carneades’ practical criterion, to pithanon, as veri simile and probabile. Although within this context Hervet is ostensibly discussing the merits of Pyrrhonian methodology and not Academic methodology, his description of the goals of argument in utramque partem conflates Pyrrhonism with Academic skepticism. Whereas the Academic practice of arguing both sides of every issue aims at the discovery of the most probable view, at least in certain cases and on certain interpretations, the Pyrrhonian practice of pitting opposing arguments against each other aims at equipollence.
7. Late Renaissance Skepticism: Montaigne, Charron, and Sanches
The most influential philosophers associated with Renaissance skepticism are Michel de Montaigne, Pierre Charron, and Francisco Sanches. Unlike their predecessors whose appropriations of ancient skepticism were largely subordinated to religious ends, these thinkers drew on skeptical strategies to address a wider range of philosophical questions and themes in areas ranging from epistemology to practical philosophy.
a. Michel de Montaigne
The most famous thinker associated with Renaissance skepticism is the French essayist and philosopher Michel de Montaigne (1533-1592). His Essays, first published in 1580, and expanded and revised up until his death, draw extensively on both Academic and Pyrrhonian skepticism among many other ancient and medieval sources. Throughout the Essays, Montaigne treats a great number of skeptical themes including the diversity of human custom and opinion, the inconsistency of human actions and judgment, the relativity of sense-perception to the perceiver, and the problem of the criterion.
The precise nature and scope of Montaigne’s skepticism is a topic of considerable scholarly debate. Some have located Montaigne’s inspiration in the Pyrrhonian skeptical tradition (Popkin 2003; Brahami 1997). Others have noted how Cicero’s Academica serves as one of Montaigne’s most frequently cited skeptical sources (Limbrick 1977; Eva 2013; Prat 2017). Still others have maintained that the philosophical character of Montaigne’s thought is not reducible to skepticism (Hartle 2003; 2005; Sève 2007). The following sections present a range of different views on the sources, nature, and scope of Montaigne’s skepticism, considering the merits and limitations of each.
i. Montaigne and Pyrrhonism
Among commentators, Montaigne is primarily associated with Pyrrhonian skepticism. In large part, this is due to Richard Popkin’s influential account of the central role that Montaigne played in the transmission of Pyrrhonian skepticism into early modernity. On Popkin’s account, Montaigne played a pivotal role in the revitalization of skepticism by applying Pyrrhonian strategies in a broader epistemological context than the one envisioned by his predecessors and contemporaries (Popkin 2003). Whereas earlier Renaissance thinkers used Pyrrhonian arguments to debate questions concerning the criterion of religious truth, Montaigne applies Pyrrhonian arguments to all domains of human understanding, thus launching what Popkin has termed the “Pyrrhonian crisis” of Early Modern Europe (Popkin 2003).
Popkin’s concept of the “Pyrrhonian crisis” is deeply indebted to Pierre Villey’s influential account of a personal “Pyrrhonian crisis” that Montaigne allegedly underwent while reading Sextus Empiricus. According to Villey, Montaigne’s intellectual development maps roughly onto three stages corresponding to the three books of the Essays: the earlier chapters exhibit an austere Stoicism, the middle chapters exhibit a Pyrrhonian crisis of uncertainty, and the final chapters exhibit an embrace of Epicurean naturalism (Villey 1908). Most Montaigne scholars, however, have rejected Villey’s three-stage developmental account for a variety of reasons. Some have rejected the idea that Montaigne’s skepticism was the result of a personal “Pyrrhonian crisis,” preferring to assess his skepticism on a philosophical rather than psychological level. Others have questioned whether the Essays developed according to three clearly defined stages at all, pointing to evidence that Montaigne’s engagement with skepticism, Stoicism, and Epicureanism extends beyond the confines of each book.
Scholars typically draw on Montaigne’s longest and most overtly philosophical chapter, the “Apology for Raymond Sebond,” as evidence for his Pyrrhonism. Here, Montaigne provides an explicit discussion of Pyrrhonian skepticism, voicing sympathetic approval of the Pyrrhonians’ intellectual freedom and commitment to inquiry in the absence of certitude. In a detailed description of ancient skepticism, he commends the Pyrrhonians in opposition to both the Academic and the dogmatic schools of ancient philosophy, arguing that the Academics hold the allegedly inconsistent view that knowledge is unattainable and yet that some opinions are more probable than others. The Pyrrhonians, by contrast, merit praise both for remaining agnostic on whether knowledge is possible and for committing to inquiry in the absence of knowledge.
ii. Pyrrhonism in the “Apology for Raymond Sebond”
Since the “Apology” is the longest and most overtly philosophical chapter of the Essays, many scholars, such as Popkin, have treated it as a summation of Montaigne’s thought. They have also treated Montaigne’s sympathetic exposition of Pyrrhonism as an expression of the author’s personal sympathies (Popkin 2003). Although scholars generally agree that the “Apology” is heavily influenced by Pyrrhonism, the precise role of Pyrrhonism within it is a matter of considerable debate. The main reasons have to do with the essay’s format and context.
As for the issue of context, the “Apology” was likely written at the request of the Catholic princess, Marguerite of Valois, to defend the Natural Theology of Raymond Sebond (1385-1436), a Catalan theologian whose work Montaigne translated in 1569. Montaigne’s defense of Sebond is (at least in part) intended to support Marguerite’s specific concerns in defending her Catholic faith against the Reformers (Maia Neto 2013; 2017).
As for the issue of format, the “Apology” is loosely structured as a disputed question rather than as a direct articulation of the author’s own position (Hartle 2003). In the manner of a disputed question, Montaigne defends Sebond’s Natural Theology against two principal objections, offering responses to each objection that are tailored to the views of each specific objector. For this reason, the statements that Montaigne makes within this essay cannot easily be removed from their context and taken to represent the author’s own voice in any unqualified sense (Hartle 2003; Maia Neto 2017).
Within the “Apology,” Montaigne ostensibly sets out to defend Sebond’s view that the articles of faith can be demonstrated through natural reason. The first objection he frames is that Christians should not support their faith with natural reason because faith has a supernatural origin in Divine grace (II: 12, F 321; VS 440). The second objection is that Sebond’s arguments fail to demonstrate the doctrines they allege to support (II: 12, F 327; VS 448). The first objection hinges on a dispute about the meaning of faith, and the second objection hinges on a dispute concerning the strength of Sebond’s arguments. Montaigne responds to both objections, conceding and rejecting aspects of each. In response to the first objection, Montaigne concedes that the foundation of faith is indeed Divine grace but denies the objector’s conclusion that faith has no need of rational support (II: 12, F 321; VS 441). In response to the second objection, Montaigne presents a Pyrrhonian critique of the power of reason to demonstrate anything conclusively—not only in the domain of religious dogma, but in any domain of human understanding (II: 12, F 327-418; VS 448-557).
It is in the context of this second objection that Montaigne provides his detailed and sympathetic presentation of Pyrrhonism. Montaigne’s response to the second objection begins with a long critique of reason (II: 12, F 370-418; VS 500-557). Drawing on the first of Sextus’ modes, Montaigne presents an extended discussion of animal behavior to undermine human presumption about the power of reason. Following Sextus, Montaigne compares the different behaviors of animals to show that we have no suitable criterion for preferring our own impressions over those of the allegedly inferior animals. Drawing on the second mode, Montaigne points to the diversity of human opinion as a critique of the power of reason to arrive at universal truth. Montaigne places special emphasis on the diversity of opinion in the context of philosophy: despite centuries of philosophical inquiry, no theory has yielded universal assent. Finally, Montaigne attacks reason on the grounds of its utility, arguing that knowledge has failed to bring happiness and moral improvement to human beings.
Following this critique of reason, Montaigne turns to an explicit discussion of Pyrrhonian skepticism, paraphrasing the opening of Sextus’ Outlines (II: 12, F 371; VS 501). Identifying three possible approaches to philosophical inquiry, he writes that investigation will end in the discovery of truth, the denial that it can be found, or in the continuation of the search. Following Sextus, Montaigne frames these approaches as dogmatism, negative dogmatism, and Pyrrhonism. In contrast to the two alternative dogmatisms that assert either that they have attained the truth, or that it cannot be found, Montaigne commends the Pyrrhonists for committing to the search in the absence of knowledge.
Montaigne devotes the next few paragraphs to a detailed description of Pyrrhonian strategies (II:12 F 372; VS 502-3). He provides a sympathetic consideration of Pyrrhonian strategies in contrast to dogmatism and the New Academy (II: 12, F 374; VS 505). He concludes by commending Pyrrhonism for its utility in a religious context, writing that: “There is nothing in man’s invention that has so much verisimilitude and usefulness [as Pyrrhonism]. It presents man naked and empty, acknowledging his natural weakness, fit to receive from above some outside power; stripped of human knowledge in himself, annihilating his judgment to make room for faith” (II:12, F 375; VS 506). By undermining the pretenses of reason, Pyrrhonism prepares human beings for the dispensation of Divine grace.
The connection Montaigne draws between the Pyrrhonian critique of reason and the embrace of faith in the absence of any rational grounds for adjudicating religious disputes has led to his reputation as a “skeptical fideist” (for a discussion of Montaigne’s skeptical fideism, see Popkin 2003 and Brahami 1997; for the view that Montaigne is not a fideist, see Hartle 2003; 2013). Those who interpret Montaigne as a “skeptical fideist” often take his exposition of Pyrrhonism and its utility in a Christian context as an expression of his personal view on the limited role of reason in the context of faith (Popkin 2003).
Others, however, have argued that Montaigne’s endorsement of Pyrrhonism and its utility in a religious context cannot be taken as a simple expression of Montaigne’s own position (see, for example, Hartle 2003 and 2013. See also Maia Neto, 2017). In the context of the “Apology” as a whole, Montaigne’s endorsement of Pyrrhonism and its utility in a religious context is part of a response to the second objection to Sebond. In his response to the second objection, Montaigne is arguing on the basis of the assumptions of Sebond’s opponents. He counters the conclusion that Sebond’s arguments fail to adequately demonstrate religious dogma by showing that all rational demonstrations (and not just Sebond’s specific effort to demonstrate the Articles of Faith) are similarly doomed.
Further evidence that suggests that Montaigne’s endorsement of Pyrrhonism ought to be understood as a qualified one can be found in his address to the intended recipient of the essay. Following his detailed exposition of Pyrrhonism and its potential to assist a fideistic version of Catholicism, Montaigne addresses an unnamed person as the intended recipient of his defense of Sebond (II: 12 F 419; VS 558). This addressee is typically assumed to be Princess Marguerite of Valois (Maia Neto 2013; 2017). In Montaigne’s address to the princess, he qualifies his sympathetic presentation of Pyrrhonism in ambivalent terms, calling it a “final fencer’s trick” and a “desperate stroke” that should only be used “reservedly,” and even then, only as a last resort (II: 12, F 419; VS 558). Montaigne urges the Princess to avoid Pyrrhonist arguments and continue to rely on traditional arguments to defend Sebond’s natural theology against the Reformers. Montaigne warns the Princess of the consequences of undermining reason in defense of her Catholic faith (II: 12, F 420; VS 559).
Following this warning to the Princess, Montaigne returns to some further consequences of the Pyrrhonian critique of reason and the senses. He returns to Pyrrhonian themes, such as the relativity of sense-perception to the perceiver, challenging the idea that the senses can serve as an adequate criterion of knowledge. Borrowing from the third mode, Montaigne argues (1) that if we were lacking in certain senses, we would have a different picture of the world, and (2) that we might perceive qualities in addition to those that we perceive through our existing faculties. He raises further challenges to the senses using the first and second modes, pointing to (1) the lack of perceptual agreement between humans and animals, and (2) the lack of perceptual agreement among different human beings (II: 12, F 443-54).
After Montaigne’s reformulation of the skeptical modes, he turns to a reformulation of the problem of the criterion: “To judge the appearances that we receive of objects, we would need a judicatory instrument; to verify this instrument, we need a demonstration; to verify the demonstration, an instrument: there we are in a circle” (II: 12, F 454; VS 600-01). If the senses cannot serve as a criterion of truth, Montaigne then asks whether reason can, but concludes that rational demonstration leads to an infinite regress (II: 12, F 454; VS 601).
The suspension of assent is the traditional skeptical response to the absence of an adequate criterion of knowledge. One can respond either in the manner of certain Academics, by provisionally approving of likely appearances, or in the manner of the Pyrrhonians, by suspending assent while permitting oneself to be non-dogmatically guided by appearances. Montaigne expresses reservations toward both solutions. In response to the Academic solution, he raises the Augustinian objection that we have no criterion for selecting certain appearances as likelier than others, a problem that introduces yet another infinite regress (II: 12, F 455; VS 601). In response to the Pyrrhonian solution, he expresses reservations about acquiescence to changeable custom. He concludes that “[t]hus nothing certain can be established about one thing by another, both the judging and the judged being in continual change and motion” (II: 12, F 455; VS 601).
This claim that “nothing certain can be established about one thing by another” concludes Montaigne’s response to the second objection. To recall, the second objection was that Sebond’s arguments fail to prove what he set out to demonstrate, namely the Articles of Faith. The Pyrrhonian response is that reason is incapable of establishing anything conclusive at all, not only in matters of religious truth but in any domain of human understanding. Montaigne ends with the idea that the only knowledge we can attain would have to come through the assistance of divine grace. This conclusion to the essay is often taken as additional evidence for Montaigne’s “skeptical fideism” (Popkin 2003).
Again, whether this conclusion should serve as evidence that Montaigne is personally committed to Pyrrhonism or to skeptical fideism depends on whether we interpret his response to the objections to Sebond as dialectical. That is, it depends on whether we view Montaigne’s arguments as responses he generates on the basis of the assumptions of his opponents—assumptions to which he is not personally committed—in order to generate a contradiction or some other conclusion deemed unacceptable by his opponents.
When taken as a dialectical strategy, however, Montaigne’s use of skepticism still shares much in common with Pyrrhonism understood as a “practice” and “way of life.” For this reason, one might still conclude that although Montaigne’s endorsements of Pyrrhonism within the “Apology” do not necessarily represent the author’s own voice, the methodology and argumentative strategies he adopts within the “Apology” do indeed share much in common with the practice of Pyrrhonism.
iii. Pyrrhonian Strategies Beyond the “Apology”
Although most discussions of Montaigne’s skepticism focus on the “Apology for Raymond Sebond,” this is hardly the sole example of his use of Pyrrhonian strategies. In chapters such as “Of cannibals” and “Of custom and not easily changing an accepted law,” Montaigne adopts the Pyrrhonian mode concerning the diversity of custom to challenge his culture’s unexamined claims to moral superiority. In chapters such as “We taste nothing pure,” he adopts the modes concerning the relativity of sense-perception to the perceiver to challenge the authority and objectivity of the senses. In chapters such as “That it is folly to measure the true and the false by our own capacity” and “Of cripples,” Montaigne adopts skeptical arguments to arrive at the suspension of judgment concerning matters such as knowledge of causes and the possibility of miracles and other supernatural events.
In the domain of practical philosophy, Montaigne borrows from the Pyrrhonian tradition yet again, often recommending behavior that resembles the Pyrrhonian skeptic’s fourfold observances. In the absence of an adequate criterion of knowledge, the Pyrrhonian skeptics live in accordance with four guidelines that they claim to follow “non-dogmatically” (PH 1.11). These observances include the guidance of nature; the necessity of feelings; local customs and law; and instruction in the arts (PH 1.11). Montaigne frequently recommends conformity to similar observances. In “Of custom and not easily changing an accepted law,” for example, he recommends obedience to custom, and criticizes the presumption of those who endeavor to change it. In many cases, Montaigne’s recommendation to obey custom in the absence of knowledge extends to matters of religion. This is yet another reason why Montaigne’s affinities with Pyrrhonian skepticism are often associated with “fideism.” On the “skeptical fideist” interpretation, Montaigne’s obedience to Catholicism is due to the skeptical acquiescence to custom (Popkin 2003; Brahami 1997). The question of Montaigne’s religious convictions as well as his alleged “fideism” is a matter of considerable debate (see, for example, Hartle 2003; 2013).
iv. Montaigne and Academic Skepticism
Although most commentators focus on the Pyrrhonian sources of Montaigne’s skepticism, some scholars have emphasized the influence of Academic skepticism on Montaigne’s thought (see, for example, Limbrick 1977; Eva 2013; Prat 2017; and Maia Neto 2017). One reason to emphasize this role is that Cicero’s Academica was Montaigne’s most frequently quoted skeptical source (Limbrick 1977). Another reason has to do with the philosophical form and content of the Essays, especially Montaigne’s emphasis on the formation of judgment (as opposed to the suspension of judgment and elimination of beliefs) and his emphasis on intellectual freedom from authority as the defining result of skeptical doubt.
We can find one example of Cicero’s influence in Montaigne’s detailed exposition of skepticism in the “Apology for Raymond Sebond.” Here Montaigne weighs the relative merits of the dogmatic and skeptical approaches to assent, embedding two direct quotations from Cicero’s Academica (II: 12, F 373; VS 504). Following the distinction Cicero draws in Academica 2.8, Montaigne articulates the value of suspending judgment in terms of intellectual freedom (II: 12, F 373; VS 504). Although within this context Montaigne is admittedly discussing the value of Pyrrhonian skepticism, he borrows Cicero’s language and emphasis on intellectual freedom as the defining result of the epoché (for a discussion of Montaigne’s blending of Academic and Pyrrhonian references, see Limbrick 1977; Eva 2013; and Prat 2017).
Although these passages in the “Apology” provide evidence for the influence of Cicero’s Academica on Montaigne’s skepticism, they also admittedly provide evidence for a critical view of the New Academy. As discussed above, within this exposition of skepticism, Montaigne voices explicit approval for the Pyrrhonians over the Academics. Although his characterizations of skepticism borrow significantly from Cicero, he uses these descriptions to present the Pyrrhonians in a more favorable light. Whether these statements should be taken as representative of Montaigne’s own voice, or whether they are part of a dialectical strategy, is discussed above.
Beyond the “Apology for Sebond,” we can see further examples of Montaigne’s debt to Academic skepticism. In contrast to the Pyrrhonian emphasis on the elimination of beliefs, Montaigne adopts skeptical strategies in ways that appear to accommodate the limited possession of beliefs. In this respect, Montaigne’s skepticism resembles the “mitigated” skepticism attributed to Cicero, whose “probabilism” permits the acquisition of tentative beliefs on a provisional basis (see Academica 2.7-9).
One example of the influence of Cicero’s mitigated skepticism can be seen in Montaigne’s discussion of education in chapter I: 26, “Of the education of children.” Given the prominent role of Cicero’s Academica in pedagogical debates in sixteenth century France, this context is hardly surprising (see discussion of Omer Talon above). Throughout “Of the education of children,” Montaigne articulates the goal of education as the formation of individual judgment and the cultivation of intellectual freedom (I: 26, F 111; VS 151). Montaigne recommends a practice that closely resembles the Academic method of argument in utramque partem as a means of attaining intellectual freedom. He recommends that the student weigh the relative merits of all schools of thought, lending provisional assent to the conclusions that appear most probable. The student should be presented with as wide a range of views as possible in the effort to carefully examine the pros and cons of each (I: 26, F 111; VS 151). The student should resist unqualified assent to any doctrine before a thorough exploration of the variety of available positions.
Montaigne presents this exercise of exploring all available positions as a means to attaining a free judgment (I: 26, F 111; VS 151). Through this emphasis on the freedom of judgment, Montaigne’s discussion of the nature and goals of education has clear resonances with that of his contemporary, Omer Talon. Like Talon, Montaigne presents skeptical strategies as a positive tool for cultivating intellectual freedom from authority rather than as a negative strategy for undermining unqualified assent to dogmatic knowledge claims. This emphasis on intellectual freedom and the freedom of judgment resonates more clearly with Ciceronian skepticism than with Pyrrhonism.
Montaigne’s appropriation of Academic skeptical strategies extends beyond his discussions of pedagogy. Beyond the essay “Of the education of children,” Montaigne emphasizes the formation of judgment as a goal of his essay project, often referring to his Essays as the “essays of his judgment” (see II: 17, F 495; VS 653; II: 10, F 296; VS 407; and I: 50, F 219; VS 301-302). In “Of Democritus and Heraclitus,” for example, Montaigne writes: “Judgement is a tool to use on all subjects and comes in everywhere. Therefore in the essays [essais] that I make of it here, I use every sort of occasion. If it is a subject I do not understand at all, even on that I essay [je l’essaye] my judgment” (I: 50, F 219; VS 301-302). Rather than containing a finished product, or set of conclusions, the Essays embody the very activity of testing or “essaying” judgment (see La Charité 1968 and Foglia 2011 for the role of judgment in Montaigne’s thought).
Throughout the Essays, Montaigne tests or “essays” his judgment on a wide range of topics, attempting to explore these topics from all possible directions. At times he entertains the evidence for and against any given position in a manner that resembles the Academic method of argument in utramque partem. At other times, his method resembles the Pyrrhonian practice of counterbalancing opposing arguments, appearances, and beliefs. Although the “method” of the Essays shares aspects of both skeptical traditions, where it appears closer to Ciceronian skepticism is in Montaigne’s apparent acceptance of certain positive beliefs. Like Cicero, Montaigne appears to hold some beliefs that he accepts on a tentative and provisional basis. In this respect, his skepticism is closer to Cicero’s “mitigated” skepticism than to Sextus’ more radical skepticism, which aspires to a life without beliefs.
Although the precise character and extent of Montaigne’s skepticism remain a topic of considerable scholarly debate, most commentators would likely agree on at least some version of the following points: Montaigne was deeply influenced by ancient skepticism and incorporates elements of this tradition into his own thought. Whatever the precise nature of this influence, Montaigne appropriates aspects of ancient skepticism in an original way that goes beyond what was envisioned by its ancient proponents. Montaigne’s essay form, for example, is just one way that he appropriates skeptical strategies toward new ends.
b. Pierre Charron
After Montaigne, Pierre Charron (1541-1603) is one of the most influential figures of Renaissance skepticism. Charron was a close friend and follower of Montaigne, and he draws heavily on Montaigne and the Academic skeptical tradition in his major work, Of Wisdom (1601, 1604). According to Maia Neto, Charron’s Of Wisdom was “the single most influential book in French philosophy during the first half of the seventeenth century” (Maia Neto 2017).
In Of Wisdom, Charron expounds what he takes to be the core of Montaigne’s thought. He does so through a method and model of knowledge adopted from Academic skepticism. Charron’s indebtedness to the Academic skeptical tradition can be seen in his emphasis on intellectual freedom from authority and his idea that wisdom consists in the avoidance of error. Following certain Academic skeptics, Charron maintains that truth is not fully accessible to human beings (Maia Neto 2017). Instead, he argues that the truth is only fully available to God. Despite the inaccessibility of truth to human beings, Charron proposes that through the proper use of reason, we can nonetheless avoid error. In Charron’s view, it is the avoidance of error rather than the establishment of a positive body of knowledge that constitutes genuine wisdom. In this respect, he develops what Maia Neto calls a “critical rationalism not unlike that held earlier by Omer Talon and by Karl Popper in the twentieth century” (Maia Neto 2017).
c. Francisco Sanches
Along with Montaigne and Charron, the Iberian physician and philosopher Francisco Sanches (1551-1623) is one of the most notable thinkers associated with Renaissance skepticism. His skeptical treatise, That Nothing is Known (1581), sets out a detailed critique of Aristotelian epistemology drawing on familiar skeptical lines of attack. Sanches’ use of skepticism stands out from that of many of his predecessors and contemporaries insofar as he applies it to epistemological issues rather than strictly religious ones.
In That Nothing is Known, Sanches targets the Scholastic concept of scientia, or knowledge through necessary causes. Throughout this work, Sanches mobilizes skeptical arguments to attack several Aristotelian ideas, including the idea that particulars can be explained through universals (TNK 174-179) and the idea that the syllogism can generate new knowledge (TNK 181-182). Based on these critiques, Sanches concludes that the Aristotelian concept of scientia results in an infinite regress and is therefore impossible (TNK 195-196). We cannot have scientia of first principles or of any conclusions derived from first principles (TNK 199). It is in this sense that Sanches argues for the skeptical thesis suggested by his title.
Much like that of Montaigne, the precise character of Sanches’ skepticism is a topic of considerable debate. Some scholars maintain that Sanches’ skepticism was inspired by Pyrrhonism. This interpretation was first advanced by Pierre Bayle, who refers to Sanches as a “Pyrrhonian” skeptic in his 1697 Dictionary entry (Limbrick 1988; Popkin 2003). This interpretation finds support in Sanches’ use of skeptical arguments against sense perception as a criterion of knowledge, a strategy resembling the Pyrrhonian modes. One issue with this interpretation, however, is that many thinkers of Bayle’s time used the terms “Pyrrhonism” and “skepticism” interchangeably. Another issue with this interpretation is that there is no conclusive evidence that Sanches read Sextus Empiricus (Limbrick 1988).
For this reason, many scholars maintain that Sanches drew inspiration from Academic skepticism instead (Limbrick 1988; Popkin 2003). This interpretation finds support in the title of Sanches’ work—a clear reference to the skeptical thesis attributed to Arcesilaus. As further evidence of Sanches’ affinities with the New Academy, scholars often point to a letter to the mathematician Clavius (Limbrick 1988; Popkin 2003). In this letter, Sanches uses skeptical arguments to challenge the certainty of mathematical knowledge. He even signs his name as “Carneades philosophus,” explicitly associating himself with a famous representative of Academic skepticism.
Still others have argued that the Galenic medical tradition serves as another source of inspiration for Sanches’ skepticism. Elaine Limbrick, for example, shows that Sanches’ medical training was particularly influential for his skepticism and epistemology in general (Limbrick 1988). She argues that Galen’s emphasis on empirical observation and experiment was fundamental to Sanches’ rejection of Aristotelianism and his effort to develop a new scientific methodology (Limbrick 1988).
Although Sanches uses skeptical strategies in his attack on Aristotelian epistemology, he was not himself a thoroughgoing skeptic. Although Sanches concludes that the Aristotelian concept of scientia is impossible, he does not therefore conclude that all knowledge is impossible. One indication of this is that throughout That Nothing is Known, Sanches refers to other works, one of which deals with methodology, and another of which deals with the acquisition of positive knowledge of the natural world (TNK 290). Sanches appears to have intended these works to explain what knowledge might look like—specifically knowledge of the natural world—in the absence of scientia. Unfortunately, the fate of Sanches’ additional works on the positive acquisition of knowledge remains unknown. They were either lost or never published.
Although we can conclude that Sanches never intended That Nothing is Known to serve as a final statement on his own epistemology, we can only speculate as to what his positive epistemology might have looked like. Since Sanches uses skeptical arguments to undermine the Aristotelian conception of knowledge and pave the way for a different approach to knowledge of the natural world, Popkin and many others have characterized his skepticism as “mitigated” and “constructive” (Popkin 2003). Popkin goes further to argue that Sanches’ theory of knowledge would have been “experimental” and “fallibilist” (Popkin 2003). In this view, although Sanches uses skeptical strategies to undermine the Aristotelian conception of scientia, his ultimate goal is not to undermine the possibility of knowledge as such, but to show that in the absence of scientia, a more modest kind of fallible knowledge is nonetheless possible.
8. The Influence of Renaissance Skepticism
Renaissance skepticism had a considerable impact on the development of seventeenth century European philosophy. Thinkers ranging from Descartes to Bacon developed their philosophical systems in response to the skeptical challenges (Popkin 2003). A close friend of Montaigne’s, Marie Le Jars de Gournay (1565-1645), for example, draws on skeptical arguments in her Equality of Men and Women (1641). In this work, Gournay deploys traditional skeptical strategies to draw out the logically unacceptable conclusions of arguments for gender inequality (O’Neill 2007). François La Mothe Le Vayer (1588-1672), often associated with the “free-thinking” movement in seventeenth century France, also deploys skeptical strategies in his attacks on superstition (Popkin 2003; Giocanti 2001). Pierre Gassendi (1592-1665), known for his revival of Epicureanism, adopts skeptical challenges to Aristotelianism in his Exercises Against the Aristotelians (1624) and draws on the probabilism of the New Academy in his experimental and fallibilist approach to science (Popkin 2003). René Descartes (1596-1650) takes a methodical and hyperbolic form of skeptical doubt as the starting point in his effort to establish knowledge on secure foundations. Although Descartes uses skeptical strategies, he only does so in an instrumental sense, that is, as a tool for establishing a model of scientific knowledge that can withstand skeptical attack. Much like Descartes, Blaise Pascal (1623-1662) was both influenced by skeptics such as Montaigne and deeply critical of them. Although he arguably embraced a version of fideism that shared much in common with thinkers such as Charron and Montaigne, he also attacks these thinkers for their skepticism.
Much like Renaissance skepticism, post-Renaissance treatments of skepticism represent a diverse set of philosophical preoccupations rather than a unified school of thought. To the extent that a central distinction between Renaissance and post-Renaissance skepticism can be identified, it could be said that most Renaissance skeptics place a greater emphasis on debates concerning the criterion of religious truth, whereas most post-Renaissance skeptics place a greater emphasis on the application of skeptical arguments to epistemological considerations. Moreover, most Renaissance skeptics, much like their ancient counterparts, are explicitly concerned with the practical implications of skepticism. In other words, many of the representative figures of Renaissance skepticism are concerned not only with identifying our epistemic limitations, but with living well in response to those limits.
9. References and Further Reading
a. Primary Sources
Brués, Guy de. Dialogues: Critical Edition with a Study in Renaissance Scepticism and Relativism. Translated and edited by Panos Paul Morphos. Johns Hopkins Studies in Romance Literatures and Languages. Baltimore: Johns Hopkins Press, 1953.
Translation and commentary on Guy de Brués’ Dialogues in English.
Castellio, Sebastien. Concerning Heretics. Trans. and ed. Roland H. Bainton. New York: Columbia University Press, 1935. English translation.
Castellio, Sebastien. De Arte Dubitandi et confidendi Ignorandi et Sciendi. Ed. Elisabeth Feist Hirsch. Leiden: E. J. Brill, 1981.
Charron, Pierre. De la Sagesse. Corpus des Oeuvres de Philosophie en Langue Française, revised by B. de Negroni. Paris: Fayard, 1986.
Cicero, Marcus Tullius. Academica and De Natura Deorum. Loeb Classical Library, trans. H. Rackham. Cambridge: Harvard University Press, 1933.
Erasmus, Desiderius, and Martin Luther, translated by Ernst F. Winter. Discourse on Free Will. Milestones of Thought. New York: Continuum, 1997.
English translation of Erasmus’ Free Will and Luther’s Bondage of the Will.
Henry of Ghent, “Can a Human Being Know Anything?” and “Can a Human Being Know Anything Without Divine Illumination?” in The Cambridge Translations of Medieval Philosophical Texts, Volume 3: Mind and Knowledge. Edited and translated by R. Pasnau, Cambridge University Press, 2002, pp. 93-135.
John Buridan, “John Buridan on Scientific Knowledge,” in Medieval Philosophy: Essential Readings with Commentary, G. Klima (ed.), Blackwell, 2007, pp. 143-150.
John Duns Scotus, Philosophical Writings, ed. and trans. Allan B. Wolter, Cambridge: Hackett Publishing Co., 1987.
John of Salisbury, The Metalogicon of John of Salisbury, trans. with an introduction by Daniel McGarry, Berkeley: University of California Press, 1955.
The translation used in the quotations above, parenthetically cited as “ML” followed by page number.
John of Salisbury, Policraticus: Of the Frivolities of Courtiers and the Footprints of Philosophers, ed. C.J. Nederman, Cambridge: Cambridge University Press, 1990.
The translation used in the quotations above, parenthetically cited as “PC” followed by page number.
Montaigne, Michel de. Les Essais. Ed. Pierre Villey and V.-L. Saulnier. 3 vols., 2nd ed. Paris : Presses Universitaires de France. 1992.
The French edition used in the quotations above, parenthetically cited as “VS” followed by page number.
Montaigne, Michel de. Œuvres complètes. Ed. Albert Thibaudet and Maurice Rat. Paris: Gallimard, Bibliothèque de la Pléiade, 1962.
Montaigne, Michel de. The Complete Essays of Montaigne. Translated by Donald M. Frame. Stanford: Stanford University Press, 1943.
The English translation used in the quotations above, parenthetically cited as “F” followed by page number.
Montaigne, Michel de. The Complete Works of Montaigne: Essays, Travel Journal, Letters. Trans. Donald Frame. Stanford, Calif.: Stanford University Press, 1967.
Naya, Emmanuel. “Traduire les Hypotyposes pyrrhoniennes : Henri Estienne entre la fièvre quarte et la folie chrétienne.” In Le Scepticisme Au XVIe Et Au XVIIe Siècle. Ed. Pierre-François Moreau. Bibliothèque Albin Michel Des Idées. Paris: A. Michel, 2001.
Includes a French translation of Henri Estienne’s introductory essay to his translation of Sextus Empiricus’ Outlines of Skepticism.
Nicholas of Autrecourt, His Correspondence with Master Giles and Bernard of Arezzo: A Critical Edition and English Translation by L.M. de Rijk, Leiden: E.J. Brill, 1994.
Popkin, Richard H., and José Raimundo Maia Neto. Skepticism: An Anthology. Amherst, N.Y.: Prometheus Books, 2007.
Includes an English translation of Hervet’s introductory essay to his translation of Sextus’ Adversus Mathematicos, and excerpts from Gianfrancesco Pico della Mirandola’s Examination of the Vanity of the Doctrines of the Gentiles and of the Truth of the Christian Teaching (1520) among other sources.
Sanches, Francisco, That Nothing Is Known = (Quod Nihil Scitur), introduction, notes, and bibliography by Elaine Limbrick, and text established, annotated, and translated by D. F. S. Thomson. Cambridge; New York: Cambridge University Press, 1988.
Critical edition with an extensive introduction. Latin text and English translation. The English translation is parenthetically cited above as “TNK” followed by page number.
Sextus Empiricus, Outlines of Pyrrhonism, Loeb Classical Library, trans. R.G. Bury, Cambridge: Harvard University Press, 1933.
Sextus Empiricus, Adversus Mathematicos, Loeb Classical Library, trans. R.G. Bury, Cambridge: Harvard University Press, 1935.
Estienne and Hervet’s introductions to their translations of Sextus Empiricus from the 1569 edition available in facsimile. Hervet’s introduction begins on the second unpaginated page, and Estienne’s introduction begins on p. 400: https://gallica.bnf.fr/ark:/12148/bpt6k109336w.r=estienne%20hervet?rk=21459;2.
b. Secondary Sources
Brahami, Frédéric. Le scepticisme de Montaigne. Paris : Presses Universitaires de France, 1997.
Brush, Craig B. Montaigne and Bayle: Variations on the Theme of Skepticism. The Hague: Martinus Nijhoff, 1966.
Carraud, Vincent, and J.-L. Marion, eds. Montaigne: scepticisme, métaphysique, théologie. Paris : Presses Universitaires de France, 2004.
La Charité, Raymond C. The Concept of Judgment in Montaigne. The Hague: Martinus Nijhoff, 1968.
Copenhaver, B. P., & Schmitt, C. B., Renaissance Philosophy. Oxford: Oxford University Press, 1992.
Eva, Luiz, “Montaigne et les Academica de Cicéron,” Astérion, 11, 2013.
Floridi, Luciano. Sextus Empiricus: The Transmission and Recovery of Pyrrhonism. American Classical Studies. New York: Oxford University Press, 2002.
Foglia, Marc. Montaigne, pédagogue du jugement. Paris : Classiques Garnier, 2011.
Friedrich, Hugo. Montaigne. Edited by Philippe Desan. Translated by Dawn Eng. Berkeley: University of California Press, 1991.
Funkenstein, Amos. “Scholasticism, Scepticism, and Secular Theology,” in R. Popkin and C. Schmitt (eds.), Scepticism from the Renaissance to the Enlightenment, Wiesbaden: Harrassowitz. 1987 : 45-54.
Giocanti, Sylvia. Penser l’irrésolution : Montaigne, Pascal, La Mothe Le Vayer: Trois itinéraires sceptiques. Paris: Honoré Champion, 2001.
Grellard, Christophe. Jean de Salisbury et la renaissance médiévale du scepticisme. Paris: Les Belles Lettres, 2013.
Hartle, Ann. Michel de Montaigne: Accidental Philosopher. Cambridge: Cambridge University Press, 2003.
Hartle, Ann. “Montaigne and Skepticism” in The Cambridge Companion to Montaigne, ed. Langer, Ullrich. Cambridge: Cambridge University Press, 2005.
Hartle, Ann. Montaigne and the Origins of Modern Philosophy. Evanston: Northwestern University Press, 2013.
Lagerlund, Henrik. Rethinking the History of Skepticism: The Missing Medieval Background. Studien Und Texte Zur Geistesgeschichte Des Mittelalters; Bd. 103. Leiden ; Boston: Brill, 2010.
Lagerlund, Henrik. Skepticism in Philosophy: A Comprehensive, Historical Introduction. New York: Routledge, 2020.
Larmore, Charles. “Un scepticisme sans tranquillité: Montaigne et ses modèles antiques.” In Montaigne: scepticisme, métaphysique, théologie, edited by V. Carraud and J.- L. Marion, 15-31. Paris: Presses Universitaires de France, 2004.
Limbrick, Elaine, “Was Montaigne Really a Pyrrhonian?” Bibliothèque d’Humanisme et Renaissance 39, no. 1 (1977): 67-80.
Maia Neto, José Raimundo. “Academic Skepticism in Early Modern Philosophy.” Journal of the History of Ideas 58, no. 2 (1997): 199-220.
Maia Neto, José Raimundo & Richard H. Popkin (ed.), Skepticism in Renaissance and Post-Renaissance Thought: New Interpretations. Humanity Books, 2004.
Maia Neto, José Raimundo. “Le probabilisme académicien dans le scepticisme français de Montaigne à Descartes.” Revue Philosophique de la France et de l’Étranger 203, no. 4 (2013): 467-84.
Maia Neto, José Raimundo. “Scepticism” in Lagerlund, Henrik, and Benjamin Hill eds. Routledge Companion to Sixteenth Century Philosophy. New York: Routledge, 2017.
Naya, Emmanuel. “Renaissance Pyrrhonism, a Relative Phenomenon,” In Maia Neto J.R., Paganini G. (eds) Renaissance Scepticisms. International Archives of the History of Ideas, vol 199. Dordrecht: Springer, 2009.
O’Neill, Eileen. “Justifying the Inclusion of Women in Our Histories of Philosophy: The Case of Marie de Gournay.” In The Blackwell Guide to Feminist Philosophy (eds L.M. Alcoff and E.F. Kittay), 2007.
Paganini, G., & Maia Neto, J. R., eds., Renaissance Scepticisms. International Archives of the History of Ideas, vol 199. Dordrecht: Springer, 2009.
Paganini, Gianni. Skepsis. Le débat des modernes sur le scepticisme. Montaigne – Le Vayer – Campanella – Hobbes – Descartes – Bayle. Paris: J. Vrin, 2008.
Perler, Dominik. Zweifel und Gewissheit: Skeptische Debatten im Mittelalter. Frankfurt am Main: Klosterman, 2006.
Popkin, R. H., The History of Scepticism from Savonarola to Bayle. Oxford: Oxford University Press, 2003.
Prat, Sebastien. “La réception des Académiques dans les Essais: une manière voisine et inavouée de faire usage du doute sceptique” in Academic Scepticism in Early Modern Philosophy, eds. Smith, Plínio J., and Sébastien Charles Archives Internationales D’histoire des Idées; 221. Cham, Switzerland: Springer, 2017: 25-43.
Schiffman, Zachary S. “Montaigne and the Rise of Skepticism in Early Modern Europe: A Reappraisal,” Journal of the History of Ideas, Vol. 45, No. 4 (Oct. – Dec. 1984).
Smith, Plínio J., and Sébastien Charles. Academic Scepticism in Early Modern Philosophy. Archives Internationales D’histoire des Idées; 221. Cham, Switzerland: Springer, 2017.
Schmitt, C. B., Gianfrancesco Pico Della Mirandola (1469–1533) and His Critique of Aristotle. The Hague: Nijhoff, 1967.
Schmitt, C. B., Cicero Scepticus: A Study of the Influence of the Academica in the Renaissance. The Hague: Nijhoff, 1972.
Schmitt, C.B., “The Rediscovery of Ancient Skepticism in Modern Times,” in Myles Burnyeat, ed., The Skeptical Tradition. Berkeley: University of California Press, 1983: 226-37.
Sève, Bernard. Montaigne: Des Règles Pour L’esprit. Philosophie D’aujourd’hui. Paris: PUF, 2007.
Villey, Pierre. Les Sources & L’évolution Des Essais De Montaigne. 1908. Reprint, Paris: Hachette & Cie, 1933.
Zupko, Jack. “Buridan and Skepticism.” Journal of the History of Philosophy 31, no. 2 (1993): 191-221.
George Berkeley announces at the very outset of Three Dialogues Between Hylas and Philonous that the goals of his philosophical system are to demonstrate the reality of genuine knowledge, the incorporeal nature of the soul, and the ever-present guidance and care of God for us. He will do this in opposition to skeptics and atheists.
A proper understanding of science, as Berkeley sees it, will be compatible with his wider philosophy and will help achieve its goals. His project is not to rail against science or to add to the scientific corpus. Quite the contrary: he admires the great scientific achievements of his day, and he has no quarrel with the predictive power, and hence the usefulness, of its best theories.
His project is to understand the nature of science, including its limits and what it commits us to. A proper understanding of science will show, for example, that it carries no commitment to material objects or efficient causation. Exposing these and other philosophical prejudices will undercut many of the assumptions leading to skepticism and atheism.
In exploring the nature of science, Berkeley provides insights into several of the central topics of what is now called the philosophy of science. They include the nature of causation, the nature of scientific laws and explanation, the nature of space, time, and motion, and the ontological status of unobserved scientific entities. Berkeley concludes that causation is mere regularity; laws are descriptions of fundamental regularities; explanation consists in showing that phenomena are to be expected given the laws of nature; absolute space and time are inconceivable; and at least some of the unobserved entities in science do not exist, though they are useful to science. Each of these topics is explored in some detail in this article.
Philosophy of Science emerged as a specialized academic discipline in the mid-20th century, but philosophers as early as Plato and Aristotle developed views about science that we recognize today as addressing central topics of the discipline. Philosophy of Science addresses the nature of science, including its methods, goals, and institutions. Recent Philosophy of Science has drawn heavily on the history and sociology of science (Marcum). Typical topics are the structure of explanation, theories, confirmation, the objectivity of science, the role of values in science, and the difference between science and pseudoscience. It is especially important to reflect on science since it appears to give us our very best examples of knowledge and our best tools for understanding nature.
Periods of significant scientific change, such as the introduction of general relativity and quantum mechanics or Darwin’s theory of evolution, have provoked, and continue to provoke, heightened philosophical reflection. George Berkeley had the good fortune of living during one of these periods. Through a critique of Scholasticism (an amalgam of Aristotelianism and Catholicism), what is now recognized as the beginning of modern science emerged. The period ran roughly from 1550 to 1750. Its luminaries included Copernicus, Kepler, Galileo, Descartes, Boyle, Torricelli, and Newton. Berkeley had a broad understanding of the science of his day, including what we now call the psychology of visual perception, medicine, biology, chemistry, and physics. He also had a keen grasp of the mathematics of his time.
Building or elaborating scientific theories was not Berkeley’s goal. He had no quarrel with the empirical content of the best theories, and he welcomed their usefulness in bettering our lives. His project was to critique mistaken philosophical interpretations and mistaken popularizations of some theories, especially those that led to skepticism and atheism. His philosophical system is largely a reaction to materialistic mechanism as espoused by many scientists and philosophers, in particular Descartes and Locke. Berkeley’s critique rejects a key provision of the theory: that an ordinary object (an apple or a chair) is a material substance, an unthinking something that exists independently of minds. Berkeley’s ontology includes only spirits (minds) and ideas. Our senses are to be trusted, and all physical knowledge comes by way of experience (3Diii 238, DM §21).
This is an oversimplification, but here is not the place to consider his arguments and qualifications for immaterialism (Flage).
In the course of his reaction to materialistic mechanism and other scientific theories, Berkeley made important and novel contributions to understanding concepts crucial to the nature of science: for example, causation, laws of nature, explanation, the cognitive status of theoretical entities, and space and time. His contributions to these topics are examined below. Berkeley’s reflection on science occurs throughout his many works, from the Essay on Vision to Siris (S), but the bulk of his thought is contained in The Principles of Human Knowledge (PHK), Three Dialogues Between Hylas and Philonous (3D), and De Motu (DM). His views on these topics continued to evolve throughout his writings, becoming more sensitive to actual scientific practice.
2. Causation
a. Physical Causation
Causal claims occur throughout ordinary language and science. Overcooking caused the chicken to be tough. Salt caused the freezing point of the water to fall. Diabetes is caused by insulin insufficiency. Causes, as commonly understood, make their effects happen. Many verbs in English, such as ‘produce’ and ‘bring about’, capture the “make happen” feature of causation.
Berkeley’s account of causation plays a central role in his philosophical system and his understanding of the methods, goals, and limits of science. Take the example of fire causing water to boil. When one examines the case, according to Berkeley, ideas of yellow, orange, and red in shimmering motion are followed by ideas of a translucent bubbly haze. In short, one set of ideas is accompanied by another set of ideas. The crucial point is that no “making happen” or “producing” is available to the senses.
All our ideas, sensations, or the things which we perceive, . . . are visibly inactive: there is nothing of power or agency included in them. So that one idea or object of thought cannot produce or make any alteration in another. To be satisfied of the truth of this, there is nothing else requisite but a bare observation of our ideas. (PHK §25)
The basic argument is as follows:
(a) Efficient causes are active.
(b) Ideas are inert (inactive).
(c) Therefore, ideas are not efficient causes.
The justification for (b) is (d):
(d) Ideas, when observed, are found to be inert.
Ideas do undergo changes, and we do have ideas of motion, but none of this counts as activity for Berkeley. What constitutes activity in an idea? Could there not be some feature or aspect of ideas that is hidden from sense, some feature that is active? Berkeley’s answer is no:
(e) Ideas exist only in the mind.
(f) Therefore, there is nothing in them but what is perceived.
Causation in the physical world amounts to one set of ideas regularly followed by another set of ideas. Berkeley uses a variety of terms to mark the contrast with efficient causation: ‘natural causes,’ ‘second causes,’ ‘material causes,’ ‘instruments,’ ‘physical causes,’ and ‘occasional causes’ (S §160, 245; PC Berkeley to Johnson §2). There is no necessary connection between the relata in a causal relation. Berkeley suggests that a better way to conceive of the regularity among ideas in a “causal” relation is as the relation of sign to thing signified. Fire is a sign of boiling water. And signs do not make happen what they signify. The appropriateness of the sign/thing-signified relation is further explored in a later section.
This account does not fit our common understanding of causation. Berkeley recognizes this and has no desire to propose that we speak differently in ordinary affairs. In fact, he often lapses into the vernacular. Our common speech presents no problems in ordinary practical affairs, but the philosopher, when being careful, knows that physical causes do not make their effects happen.
b. Efficient Causation
There is a domain where real or efficient causes occur, as opposed to the mere physical regularities described above. When one intends to raise her arm and by force of will raises it, this stands as an example of efficient causation. Real causation is carried out by an act of mind. Berkeley believes that in such cases we know efficient causation, which contains the activity that causation requires, even though we have no sensible idea of it.
Returning to physical causation: the regularities among ideas are created and maintained by God’s will. Although we as creatures with minds have the ability to will certain ideas, many ideas are forced upon us, independently of our will. These are caused by God.
An important consequence of the distinction between physical and efficient causes concerns what natural philosophy should and should not study. Natural philosophy should focus on understanding the world in terms of physical causes. Efficient causation is the business of theology and metaphysics. Only these disciplines should consider explanations invoking efficient causation (DM §41).
It is not known to what extent Berkeley influenced David Hume. Hume, the third member of the British Empiricists along with John Locke and Berkeley, developed a more detailed version of a regularity theory of causation. Though Berkeley denies any necessary connection between the causal relata in physical causation, he provides no account of our strong tendency to believe that the relation between the relata is more than constant conjunction. For Hume, the power or necessity in causation is produced from our experience; it is in us not in the objects themselves. He also speaks to the temporal and spatial requirements for the relation between cause and effect and considers what counts as an appropriate regularity (Lorkowski). Hume’s theory is importantly different from Berkeley’s in that he holds that all causation is mere regularity. Acts of the will are no exception. Using Berkeley’s terminology, on Hume’s account, all causes are physical causes.
3. Laws of Nature
The early account of laws of nature in The Principles of Human Knowledge treats them as the regularities discussed under causation:
The ideas of sense . . . have likewise a steadiness, order and coherence, and are not excited at random, . . . but in a regular train or series . . . Now the set rules, or established methods, wherein the Mind we depend on excites in us the ideas of Sense, are called the laws of nature; and these we learn by experience, which teaches us that such and such ideas are attended with such and such other ideas, in the ordinary course of things (PHK §30).
The same account is repeated in the Three Dialogues. Laws are “no more than a correspondence in the order of Nature between two sets of ideas, or things immediately perceived” (3Diii 24).
Here laws of nature are low-level empirical generalizations that assert a regularity between phenomena or aspects of phenomena. They are learned by experience by natural philosophers and ordinary people alike and are found useful in guiding their lives. Berkeley emphasizes that the relation between the relata in a law of nature is not a necessary relation. God has conjoined smoke with fire, but he could have conjoined fire with a high-frequency tone or anything else He pleased. Though Berkeley is not explicit on this matter, it does appear that laws of nature are not restricted to a universal logical form, that is, the form: whenever phenomena of type A occur, phenomena of type B occur without exception. Statements expressing probabilities count as laws as well. So, both “breeding albatrosses lay one egg per year” and “most people with lung cancer are smokers” are laws. Berkeley persistently stresses that the important feature of laws is that they are useful. For Berkeley, this usefulness attests to the wisdom and benevolence of God, who created and maintains them.
In addition to laws of modest scope whose terms refer to what is immediately perceived, “there are certain general laws that run through the whole chain of natural effects . . .” (PHK §61). An example is Galileo’s Law: any body falling freely from rest to earth covers a distance proportional to the square of the time it has spent falling. These general laws, and sets of general laws such as Newton’s Laws of Motion, provide a “largeness of comprehension” that occupies the attention of the natural philosopher. They enable one to see the unity in apparently diverse phenomena: for example, in falling bodies, tides, and planetary orbits. Some very general and fundamental laws allow for the explanation of other laws.
In mechanical philosophy those are to be called principles, in which the whole discipline is grounded and contained, these primary laws of motion which have been proved by experiments, elaborated by reason and rendered universal. These laws of motion are conveniently called principles, since from them are derived both general mechanical theorems and particular explanations of phenomena (DM §36).
These more fundamental laws are no longer simple correlations or inductive generalizations perceived and learned by experience. Instead, they are laws of great generality containing theoretical terms (such as “force”) and proved by experiments.
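For concreteness, Galileo’s Law cited above can be put as a worked equation. The following is a standard modern formulation, not Berkeley’s or Galileo’s own notation: a body falling freely from rest covers a distance

```latex
d = \tfrac{1}{2}\, g\, t^{2},
\qquad\text{so}\qquad
\frac{d_{1}}{d_{2}} = \left(\frac{t_{1}}{t_{2}}\right)^{2},
```

which makes explicit that the distance covered is proportional to the square of the elapsed time, whatever the value of the constant g.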
4. Explanation
To be explained, phenomena must be “reduced to general rules” (PHK §105) or, alternatively, shown to conform to the laws of nature (PHK §61). This account is a very early version of what is now called the covering law account of explanation. The sentence describing the phenomenon or event to be accounted for is called the explanandum. The sentences describing the information that does the explaining are called the explanans. According to Berkeley’s account, the explanans must contain a law of nature (DM §37). It will typically also contain sentences describing a number of facts. Consider a simple example that would have been quite familiar to Berkeley: Suppose a pendulum oscillates with a period of 6.28 seconds. The explanandum, a period of 6.28 seconds, must be shown to be in conformity with a law. The relevant law is T = 2π√(L/g), where T is the period, L is the length of the pendulum, and g is the acceleration due to gravity (here rounded to 10 meters per second squared). If L is 10 meters, the period will be 6.28 seconds. The explanandum follows deductively from the explanans. The length of the pendulum being 10 meters, together with the law cited, explains the period being 6.28 seconds.
Explanans:
(1) T = 2π√(L/g)
(2) L = 10 meters
----------------
Explanandum:
T = 6.28 seconds
There is nothing puzzling about the period once the law and pendulum length are known. The period was to be expected.
Figure 1. Diagram of simple pendulum.
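To make the deduction concrete, here is a minimal Python sketch of the calculation, using the article’s rounded value g = 10 m/s². It also shows that the law computes equally well in either direction, from length to period and from period to length, a point that matters for the asymmetry problem discussed below.

```python
import math

g = 10.0  # acceleration due to gravity, m/s^2 (the article's rounded value)

def period_from_length(L):
    """Covering-law calculation: T = 2*pi*sqrt(L/g)."""
    return 2 * math.pi * math.sqrt(L / g)

def length_from_period(T):
    """The same law, inverted: L = g * (T / (2*pi))**2."""
    return g * (T / (2 * math.pi)) ** 2

print(period_from_length(10.0))  # ~6.28 seconds, as in the example above
print(length_from_period(6.28))  # ~10 meters: the law runs both ways
```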
An important difference between the contemporary covering law account of explanation and Berkeley’s version is that the contemporary account requires that the sentences making up the explanans, including the law(s), be true (Hempel 89-90). As discussed in the next section, Berkeley regards some laws of nature, most notably Newton’s laws of motion, as neither true nor false. They are not the sort of things that can be true or false. They are guides, calculating devices, and useful fictions. This is not to disparage them. Berkeley regards Newton’s laws as the greatest achievement in natural philosophy and a model for future science (PHK §110, S §243, 245). The role of laws is to enable us to expect what will happen, and Newton’s laws are remarkably successful at this goal.
Berkeley argues that the goal of science is not necessarily to uncover true laws, nor will true laws be better at helping us expect phenomena. The goal of mature science is to produce general laws that are easy to use, few in number, and give predictive control of a wide range of phenomena. The virtue of laws, and of the explanations they enable, is serving these practical goals. Berkeley’s insight is that true laws may be in tension with these practical virtues: true laws may be too complex, too cumbersome to apply, or too numerous to serve the practical goal of simplicity. The first objective of laws and explanations is usefulness.
The covering law account of explanation has received a range of criticisms. This is not the place to rehearse these criticisms and evaluate their force. But there is one prominent criticism that deserves consideration. Seeing how Berkeley would respond brings together his positions on causation, laws of nature, and explanation.
Consider the pendulum example again. Intuitively there is an asymmetry between explaining the period in terms of the length of the pendulum versus explaining the length in terms of the period. L explains T; T does not explain L. Yet T can be calculated from L, and L can be calculated from T. On Berkeley’s position on how explanations make phenomena intelligible, given L, T is expected, and given T, L is expected. So it appears that the covering law view of explanation cannot account for the asymmetry. The covering law view lets in too much: it sanctions “T explains L,” yet this conflicts with strong intuitions. The problem is not merely an artifact of the pendulum case. It arises with many natural laws, including Boyle’s Law, Ohm’s Law, and the laws of geometric optics, among others.
In response to this, Berkeley would insist that there are no efficient causes in nature. The alleged asymmetry is a relic of the mistaken view that the length of the pendulum causes its period, but the period does not cause the length of the pendulum. Causal relations and laws of nature describe regularities, not what makes things happen:
. . . the connexion of ideas does not imply the relation of cause and effect, but only the mark of sign and thing signified. The fire which I see is not the cause of the pain I suffer upon my approaching it. In like manner the noise that I hear is not the effect of that motion or collision . . . , but the sign thereof (PHK §65).
In customary talk, there may be an asymmetry where causes can explain effects but not vice versa, but when efficient causation is replaced with regularities between sign and thing signified, the asymmetry disappears. “Causes” can be signs of “effects” and, as in the above quotation, “effects” can be signs of “causes”. Noise is the sign of a collision.
The Berkeleyan defense of the covering law account rests on the claim that the way in which explanations make phenomena intelligible is by giving one reason to expect them or to calculate their occurrence (PHK §31, S §234). This is undoubtedly Berkeley’s official position. Carl Hempel, the leading contemporary defender of the covering law account of explanation, would agree with Berkeley on the point of explanation and on how to handle the asymmetries. The asymmetries, according to Hempel, are due to “preanalytic causal and teleological ideas” (Hempel 95). Such ideas are hardly the basis for a systematic and precise analysis of explanation.
In De Motu, Berkeley hints at a very different account of how explanations make phenomena intelligible:
For once the laws of nature have been found out, then it is the philosopher’s task to show that each phenomenon is in constant conformity with those laws, that is, necessarily follows from those principles. In that consist the explanation and solution of phenomena and assigning their cause, i.e. the reason why they take place (DM §37).
There are two issues of concern here. (1) Berkeley asserts that the explanandum must follow necessarily from the explanans. This is inconsistent with allowing statistical laws in explanations. As has been suggested, there is no reason Berkeley cannot allow them: God created and maintains the laws of nature to help us know what to expect, and their practical purpose is well served by statistical laws. (2) Much more importantly, he invokes a different rationale for how explanations make phenomena intelligible. There is a significant difference between providing grounds for expecting or calculating events and providing “the reason why they take place.” In the pendulum example, the period allows for the calculation of the length, but it does not provide the cause or reason why the length is 10 meters. That rests with the designer of the pendulum or the manufacturing process.
Perhaps Berkeley has misspoken, or is speaking not as a philosopher, or perhaps he is under the spell of the very view of causation he has rejected. If Berkeley wants to maintain the requirement that explanations tell us why events take place, he will need an account of the asymmetry discussed above, and he must provide it without appeal to efficient causation. There are ways to do this. For one, the length of the pendulum can be given a covering law explanation independently of the period, but an explanation of the period appears to require appeal to the length of the pendulum (Jobe). This suggestion, like others, needs careful development, including an account of its relevance to the larger issue of explanation. The point here is that answers to the asymmetry problem might be available that do not invoke efficient causation.
5. Theories and Theoretical Entities
a. Scientific Instrumentalism and Newtonian Forces
Much of De Motu is an argument about how to understand the status of forces in Newton’s theories of motion and gravitation. In the first section Berkeley warns the reader against “. . . being misled by terms we do not rightly understand” (DM §1). The suspect terms occur in the science of motion and fall into two groups. The first includes ‘urge,’ ‘conation,’ and ‘solicitation.’ These play no role in the best accounts of motion and have no legitimate role in physical science. They are “of somewhat abstract and obscure signification” (DM §2), and on reflection they clearly apply solely to animate beings (DM §3). The second group includes ‘force,’ ‘gravitation,’ and allied terms, and it is on this group that Berkeley’s attention is focused. He expresses a worry about these terms by way of an example. When a body falls toward the center of the earth it accelerates. Some natural philosophers are not satisfied with simply describing what happens and formulating the appropriate regularity. In addition, a cause of the acceleration is assigned: gravity.
A major motivation for Berkeley writing De Motu was to resist treating forces and gravitation as efficient causes. Some of Newton’s followers and perhaps Newton himself held this view. Given the prestige of Newton’s physics, it was particularly important for Berkeley to respond. Treating forces as efficient causes would undermine Berkeley’s immaterialism, but Berkeley is not merely defending his own philosophical territory. Regardless of one’s commitment, or lack of it, to immaterialism, Berkeley raises significant issues about forces.
One could simply argue that there are no forces, and so force-talk should be abandoned. This would certainly rid the scene of forces as causes. Much the same has happened with caloric, phlogiston, ether, and witches: the terms have disappeared from highly confirmed theories, along with any causal role assigned to the entities. Berkeley’s view is more subtle than this. His general thesis is that “force,” “gravity,” and allied terms lack the significance required to indicate the real nature of things. The terms are not meaningless, as they have a useful role to play in scientific theories, but they lack the sort of significance needed to support a realistic understanding of forces. They fail to indicate distinct entities or qualities.
Lisa Downing has detailed Berkeley’s argument for an anti-realistic understanding of forces (Downing 1996, 2005 238-249). The key premise is as follows:
P. Forces are unknown qualities of bodies, that is, unsensed.
From this he concludes:
C. Force terms (‘force,’ ‘gravity,’ ‘attraction’) fail to indicate (refer to) distinct qualities.
Though Berkeley takes P as obvious, he does have an argument for it. Forces as efficient causes are active qualities of bodies. They must be unsensed, for on careful examination all the sensed qualities of bodies are passive.
What licenses the move from P to C? Naming or referring to forces requires conceiving of forces. To conceive of physical entities requires a sense-based idea of them (Downing 2005 247).
Berkeley does not hold that all meaningful words stand for ideas. That view, often attributed to John Locke, is aggressively criticized by Berkeley (Pearce 194-196). Words need not bring a distinct idea to the speaker’s or hearer’s mind. Force terms are in fact meaningful without standing for ideas. Their significance comes from the usefulness they provide in Newtonian dynamics: a system of mathematical rules employing force terms allows for precise predictions. This is accomplished even though the terms lack the kind of significance needed to secure reference. With ‘force’ failing to name anything, forces cannot be understood realistically.
Berkeley’s examination of forces is not only destructive. He had a great appreciation of the explanatory success of Newtonian dynamics. He saw that force terms play an important role in the theory. He interprets those terms instrumentally. They do not “indicate so many distinct qualities,” but they are useful in reasoning about motion:
Force, gravity, attraction and terms of this sort are useful for reasonings and reckonings about motion . . . As for attraction, it was certainly introduced by Newton, not as a true physical quantity, but only as a mathematical hypothesis (DM §17).
Berkeley gives perspicuous illustrations of what he means by mathematical hypotheses and by being useful in reasoning. The first, which occurs just after the passage quoted above, concerns the parallelogram of forces. This mathematical technique allows for the computation of the resultant force. The resultant is not proffered as a “true physical quantity,” though it is very useful for predicting the motion of bodies (DM §18). The second illustration reminds us of how considering a curve as an infinite number of straight lines (though it is not in reality) can be of great utility. For instance, it allows a geometrical proof of the common formula for the area of a circle, A = πr², and in mechanics it is also useful to think of circular motion as “arising from an infinite number of rectilinear directions” (DM §61).
Figure 2. For numerous practical purposes a circle can be regarded as composed of many straight lines.
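Here is a minimal Python sketch of both illustrations, with hypothetical numbers chosen purely for the example (none of the values come from Berkeley). The first function computes a resultant force as a component-wise vector sum, in the spirit of the parallelogram technique; the second shows how the area of an inscribed regular polygon approaches πr² as the number of sides grows, which is the utility of treating a circle as many straight lines.

```python
import math

# Parallelogram of forces: the resultant of two forces, computed as a
# component-wise vector sum. The force values are hypothetical.
def resultant(f1, f2):
    return (f1[0] + f2[0], f1[1] + f2[1])

F1 = (3.0, 0.0)  # 3 N along x
F2 = (0.0, 4.0)  # 4 N along y
R = resultant(F1, F2)
print(R, math.hypot(*R))  # (3.0, 4.0), magnitude 5.0 N

# A circle as "many straight lines": the area of a regular n-gon inscribed
# in a circle of radius r approaches pi * r**2 as n grows.
def inscribed_polygon_area(r, n):
    return 0.5 * n * r**2 * math.sin(2 * math.pi / n)

r = 1.0
for n in (6, 60, 600):
    print(n, inscribed_polygon_area(r, n))
print(math.pi * r**2)  # the limit the useful fiction approximates
```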
b. Scientific Realism and Corpuscularianism
Corpuscularianism was the dominant theoretical framework for the physical sciences in the 17th century. The basic position is a form of atomism. Bodies are material objects existing independently of human minds and composed of minute particles (corpuscles) that are unobservable. The corpuscles’ properties are restricted to size, shape, position, and motion (the primary qualities). Corpuscles explain the properties of bodies, including their color, smell, temperature, and sound (the secondary qualities).
Given the prominence of the corpuscularian theoretical framework and Berkeley’s intimate familiarity with the works of many of the theory’s proponents (notably René Descartes, Robert Boyle, and John Locke), it is appropriate to ask how he understood the status of the framework’s fundamental entities, the corpuscles. The received view has been that Berkeley must hold instrumentalism for all theoretical entities (Popper; Warnock 202; Newton-Smith 152; Armstrong 32-34). This position is encouraged by at least two considerations: (1) When Berkeley explicitly addresses the cognitive status of theoretical entities, it is always to argue against realism; he never offers arguments for a realistic understanding of some theoretical entities. (2) Berkeley’s immaterialist maxim, esse est percipi (to be is to be perceived), was thought to be incompatible with realism about theoretical entities.
More recent scholarship attempts to show that a realistic understanding of corpuscles is compatible with Berkeley’s wider philosophical position, if not embraced by him (Downing 1995, 2005 230-235; Garber; Winkler 238-275). Berkeley’s immaterialist version of corpuscularianism must be qualified in several important ways: First, corpuscles are not bits of matter that are mind independent. They are sets of ideas just as ordinary objects are. Second, corpuscles do not cause anything, but they can be signs of things signified. Third, Berkeley does not endorse the primary/secondary quality distinction. The ideas that make up corpuscles have the same range of qualities as the ideas that make up ordinary objects. This does not prohibit him from recognizing that the primary qualities may be more useful in formulating laws with predictive power. Fourth, corpuscles are in principle sensible. This qualification was accepted by many practicing corpuscularian scientists. Sensing corpuscles is neither logically nor scientifically impossible. It allows a response to the charge that esse est percipi rules out a realistic account of corpuscles.
At the beginning of the Principles, Berkeley spells out his account of ordinary physical objects: apples, stones, books, and so forth. A group of ideas are “observed to accompany each other,” given a name, and regarded as one thing (PHK §1). An apple has a certain odor, color, shape, and texture associated with it. Berkeley immediately recognizes a problem. If things are sets or bundles of ideas, what happens to the existence of things when they are not sensed? “The table I write on I say exists; that is, I see and feel it: and if I were out of my study I should say it existed; meaning thereby that if I was in my study I might perceive it . . .” (PHK §3). The counterfactual account is not needed just to explain the continuity of physical objects when unsensed. Apples have a backside and a core. When an apple is held in one’s hand, only a part of it is seen. But under certain conditions, according to Berkeley, one would see the backside and the core. Consider an apple that has fallen from a tree and rolled under leaves, never to be sensed by anyone. Quite plausibly there are such apples. Again, Berkeley can use his counterfactual analysis to deal with their existence: if one were walking through the orchard and removed the leaves, she would perceive the apple. This account of the continuity of ordinary objects is clear, but unfortunately it appears to violate common sense, something Berkeley claims to champion. Berkeley’s table goes in and out of existence. To say he would see it when he enters his study is not to say it exists when he is absent from his study. Berkeley sees this as problematic and considers various approaches to continuity in his writings. There is disagreement among scholars on what Berkeley’s preferred position is and on what position fits best with the core principles of his immaterialism (Pitcher 163-179; Winkler 207-244).
In the Three Dialogues, Berkeley hints at a position that both elaborates the counterfactual account and speaks directly to what entities actually exist. Hylas, the spokesperson for materialism, claims that immaterialism is incompatible with the scriptural account of creation. Everything exists eternally in the mind of God; hence everything exists from eternity. So how can entities both exist from eternity and be created in time? Berkeley agrees with Hylas that nothing is new or begins in God’s mind. The creation story must be relativized to finite minds. What actually exists is what God has decreed to be perceptible in accord with the laws of nature, and he has made his decrees in the order of the biblical account. If finite minds had been present, they would have had the appropriate perceptions (3D iii 253).
Obviously, God has decreed that apples are perceivable by finite minds. Given the laws of nature, the core, the backside, and buried apples would be perceived if one were in the right location. Once God has decreed that something is perceivable, the relevant counterfactuals are supported by the laws of nature, which God created and maintains.
Berkeley’s account is situational. It depends on the observer being in the right place at the right time, with no barriers interfering with the light and with the observer’s visual faculties in good working order. If corpuscles exist, God has decreed that they are observable under certain conditions. Perhaps corpuscles are analogous to the apple under the leaves: though neither has been observed, both are observable in principle. Observing the buried apple requires removing the leaves. Observing corpuscles requires being in the right place with a sufficiently powerful microscope. It is not required that the appropriate microscope ever be invented; economic conditions, for example, may prevent its development. What is required is that such a microscope is scientifically possible.
The analogy is not perfect. First, in the 18th century, some apples had been observed; no corpuscles had been observed. Second, a special apparatus is not required to see apples. Seeing corpuscles requires a very powerful microscope.
The fact that apples have generic observability (some apples have been observed) whereas no corpuscles have been observed will be damning only if it provides a reason for thinking corpuscles inconceivable. As discussed below, it does not. The need for a special apparatus in the case of corpuscles can also be answered. Surely eyeglasses are a permissible apparatus. The principles by which light microscopes work are known, and they work in basically the same way eyeglasses do. Microscopes do not merely enable one to detect some entity or see its effects; they, like eyeglasses, enable one to see the entity.
This raises the question of how corpuscles can be treated realistically when forces cannot. In both cases, they are unsensed. There are two important differences for Berkeley: (1) forces are unperceivable in principle whereas corpuscles are not; (2) corpuscles can be imagined, and forces cannot be. For Berkeley, imagining is a kind of inner perceiving. Images are constructed by us from ideas that are copies of ideas originally “imprinted on the senses” (P §1). One can imagine elephants with train wheels for legs moving about on tracks. Similarly, scientists can imagine corpuscles as tiny objects with a certain shape, size, and texture. Berkeley does not think a construction of any sort is available for forces (DM §6). So, though no corpuscles have been perceived, they are conceivable, and the term ‘corpuscle’ is not without meaning.
The textual evidence for Berkeley endorsing corpuscularianism comes from the Principles (§60-66), where Berkeley answers a particular objection to his philosophy. What purpose do the detailed mechanisms of plants and animals serve when those mechanisms are ideas caused by God and of no causal power? Similarly, why the inner wheels and springs of a watch? Why does God not simply have the hands turn appropriately without internal complexity?
Berkeley’s answer is that God could do without the inner mechanisms of watches and nature, but he chooses not to, so that their behavior is consonant with the general laws that run throughout nature. These laws, manageable in number, have been created and maintained by God to enable us to explain and anticipate phenomena. A world without internal mechanisms would be a world where the laws of nature would be so numerous as to be of little use.
Berkeley describes the mechanisms as “elegantly contrived and put together” and “wonderfully fine and subtle as scarce to be discerned by the best microscope” (P §60). Admittedly, he does not explicitly mention corpuscularian mechanisms, but Garber (182-184) gives several reasons for thinking Berkeley included them. Nowhere does Berkeley deny the existence of the subtle mechanisms or suggest that they should be treated instrumentally. His descriptions of the mechanisms often mirror those of John Locke speaking of corpuscles. Perhaps most importantly, if the science of Berkeley’s day is to explain various phenomena, including heat, combustion, and magnetism, it must refer to hidden mechanisms, including corpuscles.
Siris, Berkeley’s last major work, provides further textual support for corpuscularian realism. It is a peculiar work: much of it is devoted to praising the medicinal virtues of tar water (a mixture of pine tar and water) and to explaining the scientific basis for its efficacy. The latter project explores parts of 18th century chemistry, drawing on a number of corpuscularian hypotheses. The key point is that Berkeley never raises anti-realistic concerns about the relevant entities, and he refrains even while affirming his immaterialism and pointedly repeating the instrumental account of Newtonian forces found in De Motu (Downing 205).
Figure 3: Cartesian diagram showing how screw-shaped particles accounted for magnetism.
Berkeley’s familiarity with the advances in microscopy provides further indirect support for immaterialistic corpuscularianism. Berkeley knew that there were many entities that were unobservable at one time and later became observable. There was no reason to believe that progress in microscope technology would not continue revealing further mechanisms. In fact, some of Locke’s contemporaries believed that microscopes would improve to a point where corpuscles could be seen.
The general point, one supporting realism, is that mere current unobservability does not speak against realism. To the contrary, the progressive unveiling of nature supports realism.
If Berkeley is a scientific realist about corpuscles, aether, and other entities, this might explain his lack of an argument for realism. He thought that all that was valuable in the best science was compatible with immaterialism. Within immaterialism, realism about entities is perhaps regarded as the norm; the outlier is Newtonian forces, which require special argument.
c. Absolute Space and Motion
Absolute motion and absolute space are understood neither realistically nor instrumentally by Berkeley. He recommends that natural philosophers dismiss the concepts: relative space and motion will more than adequately serve the purposes of physics. The debate about absolute motion and space has a long and complex history. Berkeley’s critique is often regarded as an anticipation of that of Ernst Mach (Popper).
According to Newton, absolute space “. . . in its own nature and without regard to anything external, always remains similar and immovable.” Absolute space is not perceivable. It is known only by its effects. It is not a physical object or a relation between physical objects. It is a “container” in which motions occur. Absolute motion is the motion of a physical object with respect to absolute space. Relative space, as Berkeley understood it, is “. . . defined by bodies; and therefore, an object of sense” (DM §52). Relative motion requires at least two bodies. One body changes its direction and distance relative to another body (DM §58). If all bodies were annihilated but one, it could not be in motion.
Newton had many reasons, including theological ones, for endorsing absolute space. In Newtonian physics a special frame of reference must be stipulated in order to apply the laws of motion. There are many possible frames of reference—the earth, the sun, our galaxy, and so on. Are they all equally adequate? A falling object will have a different acceleration and trajectory depending on the chosen reference frame. The differences may be slight and of minimal practical importance, but they present a significant theoretical problem. If Newton’s laws are to apply in every reference frame, various forces will need to be postulated from frame to frame. This appears ad hoc and leads to great complexity. To blunt the problem, Newton thought a privileged frame was needed—absolute space (Nagel 204-205).
Berkeley argued against Newton’s position from his early writings in The Notebooks, The Principles of Human Knowledge, and De Motu. As with forces, he wanted to reject absolute space as an efficient cause, but he also had theological motivations. He found the view that absolute space necessarily exists, is uncreated, and cannot be annihilated, abhorrent. It put absolute space in some respects on the level of God. Nevertheless, Berkeley’s arguments against absolute space do not involve theological principles. The focus here is on the critique in De Motu, Berkeley’s last and most thorough treatment of the topic.
Berkeley has two lines of criticism of absolute space and, in turn, absolute motion. The first is a general argument from his theory of language; the second responds to Newton’s demonstration of absolute space. On the first line of criticism, imagine all bodies in the universe being destroyed. Supposedly what remains is absolute space. All its qualities (infinite, immovable, indivisible, insensible, and without relation and distinction) are negative qualities (DM §53). There is one exception: absolute space is extended, a positive quality. But, Berkeley asks, what kind of extension can be neither measured nor divided nor sensed nor even imagined? He concludes that absolute space is pure negation, a mere nothing. The term “absolute space” fails to refer to anything, since it is neither sensible nor imaginable (DM §53). This reasoning is similar to the argument against forces, though absolute space, unlike force, has no instrumental value in theorizing.
In the second line of criticism, two thought experiments of Newton designed to demonstrate the existence of absolute space and motion are examined. Though Newton admitted that absolute space was insensible, he thought it could be known through its effects. It was essential that Berkeley take up these experiments. Even though the first line of criticism showed, if cogent, that ‘absolute space’ fails to name anything in nature, further argument was required to show that it was not needed, even instrumentally, for an adequate physical account of motion.
The first thought experiment involves two globes attached by a cord spinning in circular motion. No other physical bodies exist. There is no relative motion of the globes but there is a tension in the cord. Newton believes the tension is a centrifugal effect and is explained by the globes being in motion with respect to absolute space. Berkeley’s response is to deny the conceivability of the experiment. The circular motion of the globes “cannot be conceived by the imagination” (DM §59). In other words, given Newton’s description of the experiment there can be no motion of the globes. Berkeley then supposes that the fixed stars are suddenly created. Now the motion of the globes can be conceived as they approach and move away from different heavenly bodies. As for the tension in the cord, Berkeley does not speak to it. Presumably, there is no tension or motion until the stars are created.
In the much-discussed second thought experiment, a bucket half-filled with water is suspended from a tightly twisted cord. In Phase 1 the bucket is released and begins spinning; the surface of the water remains a plane, and the sides of the bucket accelerate relative to the water. In Phase 2 the rotating water catches up with the bucket sides and is at rest relative to them; the surface of the water is now concave, having climbed the sides of the bucket. In Phase 3 the bucket is stopped; the water remains concave and is accelerated relative to the sides of the bucket. In Phase 4 the water ceases to rotate and is at rest relative to the sides.
On Newton’s understanding, the shape of the water does not depend on the water’s motion relative to the sides of the bucket. It is a plane in Phase 1 and Phase 4 and concave in Phase 2 and Phase 3. However, the concave shape of the water demands explanation. A force must be responsible for it. According to his second law (the force acting on an object is equal to the mass of the object times its acceleration), a force indicates an acceleration. Since the acceleration is not relative to the bucket sides, it must be relative to absolute space (Nagel 207-209).
Figure 4: Relevant phases in the bucket experiment.
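Newton’s inference here can be put schematically. The following is a minimal formalization of the reasoning just described, added only for clarity; the notation is modern and appears in neither Newton’s nor Berkeley’s texts:

\[
F = ma, \qquad \text{so } F \neq 0 \ \Rightarrow\ a \neq 0 \quad (m > 0).
\]

The concave surface signals a nonzero force on the water and hence a nonzero acceleration, which Newton then refers to absolute space.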
Berkeley has a response. Given a body moving in a circular orbit, its motion at any instant is the result of two motions: one along the radius and one along the tangent of the orbit. The concave shape of the water in Phase 2 is due to an increase of the tangential forces on the particles of water without a corresponding force along the radius. Though Berkeley’s account of the deformation of the water by factors internal to the bucket system is an appropriate strategy for undermining Newton (showing that absolute space is unnecessary), it fails because his alternative explanation does not in fact correctly explain the deformation (Suchting 194-195; Brook 167-168).
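The shortfall in Berkeley’s proposal can be seen in the standard decomposition of circular motion. The following is textbook kinematics in modern polar coordinates, supplied for illustration; it is not Berkeley’s own notation:

\[
\vec{v} = \dot{r}\,\hat{r} + r\dot{\theta}\,\hat{\theta}, \qquad a_{\text{centripetal}} = \frac{(r\dot{\theta})^{2}}{r} \quad \text{(directed inward along the radius)}.
\]

Water rotating with the bucket moves at a constant radius, so its deformation requires a net inward radial force supplying the centripetal acceleration; an increase in tangential force alone, as in Berkeley’s proposal, does not provide one.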
Following his “solution” to the bucket experiment, Berkeley points out that, given relative space, a body may be in motion relative to one frame of reference and at rest with respect to another. To determine true motion or rest, to remove ambiguity, and to serve the purposes of natural philosophers in achieving a widely applicable account of motion, the fixed stars, regarded as at rest, will serve admirably. Absolute space will not be needed (DM §64).
The fixed stars are not explicitly invoked to account for the centrifugal effect in the bucket experiment as they were in the two globes experiment, but they are a promising solution available to Berkeley. Karl Popper and Warren Asher, among others, assume that Berkeley understands this as a cogent response to the bucket experiment (Popper 232; Asher 458).
d. General Anti-Realism Arguments
In two very brief passages, one in De Motu and one in Siris, Berkeley appears to offer arguments that would undermine realism not only for corpuscles but for all theoretical entities. These arguments are difficult to interpret, given that they are not amplified in any other works. They are intriguing because they hint at widely discussed issues in contemporary philosophy of science.
Berkeley briefly examines a pattern of inference, the hypothetico-deductive method, commonly used to justify theoretical hypotheses. The pattern of inference, as he understands it, is to derive certain consequences, C, from a hypothesis, H. If the consequences are borne out (observed to occur), they are evidence for H. Berkeley expresses skepticism that the method allows for the discovery of “principles true in fact and nature” (S §228). He defends his position by making a logical point and giving an example: if H implies C, and H is true, then one can infer C. But from H implies C, and C, one cannot infer H. The Ptolemaic system of epicycles has as a consequence the movements of the planets. This, however, does not establish the truth of the Ptolemaic system.
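Berkeley’s logical point is the familiar contrast between modus ponens and the fallacy of affirming the consequent. The schemas below are a standard textbook presentation, added for clarity; they are not Berkeley’s notation:

\[
\frac{H \rightarrow C \qquad H}{C} \ \ \text{(valid: modus ponens)}
\qquad\qquad
\frac{H \rightarrow C \qquad C}{H} \ \ \text{(invalid: affirming the consequent)}
\]

Epicycles entail the observed planetary motions, but the observed motions do not entail epicycles; a rival hypothesis might entail them equally well.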
Berkeley’s description of the hypothetico-deductive method is overly simplified. In actual scientific practice, many factors are weighed in accepting a hypothesis, including the number of positive predictions, the existence of negative predictions, the riskiness of the predictions, the plausibility of competing hypotheses, and the simplicity of the hypothesis. Nevertheless, even in its most refined form the method does not guarantee the truth of the hypothesis under consideration. If this is Berkeley’s point, it is well taken; a certain caution is warranted. But if anti-realism is to follow from the lack of certainty that the hypothesis is true, additional argument is required, including an explanation of how corpuscularianism escapes the same anti-realism.
The passage is important in another regard: it reinforces Berkeley’s pragmatic understanding of explanation. Though the Ptolemaic system is not “true in fact”, it “explained the motions and appearances of the planets” (S §238). Whether true or not, it has significant predictive power; it helps us anticipate how the planets will move.
A fascinating and complex passage in De Motu (section 67) has been interpreted by at least one commentator as offering an argument for instrumentalism based on the underdetermination of theory by data (Newton-Smith). For any theory, T, there is another theory, T*. T and T* are both about the same subject matter, logically incompatible, and fit all possible evidence. This lands us in skepticism: which theory is true is beyond our grasp. Berkeley cannot accept this result. A chief motivation for his philosophical system is to avoid skepticism. Skepticism, for Berkeley, is the thesis that our sense experience is not reliable: it is insufficient to determine the true nature of physical reality and often outright misleads us as to that reality. According to the underdetermination thesis, despite complete observational evidence (evidence provided by the senses), the correct theory still cannot be sorted out.
But given instrumentalism, the skeptical consequences of the underdetermination thesis can be avoided. Since theories are understood as calculating devices, not as sets of propositions that are true or false, logical incompatibility does not arise, and skepticism is avoided as well.
In an effort to strengthen his instrumental account of forces, Berkeley does appear to offer an underdetermination argument. “…great men advance very different opinions, even contrary opinions…and yet in their results attain the truth” (DM §67). He provides an illustration: When one body impresses a force on another, according to Newton, the impressed force is action alone and does not persist in the body acted upon. For Torricelli, the impressed force is received by the other body and remains there as impetus. Both theories fit the observational evidence.
A sketch of one example hardly establishes the underdetermination thesis; a general argument for it is needed. Perhaps a crucial experiment will settle the Newton/Torricelli disagreement; perhaps the two theories differ only verbally.
Berkeley was aware that at certain moments in the history of science two or more competing theories were consistent with the known evidence, but it is a much stronger thesis to claim that the theories are compatible with all possible evidence. Although there is no textual indication that Berkeley holds this strong thesis, without it, the argument from underdetermination for instrumentalism fails.
Margaret Atherton provides an alternative to Newton-Smith’s analysis (Atherton 248-250). She does not see Berkeley employing the underdetermination thesis. Rather he is explicating how natural philosophers use mathematical hypotheses. Newton and Torricelli “attain the truth” while supposing contrary theoretical positions on how motion is communicated.
Despite Newton and Torricelli sharing the same set of observations (the same sense-based descriptions of how bodies actually move), “They use different pictures to describe what links instances of this sort together….” (Atherton 249). The same regularities are discovered regardless of which picture is operative.
This raises questions about the cognitive status of the pictures. Do they differ only verbally? Are they shorthand descriptions for the movements of bodies? If they are genuinely different calculating devices, what guarantees that they will continue to fit or predict the same future observations? How to understand De Motu §67 as well as Siris §228 remains contentious.
6. References and Further Reading
Armstrong, David. “Editor’s Introduction” in Berkeley’s Philosophical Writings, edited by David Armstrong, Collier Books, New York, 1965, pp 7-34.
Contains a very brief introduction to the whole of Berkeley’s philosophy including his philosophy of science.
Asher, Warren O. “Berkeley on Absolute Motion.” History of Philosophy Quarterly, 1987, pp 447-466.
Examines the differing accounts of absolute motion in the Principles and De Motu.
Atherton, Margaret. “Berkeley’s Philosophy of Science” in The Oxford Handbook of Berkeley, edited by Samuel C. Rickless, Oxford University Press, Oxford, 2022, pp 237-255.
Berkeley, George. Philosophical Works, Including the Works on Vision. Edited by Michael R. Ayers. Everyman edition. London: J.M. Dent, 1975.
This is a readily available edition of most of Berkeley’s important works. When a text is without section numbers the marginal page numbers refer to the corresponding page in The Works of George Berkeley.
Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. Edited by A.A. Luce and T.E. Jessop. Nine volumes. London: Thomas Nelson and Sons, 1948-1957.
Standard edition of Berkeley’s works. All references are to this edition.
Brook, Richard. “De Motu: Berkeley’s Philosophy of Science” in The Bloomsbury Companion to Berkeley, edited by Richard Brook and Bertil Belfrage, Bloomsbury, London, 2017, pp 158-173.
Brief survey of Berkeley’s philosophy of science. Includes references to important scholarly work on the topic.
Dear, Peter. Revolutionizing The Sciences. Second Edition. Princeton University Press, Princeton, 2009.
Downing, Lisa. “Berkeley’s Case Against Realism about Dynamics” in Berkeley’s Metaphysics: Structural, Interpretive, and Critical Essays, edited by Robert Muehlmann, Pennsylvania State University Press, University Park, PA, 1996, pp 197-214.
Detailed treatment of Berkeley’s antirealism for Newtonian forces.
Downing, Lisa. “Berkeley’s Natural Philosophy and Philosophy of Science” In The Cambridge Companion to Berkeley, edited by Kenneth P. Winkler, Cambridge University Press, Cambridge, 2005, pp 230-265.
Downing, Lisa. “’Siris’ and the Scope of Berkeley’s Instrumentalism”. British Journal for the History of Philosophy, 1995, 3:2, pp 279-300.
Looks at the realism/antirealism issue in the context of Siris. Argues that corpuscular theories are not subject to the anti-realism consequences of the hypothetico-deductive method.
Flage, Daniel E. “Berkeley” in Internet Encyclopedia of Philosophy.
Provides a broad discussion of Berkeley’s philosophy.
Garber, Dan. “Locke, Berkeley, and Corpuscular Scepticism” in Berkeley: Critical and Interpretative Essays, edited by Colin M. Turbayne, University of Minnesota Press, Minneapolis, 1982, pp 174-194.
Defense of realism for corpuscles in Berkeley.
Hempel, Carl. “Deductive-Nomological versus Statistical Explanation” in The Philosophy of Carl G. Hempel, edited by James H. Fetzer, Oxford University Press, New York, 2001, pp 87-145.
Jobe, Evan K. “A Puzzle Concerning D-N Explanation”. Philosophy of Science, 43:4, pp 542-547.
Lorkowski, C. M. “David Hume: Causation” in Internet Encyclopedia of Philosophy.
Thorough discussion of Hume’s account of causation.
Marcum, James A. “Thomas S. Kuhn” in Internet Encyclopedia of Philosophy.
Reviews the work of historian and philosopher of science Thomas Kuhn. Kuhn was instrumental in initiating a historiographical turn for many philosophers of science. His work challenged prevailing views on the nature of science, especially accounts of scientific change.
Nagel, Ernest. The Structure of Science. Harcourt, Brace and World, New York, 1961.
Classic introduction to the philosophy of science. Excellent on the cognitive status of theories of space and geometry.
Newton-Smith, W. H. “Berkeley’s Philosophy of Science” in Essays on Berkeley, edited by John Foster and Howard Robinson, Clarendon Press, Oxford, 1985, pp 149-161.
Argues that Berkeley gives an argument for instrumentalism from the underdetermination of theories.
Pearce, Kenneth L. “Berkeley’s Theory of Language” in The Oxford Handbook of Berkeley, edited by Samuel C. Rickless, Oxford University Press, Oxford, 2022, pp 194-218.
Discusses four accounts of Berkeley’s theory of language. Defends the use theory.
Wilson, Margaret D. “Berkeley and the Essences of the Corpuscularians” in Essays on Berkeley, edited by John Foster and Howard Robinson, Clarendon Press, Oxford, 1985, pp 131-147.
Raises concerns about interpreting Berkeley as a scientific realist for corpuscles.
Winkler, Kenneth. Berkeley: An Interpretation. Clarendon Press, Oxford, 1989.
Thorough discussions of both the continuity of physical objects and corpuscularianism.
Author Information
A. David Kline
Email: akline@unf.edu
University of North Florida
U. S. A.
The Experience Machine
The experience machine is a thought experiment first devised by Robert Nozick in the 1970s. In the last decades of the 20th century, an argument based on this thought experiment was considered a knock-down objection to hedonism about well-being, the thesis that our well-being—that is, the goodness or badness of our lives for us—is entirely determined by our pains and pleasures. The consensus about the strength of this argument was so firm that, in manuals about ethics, it became canonical to present hedonism as a surely false view because of the experience machine thought experiment. However, in the second decade of the 21st century, an experimental literature emerged that successfully questioned whether this thought experiment is compelling. The experience machine thought experiment is thus not only central to the debate on hedonism about well-being; it also touches other topical debates, such as the desirability of an experimental method in philosophy and the possibility of progress in this discipline. Moreover, since the experience machine thought experiment addresses the question of the value of virtual lives, it has become particularly relevant with the technological developments of virtual reality. The debate on this thought experiment, or “intuition pump”, thus also bears on the debate on the value of virtual lives in relation to technological advances.
In this article, one of the original formulations of the experience machine thought experiment (EMTE) is first presented, together with the question that it is meant to isolate, its target theory, how best to understand the argument based on it, and the implications that have historically been attributed to it. Second, a revisionist trend in the scholarship that undermines traditional confidence in the argument based on the experience machine thought experiment is introduced. Third, some objections to this revisionist trend, especially the expertise objection, are considered. Finally, some further versions of the experience machine thought experiment, advanced in response to the “death” of the original one, are discussed.
Nozick first introduced the experience machine thought experiment in 1974 in his book Anarchy, State, and Utopia. This section focuses, however, on the formulation found in Nozick’s book The Examined Life (1989), because this version is particularly effective in capturing the narrative of the thought experiment. After this presentation, the structure of the thought experiment and the implications that it has traditionally been thought to have are summarized.
In The Examined Life (1989), Nozick presented the EMTE as follows:
Imagine a machine that could give you any experience (or sequence of experiences) you might desire. When connected to this experience machine, you can have the experience of writing a great poem or bringing about world peace or loving someone and being loved in return. You can experience the felt pleasures of these things, how they feel “from the inside.” You can program your experiences for tomorrow, or this week, or this year, or even for the rest of your life. If your imagination is impoverished, you can use the library of suggestions extracted from biographies and enhanced by novelists and psychologists. You can live your fondest dreams “from the inside.” Would you choose to do this for the rest of your life? If not, why not? (Other people also have the same option of using these machines which, let us suppose, are provided by friendly and trustworthy beings from another galaxy, so you need not refuse connecting in order to help others.) The question is not whether to try the machine temporarily, but whether to enter it for the rest of your life. Upon entering, you will not remember having done this; so no pleasures will get ruined by realizing they are machine-produced. Uncertainty too might be programmed by using the machine’s optional random device (upon which various preselected alternatives can depend).
The most relevant difference between Nozick’s two versions of the thought experiment lies in the temporal description of plugging in. In the 1974 EMTE the plugging in is for two years, while in the 1989 EMTE the plugging in is for life. In his testing of the 1974 EMTE, Weijers (2014) reported that 9% of the participants averse to plugging in justified their refusal by saying something like “getting out every two years would be depressing”. On the one hand, this kind of reply is legitimate: well-being concerns lives, and to maximize a life’s net pleasure it is fully legitimate to consider the possible displeasure felt every two years when unplugging. On the other hand, this kind of reply seems to elude the question that the thought experiment is designed to isolate. Thus, the 1989 EMTE is more effective in tracking the choice the thought experiment aims at isolating: the choice between two lives, one spent in touch with reality and one spent inside an experience machine (EM).
Several studies have suggested that the majority of readers of the EMTE are averse to plugging in. Weijers (2014) found that this judgement was shared by 84% of the participants asked to respond to Nozick’s 1974 EMTE. Similarly, 71% of the subjects facing the succinct version of the EMTE developed by Hindriks and Douven (2018) shared the pro-reality judgement, a percentage different from Weijers’ but still a considerable majority. Since spending one’s life, or at least a part of it, inside the EM should be favored according to mental state theories of well-being in general and prudential hedonism—that is, hedonism about well-being—in particular, these majority preferences might be taken as evidence against mental state theories of well-being and prudential hedonism. In fact, people’s judgements in favor of living in touch with reality have been thought to mean that reality must be intrinsically prudentially valuable. In this context, the term “prudential” refers to what is good for a person, which is often taken to correspond to well-being. If reality is intrinsically prudentially valuable, theories of well-being that hold that only how experiences feel “from the inside” directly contributes to well-being are false. With this argument based on the EMTE and on the response it elicits in the majority of subjects, the thought experiment has been widely considered as providing a knock-down argument against mental state theories of well-being and prudential hedonism. In other words, these theories have traditionally been quickly dismissed through appeal to the EMTE. Weijers (2014), for example, compiled a non-exhaustive list of twenty-eight scholars writing that the EMTE constitutes a successful refutation of prudential hedonism and mental state theories of well-being.
2. Target Theory: Mental Statism
This section identifies the target theory of the thought experiment. Traditionally, the experience machine has mostly been understood as a thought experiment directed against prudential hedonism. It should however be noted that the points made against prudential hedonism by the EMTE apply equally to non-hedonistic mental state theories of well-being. Mental state theories of well-being value subjective mental states—how our experiences feel to us from the inside—and nothing else. Put simply, what does not affect our consciousness cannot be good or bad for us. Accordingly, for mental state theories, well-being is necessarily experiential. Notice that these theories do not dispute that states of affairs contribute to well-being. For example, they do not dispute that winning a Nobel Prize makes one’s life go better. Mental state theories dispute that states of affairs intrinsically affect well-being. According to these theories, winning a Nobel Prize makes one’s life go better only instrumentally because, for example, it causes pleasure.
Different mental state theories can point to different mental states as the ultimate prudential good. For example, according to subjective desire-satisfactionism, well-being is increased by believing that one is getting what one wants, rather than by states of affairs aligning with what one wants, as in the standard version of desire-satisfactionism. Standard desire-satisfactionism—a prominent alternative to hedonism in the philosophy of well-being—is usually thought to be immune from objections based on the EMTE: since most of us want to live in touch with reality, plugging into the EM would frustrate this desire and make our lives go worse. However, the supposed insusceptibility of standard desire-satisfactionism to the EMTE is questionable. Given that a minority of people want to plug into the EM, these people’s lives, according to standard desire-satisfactionism, would be better inside the EM. This implication conflicts with the majority’s judgement that a life inside the EM is not a good life. Note that if a person’s desires concern only mental states, standard desire-satisfactionism becomes indistinguishable from a mental state theory of well-being.
In any case, probably because prudential hedonism is the most famous mental state theory of well-being, the EMTE has traditionally been used against this particular theory. Thus, this article refers to prudential hedonism as the target theory of the EMTE, although the argument based on it is equally applicable to any other mental state theory of well-being.
3. Some Stipulations of the Experience Machine Thought Experiment
By Nozick’s stipulation, we should be able to disregard any metaphysical and epistemological concerns that the thought experiment might elicit. Since the EMTE is meant to evoke the intuition that physical reality, in contrast to the virtual reality of the EM, is intrinsically valuable, it might seem natural to ask “what is reality?” and “how can we know it?”. If there is no such thing as reality, reality cannot be intrinsically valuable. In other words, if there is no mind-independent reality, mental state theories of well-being cannot be objected to on the ground of not intrinsically valuing mind-independent reality (the metaphysical issue).
Similarly, someone might say that even if there is a mind-independent reality, we cannot know it. In this case, reality would collapse into a supposed intrinsic value with no use in evaluating lives—if we cannot know what is real, we cannot judge whether a life has more or less of it. For example, if we do not have knowledge of reality, we cannot say whether a life in touch with the physical world or a life inside an EM is more real (the epistemological issue).
Nevertheless, the EMTE is designed to isolate a prudential concern and stipulates that we should ignore any metaphysical or epistemological concern elicited by the narrative of the thought experiment. Thus, below, Nozick’s stipulation of common-sense conceptions of reality and of our access to it is adopted (for a thought experiment with an EMTE-like narrative directed against metaphysical realism, see The Brain in a Vat Argument).
Nozick also asks readers to ignore contextual factors. For example, he claims, we should not evaluate whether a life inside an EM is worse than a life of torture. It seems reasonable to prefer a life plugged into an EM to a life of intense suffering, but this preference does not respect the thought experiment’s stipulation. To isolate the relevant prudential question, we should think of a hedonically average life. Having said that, we might doubt that our trade-off between pleasure and reality can be insensitive to contextual factors. If we are among the hedonically less privileged, for example someone afflicted by chronic depression or pain, it seems reasonable to want to plug in.
4. The Argument Based on the Experience Machine Thought Experiment
The argument based on the EMTE has sometimes been interpreted as a deductive argument. According to this version of the argument, if the vast majority of reasonable people value reality in addition to pleasure, then reality has intrinsic prudential value; therefore, prudential hedonism is false. The main problem with this deductive argument is that it disregards the is-ought dichotomy: knowing “what is” does not by itself entail knowing “what ought to be”. The argument jumps too boldly from a descriptive claim—the majority of people prefer reality—to a normative claim—reality is intrinsically valuable. The deductive argument is thus invalid, because the fact that reality intrinsically matters to many of us does not necessarily imply that it should be intrinsically valued by all of us. For example, the majority of us, perhaps instrumentally, value wealth, but it does not necessarily follow that it is wrong not to value wealth.
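To make the gap visible, the deductive reading can be reconstructed explicitly. The following layout is offered purely as an illustration of the is-ought worry; it is not a formulation found in Nozick or the subsequent literature:

\[
\begin{array}{rl}
\text{(P1)} & \text{The vast majority of reasonable people value reality in addition to pleasure.}\\
\text{(P2)} & \text{Whatever the vast majority of reasonable people value is intrinsically valuable.}\\
\hline
\text{(C)} & \text{Reality is intrinsically prudentially valuable; hence prudential hedonism is false.}
\end{array}
\]

(P1) is descriptive; the suppressed premise (P2) is normative, and it is exactly what the is-ought worry calls into question.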
Instead, the most convincing argument based on the EMTE seems to be an appeal to the best explanation. According to this version of the argument, the best explanation for something intrinsically mattering to many people is its being intrinsically valuable. In the abductive argument, the passage from the descriptive level to the normative level, from “reality intrinsically matters to the majority of people” to “reality is intrinsically valuable”, is more plausibly understood as an inference to the best explanation.
5. The Experience Machine Thought Experiment as an Intuition Pump
As explained above, according to the abductive argument based on the EMTE intuition pump, reality being intrinsically prudentially valuable is the best explanation for reality intrinsically mattering to the majority of people. One can however wonder whether this is really the best explanation available. In the first two decades of the 21st century, a trend in the scholarship on the EMTE questioned this abduction by pointing to several biases that might determine, and thus explain, people’s apparent preference for reality. This section and the next two present the phenomena advanced by this revisionist scholarship that seem to partially or significantly bias judgements about the EMTE. These distorting factors are grouped under hedonistic bias, imaginative failures, and status quo bias.
The hedonistic bias is the most speculative of the proposed biases thought to affect our responses to the EMTE. According to Silverstein (2000), who argued for the influence of such a bias on our reactions to the EMTE, the preferences apparently conflicting with prudential hedonism are themselves hedonistically motivated: the preference for not plugging in is, he claimed, driven by a pleasure-maximizing concern. Silverstein’s argument is based on the thesis that the desire for pleasure is at the heart of our motivational system, in the sense that pleasure determines the formation of all desires.
The existence of a similar phenomenon affecting the formation of preferences has also been put forward by Hewitt (2009). According to Hewitt, reported judgements cannot be directly taken as evidence regarding intrinsic value. We usually devise thought experiments to investigate our pre-reflective preferences. The resulting judgements are therefore also pre-reflective, which means that their genesis is not transparent to us and that reflection on them does not guarantee that their sources become transparent. Thus, our judgements elicited by the EMTE do not necessarily track intrinsic value.
Notice that Silverstein’s argument for the claim that pleasure-maximization alone explains the anti-hedonistic preferences depends on the truth of psychological hedonism—that is, the idea that our motivational system is exclusively directed at pleasure. However, the EMTE can itself be taken as constituting a counterexample to psychological hedonism. The majority of us, when facing the choice of plugging into an EM, have a preference for the pleasure-minimizing option. What the studies on our responses to the EMTE tell us is precisely that most people have preferences conflicting with psychological hedonism: the majority of people do not seem to have an exclusively pleasure-maximizing motivational system. The descriptive claim of psychological hedonism thus faces a convincing counterexample. Psychological hedonists are therefore forced to appeal to unproven unconscious desires—conscious pleasure-minimizing preferences as the result of an unconscious desire for pleasure—to defend their theory.
Nevertheless, a weak version of Silverstein’s hedonistic bias, according to which pleasure-maximization partly explains the anti-hedonistic judgements, seems plausible. Empirical research has shown that immediate pleasure-maximization plays a partial role in decision-making. This conclusion points in the direction of a weak hedonistic bias—that is, the possibility that apparently non-hedonistic judgements are partly motivated by pleasure-maximization. For example, Nozick asks us to disregard the distress that choosing to plug in might cause in the short term. According to Nozick, we should be rational and accept immediate suffering for the sake of long-term pleasure. Still, as everyday experience shows, we do not always have such a rational attitude toward immediate suffering for a long-term gain. Some people do not go to the dentist although it would benefit them, or do not overcome their fear of flying although they would love to visit a different continent. Again, it seems doubtful that the factor “distress about plugging in” is actually disregarded just because Nozick asks us to disregard it. Our adverse judgement about plugging in might be hedonistically motivated by the avoidance of this displeasure, regardless of Nozick’s instruction. However, the claim that pleasure-maximization plays a remarkable role in our anti-hedonistic responses to the EMTE is an empirically testable claim. As a result, even if the hedonistic bias seems to be a real phenomenon, it would be speculative to claim that it crucially affects our judgements about the EMTE without appealing to empirical evidence.
6. Imaginative Failures
Thought experiments are devices of the imagination. In this section, two confounding factors involving imagination are discussed: imaginative resistance and overactive imagination. These phenomena have been empirically shown to significantly distort our judgements about the EMTE. Imaginative resistance occurs when subjects reject some important stipulation of a thought experiment. Regarding the EMTE, examples include worrying about an EM’s malfunctioning or its inability to provide the promised bliss, although the scenario is explicit that the EM works perfectly and provides blissful feelings. According to Weijers’ study (2014), imaginative resistance affected 34% of the subjects that did not want to plug into the EM. In other words, one third of the participants that chose reality appeared to disregard some of the thought experiment’s stipulations. This is important because it shows, in general, that imagined scenarios are not fully reliable tools of investigation and, in particular, that a large portion of the pro-reality judgements are potentially untrustworthy because they do not comply with the EMTE’s stipulations.
Notice that philosophers can suffer from imaginative resistance too. Bramble (2016), while arguing that prudential hedonism might not entail the choice of plugging in, claims that the EM does not provide the pleasures of love and friendship. According to him, artificial intelligence is so primitive in regard to language, facial expressions, bodily gestures, and actions that it cannot deliver the full extent of social pleasures. While his claim seems true of the technology of the mid-2010s, it clearly violates the thought experiment’s stipulations. In addition to being implied by the 1974 version of the EMTE, Nozick says explicitly in his 1989 version that the machine has to be imagined as perfectly simulating the pleasure of loving and being loved.
Overactive imagination is another distorting phenomenon related to imagination. It consists in subjects imagining non-intended features of the EMTE. In his test of Nozick’s 1974 scenario, Weijers (2014) claimed to have found that 10% of the pro-reality responses displayed signs of overactive imagination. In other words, he claimed that a non-negligible proportion of participants unnecessarily exaggerated aspects of the thought experiment’s narrative. Notice that, here, Weijers’ claim seems problematic. Weijers reported that some subjects declared that they did not want to plug in because “the machine seems scary or unnatural”, and he took these declarations as indicating cases of overactive imagination. Yet the artificiality of the EM is one of the main reasons advanced by Nozick for not plugging in: ruling out such a response as biased seems therefore unfair. Nevertheless, putting aside this issue, the possibility of the EMTE eliciting judgements biased by technophobic concerns seems very plausible. This possibility has been made more likely by the popularity of the film The Matrix, in which a similar choice between reality and comfort is presented. Yet this movie elicits a set of intuitions that the EMTE is not supposed to elicit. For example, political freedom is severely hampered in The Matrix: the machines, after having defeated us in a war, enslaved us. Notice the difference with the 1989 version of the EMTE, where “friendly and trustworthy beings from another galaxy” serve us. Thus, the narrative of The Matrix should not be used to understand the EMTE, because it elicits a further layer of intuitions, such as the (intuitive) desire not to be exploited.
Considering imaginative failures overall (imaginative resistance and overactive imagination together), in Löhr’s study (2018) on the EMTE they affected 46% of the pro-reality philosophers and 39% of the pro-reality laypeople. Given the imaginative failures that affect the EMTE, it seems that this thought experiment may legitimately be accused of being far-fetched both in its narrative—at least in its first version, as the second version clarifies the benevolent intention of the EM providers—and in its stipulations. It might be that we lack the capacity to properly form judgements in outlandish cases, such as the one the EMTE asks us to imagine.
Nevertheless, concerning the role of technophobia and fantasy in imaginative failures, consider that the technological innovations of the beginning of the 21st century render virtual reality progressively less fantastic. This increasing concreteness of virtual reality technology, compared to the 1970s when the thought experiment was first devised, might lead to a progressive reduction of the influence of these factors on responses to the EMTE. It is even conceivable that one day the pro-reality judgement will no longer be shared by the majority of people. The evidential power of thought experiments is likely to be locally and historically restricted; therefore, we cannot rule out that changes in technology and culture will produce different judgements in subjects presented with the EMTE.
a. Memory Erasure
Remember that the EMTE’s target theory is prudential hedonism, not hedonistic utilitarianism. The offer to plug in does not concern maximizing pleasure in general, but one’s own pleasure. Well-being concerns what is ultimately good for a person. Thus, in deliberating about what is in your best interest, you need to be certain about the persistence of the you in question. Given that, the thought experiment would be disrupted if the continuation of your personal identity were not guaranteed by the EM.
Bramble (2016) expressed precisely this worry. Remember that the EM is thought to provide a virtual reality that is experientially real; thus, the users need to be oblivious of the experiences and choices that led them to plug in. Following Nozick’s infelicitous description of plugging in as a “kind of suicide”, Bramble held that the EM, in order to provide this kind of feeling of reality, might kill you in the sense that your consciousness will be replaced with a distinct one. Personal identity would therefore be threatened by the EM, and since we are trying to understand what a good life is for the person living it, it seems easy to see that ending this life would not be good. Following Bramble, we thus have a strong reason for not plugging in: it is bad for us to die. Similarly, Belshaw (2014) expressed concerns about the EMTE and personal identity. In particular, Belshaw claimed that to preserve a sense of reality inside an EM, the memory erasure operated by the machine would have to be invasive. Belshaw’s point seems stronger than Bramble’s because it does not concern a small memory-dampening. Belshaw points to a tension between two requirements of the EM: preserving personal identity and providing exceptional experiences that feel real (“You are, as you know, nothing special. So, seeming to rush up mountains, rival Proust as a novelist, become the love-interest of scores of youngsters, will all strike you as odd”). For him, even if some alterations of one’s psychology do not threaten personal identity, this is not the case with the EM, where invasive alterations are required to provide realistic exceptional experiences.
Nevertheless, both Bramble’s and Belshaw’s points can be seen as cases of imaginative resistance. Although the experience machine thought experiment does not explicitly stipulate that the EM’s memory erasure can occur while guaranteeing the persistence of personal identity, this can be considered as implied by it. It should be imagined that the amnestic re-embodiment—that is, the re-embodiment of the subject of experience inside the EM without the conscious knowledge that he is presently immersed in a virtual environment and possesses a virtual body—preserves personal identity. Nothing in the wording of the thought experiment insinuates that personal identity is not preserved; its continuation is implicitly stipulated. Neither Bramble’s point nor Belshaw’s complies with this implicit stipulation, and they thus end up constituting cases of imaginative resistance. Whether the preservation of personal identity is technically problematic or not does not concern the prudential question at stake.
b. Moral Concerns
Drawing a clear-cut distinction between moral and prudential concerns should help refine the relevant judgements regarding the EMTE. By Nozick’s stipulation, only prudential judgements are at stake in this thought experiment. However, imaginative resistance is a plausible phenomenon supported by empirical evidence: subjects do not fully comply with the stipulations of thought experiments. The possibility that judgements elicited by the EMTE are distorted by moral concerns therefore seems likely. In fact, according to experimental evidence, the absence of a clear-cut distinction between morality and well-being, such as laypeople’s evaluative conception of happiness, seems to be the default framework.
Weijers (2014) reported answers to the EMTE like “I can’t because I have responsibilities to others” among participants that did not want to connect. Similarly, Löhr (2018) mentioned pro-reality philosophers’ answers like “I cannot ignore my husband and son,” “I cannot ignore the dependents,” or “Gf [girlfriend] would be sad”. These answers can be seen as examples of imaginative resistance. When considering the EMTE, we should by stipulation disregard our moral judgements—that is, to “play by the rules” of the thought experiment one should be able to disregard morality. In his 1974 version, Nozick claims “others can also plug in to have the experiences they want, so there’s no need to stay unplugged to serve them. Ignore problems such as who will service the machines if everyone plugs in”. Nozick asks us to imagine a scenario where everyone could plug into an EM. Since, by stipulation, there is no need to care for others, we should disregard this concern. Taking moral evaluations into account in one’s decision about plugging into an EM thus constitutes a possible case of imaginative resistance.
However, it is far from clear that we are actually able to disregard our moral concerns. Stipulating that we should not worry about something does not imply that we will actually not worry about it. For example, being told to suspend our moral judgement in a sexual violence case because of the perpetrator’s mental incapacity does not imply that, as jurors, we will be able to do so. Prudential value is not the only kind of value that we employ in evaluating life-choices: the majority of people value more in life than their well-being. Concerning the EMTE, common-sense morality seems to deny the moral goodness of plugging in: it views plugging in as self-indulgent and therefore blameworthy. Moreover, it values making a real impact on the world, such as saving lives, not just having the experience of making such an impact.
To understand the imaginative resistance observed in philosophers’ answers to the EMTE, it should be noted that the main philosophical ethical systems seem to deny the moral goodness of plugging in. Even hedonistic utilitarianism, the only ethical system prima facie sympathetic to plugging in, would seem not to consider this choice morally good. To morally plug in, a hedonistic utilitarian agent should believe that this would maximize net happiness. This seems plausible only if all the other existing sentient beings are already inside an EM (and they have no obligations toward future generations). Otherwise, net happiness would be maximized by the agent’s not plugging in, since this would allow her eventually to convince two or more other beings to plug in, and two or more blissful lives, rather than only hers, would be a greater contribution to overall happiness. Given that moral philosophical concerns seem to oppose the choice of plugging in, it appears plausible that philosophers’ judgements elicited by the EMTE are also distorted by morality.
To sum up, moral concerns constitute a plausible case of imaginative resistance distorting philosophers’ and laypeople’s judgements about the EMTE. Most people seem to agree that pleasant mental states are valuable. Yet it is unlikely that everyone is persuaded by the claim that, all things considered, only personal pleasure is intrinsically valuable. Nevertheless, if we consider only prudential good, this claim seems considerably more convincing. In other words, if we carefully reason so as to set aside our moral concerns, plugging into an EM seems a more appealing choice.
7. The Status Quo Bias
In addition to the biases mentioned above, the status quo bias has received special attention in the literature. The status quo bias is the phenomenon according to which subjects tend to irrationally prefer the status quo—that is, the way things currently are. In other words, when facing complex decision-making, subjects tend to follow the adage “when in doubt, do nothing”. This bias is thought to show up in many decisions, such as voting for an incumbent office holder or not trading in a car. Moving to the relevance of the status quo bias for the EMTE: when subjects are presented with the choice of leaving reality and plugging in, most appear averse to it; but when they are presented with the choice of leaving the EM to “move” into reality, they also appear averse to that (see Kolber, 1994). This phenomenon seems best explained by our irrational preference for the status quo, rather than by a constant valuing of pleasure and reality.

In 1994, Kolber advanced the idea of the reverse experience machine (REM). In this revised version of the thought experiment, readers are asked: “would you get off of an experience machine to which you are already connected?”. In the REM, subjects thus have to choose between staying in the EM and moving to reality while losing a significant amount of net pleasure. Since the REM is supposed to isolate the same prudential concern as the EMTE through a choice between pleasure and reality (with a proportion of pleasure and reality similar in both thought experiments), the REM should elicit the same reactions as the EMTE. The replication of the results would indicate that Nozick’s thought experiment is able to isolate this concern. Instead, when De Brigard tested a version of the REM, the results did not fulfill this prediction. While a large majority of readers of the original EMTE are unwilling to plug in, when imagining being already connected to an EM and having to decide whether to unplug or stay, the percentage of subjects that chose reality over the machine dropped significantly, to 13%. De Brigard (2010) and the following literature have interpreted this result as demonstrating the influence of the status quo bias. Because of the status quo bias, when choosing between alternatives, subjects display an unreasonable tendency to leave things as they are. Applied to the EMTE, the status quo bias explains why the majority of subjects prefer to stay in reality when they are (or think they are) in reality and to stay in an EM when imagining being already inside one.
This interpretation is also supported by another empirical study, conducted by Weijers (2014). Weijers introduced a scenario—called “the Stranger No Status Quo scenario” (or “the Stranger NSQ”)—that is meant to reduce the impact of the status quo bias. This scenario is partly based on the idea that the more detached we are from the subject for whom we have to make a decision, the more rational we should be. Accordingly, the Stranger NSQ scenario asks us to decide not whether we would plug into an EM, but whether a stranger should. Moreover, the scenario adds a 50-50 time split: at the time of the choice, the stranger has already spent half of her time inside an EM and has had most of her enjoyable experiences while plugged into it. Both elements—the fact that we are asked to choose for a stranger and the fact that this stranger has already spent half of her life inside an EM—are meant to minimize the influence of the status quo bias. Weijers observed that in this case a slim majority (55%) of the participants chose pleasure over reality. In other words, a small majority of subjects, when primed to choose the best life for a stranger who has already spent half of her life in an EM, preferred pleasure over reality. This result again contradicts the overwhelmingly pro-reality responses elicited by Nozick’s original thought experiment. Weijers’ study is also noteworthy because it avoided the main methodological flaws of De Brigard’s (2010), such as a small sample size and a lack of detail about how the experiments were conducted.
To sum up, the aforementioned studies and the scholarship on them have challenged the inference to the best explanation in the abductive argument based on the EMTE. Note that an observation counts as good evidence in favor of a hypothesis only when it is consistent with that hypothesis and not with its rivals. According to this new scholarship, the fact that the large majority of people respond to the original EMTE in a non-hedonistic way by choosing reality over pleasure is not best explained by reality being intrinsically valuable. Modifications of the EMTE such as the REM and the Stranger NSQ scenario, while supposedly isolating the same prudential question, elicit considerably different preferences in experimental subjects. The best explanation of this pattern seems to be the status quo bias, a deviation from rational choice that psychologists have repeatedly observed in many contexts.
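This evidential point can be put in Bayesian terms (a gloss added here for clarity, not drawn from the cited studies; the symbols are introduced for illustration). Write $E$ for the overall response pattern (pro-reality answers to the original EMTE, pro-pleasure answers to the REM and the Stranger NSQ scenario), $H_1$ for the debunking hypothesis (responses track the status quo), and $H_2$ for the anti-hedonist hypothesis (responses track the intrinsic value of reality):

$$\frac{P(H_1 \mid E)}{P(H_2 \mid E)} = \frac{P(E \mid H_1)}{P(E \mid H_2)} \cdot \frac{P(H_1)}{P(H_2)}.$$

Since the debunking hypothesis predicts the whole pattern while the anti-hedonist hypothesis predicts only its pro-reality half, the likelihood ratio $P(E \mid H_1)/P(E \mid H_2)$ exceeds one, and the combined evidence favors the debunking explanation.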
8. Methodological Challenges
Smith (2011) criticized the above-mentioned studies for the lack of representativeness of their experimental subjects. De Brigard’s studies were conducted on philosophy students, and Weijers’ on marketing and philosophy students, in both cases at Anglophone universities. Obviously, these groups represent neither the whole English-speaking population nor the whole human population. Nevertheless, this objection seems misplaced. Although it would be interesting to know what the whole world thinks of the EMTE, or to test an indigenous population that has never had any contact with Western philosophy, that is not what is relevant for the negative experimental program concerning the EMTE—that is, the experimental program devoted to questioning the abductive argument against prudential hedonism based on the EMTE. Smith seems to confuse the revisionist scholarship’s goal of challenging philosophers’ previous use of intuitions with the sociological or anthropological goal of knowing what humans think.
Another methodological objection advanced by Smith (2011) concerns the fact that the experimental subjects in these studies are not in the position of confronted agents: the participants are asked to imagine fantastical scenarios rather than facing a real decision, with the affective responses that a real decision would elicit. Again, Smith’s objection seems flawed, because what Smith considers a methodological problem might actually be a methodological strength. Unconfronted agents are very likely to be more rational in forming their judgements about the EMTE. Once again, the experimental program on the EMTE is interested in how to refine and properly use intuitions for the sake of rational deliberation, not in the psychological project of knowing what people would choose, under the grip of affects, in a real situation. In other words, the judgements reported in questionnaires, although not indicative of the intuitions we would have in front of a real EM, seem less biased by affects and more apt to serve as the starting point for a rational judgement about what has intrinsic prudential value.
9. The Expertise Objection
A major methodological challenge to much of experimental philosophy concerns the use of laypeople’s judgements as evidence. According to the expertise objection, the judgements reported by laypeople cannot be granted the same epistemic status as the judgements of philosophers (that is, the responses of trained professionals with years of experience in thinking about philosophical issues). Philosophers, accordingly, should know how to come up with “better” judgements. Following this objection, the responses of subjects with no prior background in philosophy, which inform the aforementioned studies, lack philosophical significance.
Although the concern appears legitimate, it seems disproved by empirical evidence from both experimental philosophy in general and experiments on the EMTE in particular. Concerning the EMTE, Löhr (2018) tested whether philosophers are more proficient than laypeople at disregarding irrelevant factors when thinking about several versions of the EMTE. He observed that philosophers gave inconsistent answers when presented with different versions of the EMTE and that their degree of consistency was only slightly superior to that of laypeople. Philosophers were also found to be roughly as susceptible to imaginative failures as laypeople. This suggests that philosophers show no greater proficiency than laypeople in complying with the stipulations of the thought experiment, and only a marginally better consistency across different EMTE scenarios.
The empirical evidence we possess on philosophers’ judgements in general, and on philosophers’ judgements concerning the EMTE in particular, therefore seems to cast serious doubt on the expertise objection. The current empirical evidence does not support granting an inferior epistemic status to the preferences of the laypeople who inform the aforementioned studies on the EMTE. The burden of proof, it seems, lies squarely on anyone wishing to revive the expertise objection. Moreover, given the value of equality that informs our democratic worldview, the burden of proof should always lie on the individual or group—philosophers in this case—that aspires to a privileged status.
Furthermore, in addition to philosophical expertise not significantly reducing the influence of biases, philosophers might have their own environmental and training-specific set of biases. For example, a philosopher assessing a thought experiment might be biased by the dominant view about this thought experiment in the previous literature or in the philosophical community. This worry seems particularly plausible in the case of the EMTE because there is a strong consensus among philosophers not specialized in this thought experiment that one should not enter the EM. In other words, it is reasonable to hypothesize that the “textbook consensus”—that is, the philosophical mainstream position as expressed by undergraduate textbooks—adds a further layer of difficulty for philosophers trying to have an unbiased response to the EMTE.
10. The Experience Pill
In a recent study, Hindriks and Douven (2018) changed the EM into an experience pill. With this modification, pro-pleasure judgements increased from 29% to 53%. In other words, substituting a pill for the science-fiction technology in the narrative of the thought experiment seems to cause a significant shift in subjects’ responses. This can be attributed to the more familiar delivery mechanism and, more importantly, to the fact that the experience pill leaves the subject’s relationship with reality largely intact. The experience pill does not resemble psychedelic drugs such as LSD (interestingly, Nozick cited the outlook of devotees of psychedelic drugs, together with traditional religious views, as examples of views that deeply value reality). While the experience pill drastically alters hedonic experience, perhaps similarly to amphetamines or cocaine, it does not affect the perception of the world.
Therefore, the experience pill thought experiment does not seem to offer a narrative comparable with the EMTE. Here, the choice is not between reality and pleasure but rather between affective appropriateness (having feelings considered appropriate to the situation) and pleasure. The experience pill should thus be seen as an interesting but distinct thought experiment, not to be conflated with the EMTE. On this issue, it should be noted that the EMTE scenarios used across both armchair and experimental philosophy vary significantly. This is worrying because the experimental philosophy and psychology literature on intuitions seems to show that the wording of scenarios can greatly affect the responses they elicit. A particular wording of the scenario may therefore yield different results, adding further layers of difficulty to answering the question at stake. In other words, the inter-comparability of the different scenarios adopted by different authors is limited.
11. A New Generation of Experience Machine Thought Experiments
Some authors have challenged the revisionist scholarship on the EMTE presented above by claiming that it does not address the most effective version of the thought experiment. According to them, the narrative of the original EMTE should be drastically modified in order to effectively isolate the question at stake. Moreover, they claim that a new argument based on this transformed version of the EMTE can be advanced against prudential hedonism: the experientially identical lifetime comparison argument. For example, Crisp (2006) attempted to eliminate the status quo bias and the concern that the technology may malfunction (imaginative resistance) by significantly modifying the narrative of the EMTE. He asks us to compare two lives. Life A is pleasant, rich, full, and autonomously chosen; it involves writing a great novel and making important scientific discoveries, as well as the exercise of virtues such as courage, wittiness, and love. Life B is experientially identical to A but lived inside an EM. According to prudential hedonism, A and B are equal in value; yet the majority of us seem to have the contrary intuition. This is the starting point of the experientially identical lifetime comparison argument. Likewise, according to Lin (2016), to isolate the question that the EMTE is supposed to address, we should consider the choice between two lives that are experientially identical but differently related to reality, because this isolates reality as the value in question. According to Lin, his version of the EMTE also has the advantages of not being affected by the status quo bias and of not turning on claims about whether we would or should plug in.
Rowland (2017) conducted empirical research on a version of the EMTE in which two hedonically equal lives of a stranger must be compared. Presented with Rowland’s EMTE, more than 90% of the subjects answered that the stranger should choose the life in touch with reality. Surprisingly, Rowland did not give participants the option of answering that the two lives are equal in value. Unfortunately, this methodological flaw is so glaring that it severely undermines the significance of Rowland’s study.
Notice that, once the narrative of the thought experiment is devised in this way, it assumes the same structure as Kagan’s deceived businessman thought experiment (Kagan, 1994). Both thought experiments argue against a view according to which B-facts are equal in value to A-facts by devising a scenario in which there is intuitively a difference of value between the two. Kagan asks us to imagine the life (A) of a successful businessman who is happy because he is loved by his family and respected by his community and colleagues. He then asks us to imagine an experientially identical life (B) in which the businessman is deceived about the causes of his happiness—everyone is deceiving him for their personal gain. Lives A and B contain the same amount of pleasure; thus, according to prudential hedonism, they are equal in value. Nevertheless, we again have the intuition that life A is better than life B.
Discussing this new version of the EMTE, de Lazari-Radek and Singer (2014) concluded that our judgements about it are also biased, and they attributed this biased component to morality. Life A contributes to the world while life B does not; thus, life A is morally superior to life B. According to them, our judgement that life A is better is therefore affected by moral considerations extraneous to the prudential question at stake. As with imaginative failures regarding the original EMTE, it seems possible that the comparison intuition rests on scales of evaluation other than well-being.
Moreover, the structure of this new version of the thought experiment seems to suffer from the freebie problem. Since it is irrational to have 100% confidence in the truth of prudential hedonism, it is irrational not to prefer life A to life B: if you are not 100% confident in prudential hedonism, life A has a greater than 0% chance of being more prudentially valuable than life B, making it unreasonable to decline the reality freebie. This is especially true when the decision between the two lives is forced (that is, when there is no “equal value” option), as in Rowland’s study. Because of the freebie problem, transforming the narrative of the EMTE in this drastic way does not seem to increase its strength. Rather, it seems to render the thought experiment unhelpful for comparing our judgements about two lives that roughly track the competing values of pleasure and reality.
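The freebie problem can be stated as a simple expected-value computation (a schematic rendering of the argument above; the symbols are introduced here for illustration and are not drawn from the cited literature). Let $h$ be the hedonic value common to the two experientially identical lives, let $p > 0$ be one’s credence that reality is intrinsically prudentially valuable, and let $v > 0$ be the prudential value reality would add if that credence is correct. Then:

$$EV(A) = h + p\,v, \qquad EV(B) = h, \qquad EV(A) - EV(B) = p\,v > 0.$$

As long as $p$ is not exactly zero, life A is strictly better in expectation, so even a subject strongly inclined toward hedonism should take the reality “freebie”; the forced choice therefore tells us little about whether reality is in fact intrinsically valuable.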
To reiterate, since a person cannot be 100% sure of the truth of prudential hedonism, they would be a bad decision-maker if they did not choose the life with both pleasure and reality. Reality has a greater than 0% chance of being intrinsically prudentially valuable, as is presumably true of all the other candidate goods that philosophers of well-being discuss. Importantly, the original structure of the EMTE traded off more reality against more pleasure. That the vast majority of people reported a preference for reality was therefore a sign that they really valued reality, since they were ready to sacrifice something of value (pleasure) to get more of another value (reality). A properly devised EMTE, aiming to reveal subjects’ relevant preferences, has to trade off non-negligible amounts of two competing goods against each other. The supposed intrinsic value of reality can be intuitively apprehended only if one has to sacrifice an amount of pleasure registered as significant. The epistemic value of the EMTE lies in presenting two options, one capturing the pro-reality intuition and one the pro-pleasure intuition; indeed, the strength of the EMTE against prudential hedonism is that the vast majority of subjects agree that connecting to an EM is not desirable even though connecting to the machine offers bliss. The proper design of the thought experiment thus involves a meaningful pairwise comparison. Pairwise comparison is the method of comparing entities in pairs to reveal our preferences among them; such simple comparisons can serve as the building blocks of more complex decision-making and, symmetrically, complex decision-making can be reduced to a set of binary comparisons (see the sketch below). That is indeed what we want from the EMTE: reducing a complex decision about intrinsic prudential value to a binary comparison between two lives that roughly track the prudential value of two competing goods.
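To make the reduction concrete, here is a minimal sketch in Python (a hypothetical illustration, not an implementation from any of the cited studies; the life descriptions, attribute names, and weights are all invented for the example). It scores candidate lives on the two competing goods and derives an overall choice purely from binary comparisons.

```python
from itertools import combinations

# Hypothetical candidate lives, each described by stipulated levels of
# the two competing goods; the numbers are illustrative, not empirical.
lives = {
    "plugged-in life": {"pleasure": 0.9, "reality": 0.1},
    "ordinary life":   {"pleasure": 0.6, "reality": 0.9},
    "ascetic life":    {"pleasure": 0.3, "reality": 1.0},
}

def prefer(a, b, w_pleasure=0.5, w_reality=0.5):
    """Binary comparison: return whichever life a subject with the given
    (hypothetical) weights on pleasure and reality would prefer."""
    def score(name):
        attrs = lives[name]
        return w_pleasure * attrs["pleasure"] + w_reality * attrs["reality"]
    return a if score(a) >= score(b) else b

# The complex choice reduces to a set of binary comparisons: the life
# that wins the most pairwise contests is the overall choice.
wins = {name: 0 for name in lives}
for a, b in combinations(lives, 2):
    wins[prefer(a, b)] += 1

print(max(wins, key=wins.get))  # "ordinary life" under equal weights
```

Under equal weights the pro-reality life wins every contest; raising w_pleasure far enough flips the outcome. This sensitivity to how much of each good is at stake is precisely what a well-designed EMTE exploits when it trades non-negligible amounts of pleasure and reality against each other.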
Another example of this new generation of EMTEs is Inglis’s (2021) universal pure pleasure machine (UPPM). Inglis imagined a machine that provides a constant heroin-like high—that is, a machine that provides pure pleasure without producing any virtual reality—and a world where every sentient being is plugged into such a machine (the universality condition). Inglis then asked her participants: “is this a good future that we should desire to achieve?”. Only 5.3% of the subjects presented with this question replied positively. Interestingly, this was the first such study to be conducted at a Chinese university. From her results, Inglis concluded that the UPPM is once again able to disprove prudential hedonism. Nevertheless, more studies are needed before this conclusion can be accepted with confidence. For example, the universality condition, which according to Inglis reduces biases stemming from morality, might on the contrary work as a moral intuition pump. Empirical evidence suggests that moral judgements, unlike prudential judgements, are characterized by universality (for example, committing murder is wrong for everyone, whereas playing videogames may be bad only for me). Also, the UPPM might trigger significant imaginative failures, for example if subjects view the machine with no virtual reality as boring (imaginative resistance) or perceive its heroin-like bliss as a disgusting kind of existence (overactive imagination).
12. Concluding Remarks
This article has reviewed the salient points of the literature on the EMTE, from its introduction by Nozick in 1974 to the beginning of the 2020s. In presenting the scholarship on this thought experiment, a historical turn was emphasized: the debate on the EMTE can be divided into two phases. In the first phase, starting with the publication of Nozick’s Anarchy, State, and Utopia in 1974 and ending around 2010, we observe a broad consensus on the strength of the EMTE in proving prudential hedonism and mental state theories of well-being wrong. In the second phase, starting more or less at the beginning of the 2010s, we witness the emergence of a scholarship specialized in the EMTE that undermines confidence in its ability to generate a knock-down argument against prudential hedonism and mental statism about well-being. Anecdotally, it should be noted that the philosophical community at large—that is, the community not specialized in the EMTE—has not necessarily absorbed the latest scholarship, and it is common to encounter views more in line with the earlier confidence. Nevertheless, the fact that anti-hedonistic scholars have felt the need to devise a new generation of EMTEs demonstrates that the first generation is dead. Further scholarship is needed to establish whether and to what extent these new versions are able to resuscitate the EMTE and its goal.
13. References and Further Reading
Belshaw, C. (2014). What’s wrong with the experience machine? European Journal of Philosophy, 22(4), 573–592.
Bramble, B. (2016). The experience machine. Philosophy Compass, 11(3), 136–145.
Crisp, R. (2006). Hedonism reconsidered. Philosophy and Phenomenological Research, 73(3), 619–645.
De Brigard, F. (2010). If you like it, does it matter if it’s real? Philosophical Psychology, 23(1), 43–57.
De Lazari-Radek, K., & Singer, P. (2014). The point of view of the universe: Sidgwick and contemporary ethics. Oxford University Press.
Feldman, F. (2011). What we learn from the experience machine. In R. M. Bader & J. Meadowcroft (Eds.), The Cambridge Companion to Nozick’s Anarchy, State, and Utopia (pp. 59–86). Cambridge University Press.
Forcehimes, A. T., & Semrau, L. (2016). Well-being: Reality’s role. Journal of the American Philosophical Association, 2(3), 456–468.
Hewitt, S. (2009). What do our intuitions about the experience machine really tell us about hedonism? Philosophical Studies, 151(3), 331–349.
Hindriks, F., & Douven, I. (2018). Nozick’s experience machine: An empirical study. Philosophical Psychology, 31(2), 278–298.
Inglis, K. (2021). The universal pure pleasure machine: Suicide or nirvana? Philosophical Psychology, 34(8), 1077–1096.
Kagan, S. (1994). Me and my life. Proceedings of the Aristotelian Society, 94, 309–324.
Kawall, J. (1999). The experience machine and mental state theories of well-being. Journal of Value Inquiry, 33(3), 381–387.
Kolber, A. J. (1994). Mental statism and the experience machine. Bard Journal of Social Sciences, 3, 10–17.
Lin, E. (2016). How to use the experience machine. Utilitas, 28(3), 314–332.
Löhr, G. (2018). The experience machine and the expertise defense. Philosophical Psychology, 32(2), 257–273.
Nozick, R. (1974). Anarchy, State, and Utopia. Blackwell.
Nozick, R. (1989). The Examined Life. Simon & Schuster.
Rowland, R. (2017). Our intuitions about the experience machine. Journal of Ethics and Social Philosophy, 12(1), 110–117.
Silverstein, M. (2000). In defense of happiness. Social Theory and Practice, 26(2), 279–300.
Smith, B. (2011). Can we test the experience machine? Ethical Perspectives, 18(1), 29–51.
Stevenson, C. (2018). Experience machines, conflicting intuitions and the bipartite characterization of well-being. Utilitas, 30(4), 383–398.
Weijers, D. (2014). Nozick’s experience machine is dead, long live the experience machine! Philosophical Psychology, 27(4), 513–535.
Bonaventure was a philosopher, a theologian, a prolific author of spiritual treatises, an influential prelate of the Medieval Church, the Minister General of the Franciscan Order, and, later in his life, a Cardinal. He has often been placed in the Augustinian tradition in opposition to the work of his peer, Thomas Aquinas, and his successors in the Franciscan Order, John Duns Scotus and William of Ockham, who relied more heavily on the recent recovery of Aristotle’s philosophical texts and those of Aristotle’s commentators, notably Ibn Rushd. However, a more accurate reading of the relevant sources places Bonaventure at one end of a wide spectrum of classical traditions, Pythagorean, Platonic, Neo-Platonic, Augustinian, and Stoic, as well as that of Aristotle and the commentators, in his effort to develop a distinct philosophy, philosophical theology, and spiritual tradition that remains influential to this day. His philosophy was part and parcel of his greater effort to further the knowledge and love of God; nevertheless, he clearly distinguished his philosophy from his theology, although he did not separate them, and this distinction is the basis for his status as one of the most innovative and influential philosophers of the later Middle Ages—a list that also includes Aquinas, Scotus, Ockham, and, perhaps, Buridan.
Bonaventure derived the architectonic structure of his thought from a Neo-Platonic process that began in the logical analysis of a Divine First Principle, continued in the analysis of the First Principle’s emanation into the created order, and ended in an analysis of the consummation of that order in its reunion with the First Principle from which it came. He was a classical theist, indeed, he contributed to the formation of classical theism, and he deepened that tradition in his development of a logically rigorous series of epistemic, cosmological, and ontological arguments for the First Principle. He argued that the First Principle created the heavens and earth in time and ex nihilo, contra the dominant opinion of the philosophers of classical antiquity, and he based his argument on the classical paradoxes of infinity. He emphasized the rational soul’s apprehension of the physical realm of being—although he was no empiricist—and argued for a cooperative epistemology, in which the rational soul abstracts concepts from its apprehension of the physical realm of being, but does so in the light, so to speak, of a divine illumination that renders its judgments of the truth of those concepts certain. He revised a classical eudaimonism, steeped in Aristotelian virtue theory, in the context of the Christian doctrines and spiritual practices of the thirteenth century. He wove these and other elements into a memorable account of the soul’s causal reductio of the cosmos, its efficient cause, final cause, and formal cause, to its origin in the First Principle, and of the soul’s moral reformation that renders it fit for ecstatic union with the First Principle.
1. Life and Influence
a. Life
Giovanni, later Bonaventure, was born in 1217—or perhaps as late as 1221—in the old City of Bagnoregio, the Civita di Bagnoregio, on the border between Tuscany and Lazio in central Italy. The Civita stands atop a scarped hill of volcanic stone that overlooks a valley at the foot of the Apennines; the view is striking. Bonaventure’s home has since collapsed into the valley, but a plaque remains to mark its former location. His father, Giovanni di Fidanza, was reportedly a physician, and his father’s status as a member of the small but relatively prosperous professional classes provided the young Giovanni the opportunity to study at the local Franciscan convent. His mother, Maria di Ritello, was devoted to St. Francis of Assisi (d. 1226), and her devotion provides the context for one of Giovanni’s few autobiographical reflections. He tells us that he suffered a grave illness when he was a young boy, but his mother’s prayers to St. Francis saved him from an early death (Bonaventure, Legenda maior prol. 3). He thus inherited his mother’s devotion to Francis, affectionately known as the Poor One (il Poverello).
Giovanni arrived in Paris in 1234 or perhaps early in 1235 to attend the newly chartered Université de Paris. He may well have found the city overwhelming. Philip II (d. 1223) had transformed France into the most prosperous kingdom in medieval Europe and rebuilt Paris to display its prosperity. He and his descendants oversaw a renaissance in art, architecture, literature, music, and learning. The requirements for the degree in the arts at the University focused on the trivium of the classical liberal arts: grammar, rhetoric, and logic. But they also included the quadrivium, arithmetic, geometry, music, and astronomy, which emphasized the role of number and other mathematical concepts in the structure of the universe, and Aristotle’s texts on philosophy and the natural sciences—the students and masters of the university routinely ignored the prohibitions against studying Aristotle and his commentators, first issued in 1210. Giovanni made good use of these “arts” throughout his career. Priscian’s grammar, the cadences of Cicero and other classical authors, deductive, inductive, and rhetorical argument, the prevalence of the concept of numerical order, and a firm grasp of the then current state of the natural sciences inform the entire range of his works.
Giovanni’s encounter with Alexander of Hales (d. 1245), an innovative Master of Theology at the University, would set the course for his future. Alexander entered the Franciscan Order in 1236 and established what would soon become a vibrant Franciscan school of theology within the University. Giovanni, who regarded Alexander as both his “master” and “father”, followed him into the Order in 1238, or perhaps as late as 1243, and began to study for an advanced degree in theology. He took the name Bonaventure when he entered the Order to celebrate his “good fortune”.
Alexander set the standard for Bonaventure and a long list of other students who emphasized a philosophical approach to the study of the scriptures and theology, with particular attention to Aristotle and Aristotle’s commentators (whose entire corpus, with the notable exception of the Politics, would have been available to Bonaventure as a student of the arts), as well as the Liber de Causis, an influential Neo-Platonic treatise attributed to Aristotle. Alexander was fundamentally a Christian Platonist in the tradition of Augustine, Anselm, and the School of St. Victor; nevertheless, he was one of the first to incorporate Aristotelian doctrines into the broader Platonic framework that dominated the Franciscan school of thought in the thirteenth century.
Bonaventure continued his studies under Alexander’s successors, Jean de la Rochelle, Odo Rigaud, and William of Melitona. He began his commentaries on the scriptures, Ecclesiastes, Luke, and John, in 1248, and his commentary on the Sentences of Peter Lombard, the standard text for the advanced degree in theology, in 1250. He completed his studies in 1252 and began to lecture, engage in public disputations, and preach—the critical edition of his works includes over 500 sermons preached throughout the course of his life. He received his license to teach (licentia docendi) and succeeded William as Master and Franciscan Chair of Theology in 1254. The Reduction of the Arts to Theology is probably a revision of his inaugural lecture. His works from this period also include a revised version of the Commentary on the Sentences of Peter Lombard, his most extensive treatise in philosophical theology, and a series of disputations: On the Knowledge of Christ, in which he presented his first extensive defense of his doctrine of divine illumination; On the Trinity, in which he summarized his arguments for the existence of God; and On Evangelical Perfection, in which he defended the Franciscan commitment to poverty.
But the secular masters of the University—those professors who did not belong to a religious order—refused to recognize his title and position. They had long been at odds with members of the religious orders, who often flouted the rules of the University in deference to those of their own orders. When the secular masters suspended the work of the University in a dispute with the ecclesial authorities of Paris in 1253, the religious orders refused to join them. The secular masters then attempted to expel them from the University. Pope Alexander IV intervened and settled the dispute in favor of the religious orders. The secular masters formally recognized Bonaventure as Chair of Theology in August of 1257, but Bonaventure had already relinquished his title and position. The Franciscan friars had elected him Minister General of the Order in February of 1257.
His initial task as Minister General proved difficult. His predecessor, John of Parma (d. 1289), endorsed some of the heretical tendencies of Joachim of Fiore (d. 1202), who had foretold that a New Age of the Holy Spirit would descend on the faithful and transcend the prominence of Christ, the papacy, Christ’s vicar on earth, and the current ecclesial leadership who served the papacy. John and other Franciscans, notably Gerard of Borgo San Donnino, had identified Francis as the herald of that New Age and his disciples, the Franciscans, as the Order of the Spirit. The papacy and other members of the ecclesial hierarchy formally condemned some aspects of Joachim’s doctrine at the Fourth Lateran Council in 1215 and issued a more thorough condemnation, in response to Franciscan support of his doctrine, at the Council of Arles in 1260. Bonaventure would display some degree of sympathy for their claims. He, too, insisted that Francis was the Angel of the Sixth Seal who had heralded the start of the New Age of the Spirit, but he also insisted Francis’ disciples remain in full obedience to the current ecclesial hierarchy (Bonaventure, Legenda maior prol. 1).
His second challenge stemmed from a dispute that emerged within the Order during Francis’ lifetime. Francis had practiced a life of extreme poverty in obedience to Christ’s admonition to the rich young man to “sell everything you have and distribute the proceeds to the poor… and come and follow me” (Luke 18:18-30). Francis, like the rich young man, had been rather wealthy until he renounced his father’s inheritance in obedience to the admonition and spent the rest of his life as a charismatic preacher. But many of his followers argued for some degree of mitigation of their life of extreme poverty so they could better serve in other capacities, as administrators, teachers, and more learned preachers. The debate came to a head shortly after Francis’ death, since many of the friars who practiced a more rigorous commitment to poverty also supported Joachim’s apocalypticism. Bonaventure strongly supported Francis’ commitment to poverty, as evidenced in his initial Encyclical Letter to the Provincial Ministers of the Order in 1257 and his codification of the Franciscan Rule in the Constitutions of Narbonne in 1260; nevertheless, he also permitted some degree of mitigation for specific purposes in specific contexts—the books, for example, students needed to complete their studies at Paris and other universities to become administrators, teachers, and preachers. Bonaventure maintained the peace in these disputes largely through his own commitment to poverty. But that peace would collapse shortly after his death, when the Fraticelli, also known as the Spiritual Franciscans, who argued for a more rigorous life of poverty, and the Conventuals, who argued for a degree of mitigation, split into factions. Many of the Fraticelli would oppose the papacy and the established hierarchy in their zeal for poverty and suffer censure, imprisonment, and, on occasion, death. Boniface VIII pronounced them heretics in 1296, and John XXII sentenced four of the Fraticelli to burn at the stake in Marseille in 1318.
Bonaventure resided in the convent of Mantes sur Seine, to the west of Paris, throughout his term as Minister General and visited the university often—it was the center of the European intellectual world. He also travelled widely, in frequent visits to the friars throughout France, England, Italy, Germany, the Low Countries, and Spain, and he did so on foot, the standard means of transportation for those who had pledged themselves to Francis’ Lady Poverty. His works from this period reveal his careful attention to his friars’ spiritual needs. He published the Breviloquium, a short summary of his philosophical theology, at the behest of the friars in 1257, shortly after his election, and a number of spiritual treatises in which he displays a deft ability to weave his philosophical theology into a sophisticated and often moving prose. These include the Soliloquies, a series of the soul’s dialogues with its innermost self in its effort to further its knowledge and love of God, the Threefold Way, a treatise on spiritual development, the Tree of Life, a series of meditations on Christ’s life, death, and resurrection that furthers the late medieval emphasis on the suffering of Christ, and the Longer Life of Saint Francis of Assisi, which would become the most influential biography of the saint until the nineteenth century. But the most influential of these texts is the Soul’s Journey into God (Itinerarium Mentis in Deum), a short summary of the ascent of the soul on the steps of Bonaventure’s reformulation of the Platonic Ladder of Love that ends in an ecstatic union with God. Those interested in Bonaventure’s thought should begin their reading with the Itinerarium.
His final challenge as Minister General dealt directly with the proper relationship between reason and faith. Aristotle and his commentators, the so-called radical Aristotelians, had argued for a number of doctrines that contradicted the orthodox reading of the Christian scriptures. He met this challenge in a series of Collations, academic conferences in which he singled out their errors: the Collations on the Ten Commandments, the Seven Gifts of the Holy Spirit, and the Six Days of Creation—the last of these remains unfinished. These errors included the eternity of the world, the numerical unity of the agent intellect in all human beings, the denial of Platonic realism in regard to the theory of metaphysical forms, the denial of God’s direct knowledge of the world, the denial of the freedom of the will, and the denial of reward or punishment in the world to come. Bonaventure provided detailed arguments against each of these positions in his Collations and other works, but his principal argument rested on the concept of Christ as the Center (medium) of all things, a role that belonged neither to Aristotle, whom Bonaventure regarded as the Philosopher par excellence, nor to his commentators (Bonaventure, Hexaëmeron 1:10-39). Thus, Christ’s teaching and, by extension, the entire scriptures remained the only reliable guard against the tendency of the human intellect to error.
Pope Clement IV attempted to appoint Bonaventure Bishop of York in 1265, but Bonaventure refused the honor. Clement’s death in 1268 then precipitated a papal crisis. Louis IX of France and his younger brother, Charles of Anjou, attempted to intervene, but the Italian Cardinals and a number of other factions resisted. Bonaventure supported a compromise candidate, Teobaldo Visconti, whose election in 1271 brought the crisis to an end. Teobaldo, now Pope Gregory X, appointed Bonaventure the Cardinal Bishop of Albano in 1273, perhaps in gratitude for his support, and called on him to lead the effort to reunify the Roman Catholic Church and the Orthodox Church at the Second Council of Lyon. Once again, his efforts proved instrumental. The Council celebrated the reunion on July 6, 1274. It would not last. The representatives of the Emperor, Michael VIII Palaiologos, and the Orthodox Patriarch had agreed to the terms of union without the support of their clergy or the faithful. Bonaventure passed away unexpectedly shortly thereafter, on July 15, 1274, while the Council was in session. Gregory and the delegates of the Council celebrated his funeral mass. Pope Sixtus IV declared him a saint in 1482 and Sixtus V declared him a Doctor of the Church, the Doctor Seraphicus, in 1588.
b. Influence
Historically, Bonaventure remains the preeminent representative of the Christian Neo-Platonic tradition in the thirteenth century and the last influential representative of that tradition. He was also the last single individual to master the three critical components of the Christian intellectual tradition: philosophy, theology, and spirituality. His prominent disciples in the thirteenth century include Eustace of Arras, Walter of Bruges, John Peckham, William de la Mare, Matthew of Aquasparta, William of Falgar, Richard of Middleton, Roger Marston, and Peter John Olivi.
Bonaventure fell out of favor in the fourteenth century. Scotus, Ockham, and other, less influential philosophers possessed less confidence in reason’s ability to ascend the Ladder of Love without the assistance of faith. They began to dismantle the close-knit harmony between the two that Bonaventure had wrought, and set the stage for the opposition between them that emerged in the Enlightenment.
Nevertheless, the Franciscans revived interest in Bonaventure’s thought in response to his canonization in the fifteenth century and again in the sixteenth. The Conventual Franciscans, one of the three current branches of the medieval Order of Friars Minor, established the College of St. Bonaventure in Rome in 1587 to further interest in Bonaventure’s thought. They produced the first edition of his works shortly thereafter in 1588-1599, revised in 1609, 1678, 1751, and, finally, in 1860. The Conventuals and other Franciscans also supported the effort to establish medieval philosophy as a distinct field of academic inquiry in the nineteenth century, and rallied to include Bonaventure in the standard canon of medieval philosophers. The Observant branch of the Friars Minor founded the College of St. Bonaventure in Quaracchi, just outside Florence, in 1877 to prepare a new critical edition of Bonaventure’s works in support of this effort. It appeared in 1882-1902 and remains, with some relatively minor revisions, the foundation of current scholarship.
Bonaventure’s philosophy continued, and still continues, to command considerable interest, particularly among historians of medieval thought and philosophers in the Roman Catholic and other Christian traditions in their effort to distinguish philosophy from theology and develop a metaphysics, epistemology, ethics, and even aesthetics within their respective traditions. Notable examples include Malebranche, Gioberti, and other ontologists who revived a robust Platonism to argue that the human intellect possesses direct access to the divine ideas, Tillich and other Christian existentialists who developed Bonaventure’s epistemology into an existential encounter with the “truth” of the Divine Being, and, most recently, Delio, among others, who have relied on Bonaventure and the wider Franciscan intellectual tradition in their attempt to solve current problems in environmental ethics, health care, and other areas of social justice.
2. The Light of Philosophy
Bonaventure’s reputation as a philosopher was the subject of debate throughout the nineteenth and early twentieth centuries. He never penned a philosophical treatise independent of his theological convictions, such as Aquinas’ On Being and Essence, and he embedded his philosophy within his theological and spiritual treatises to a greater extent than Aquinas, Scotus, Ockham, and other medieval philosophers. Nevertheless, he clearly distinguished philosophy from theology and insisted on the essential role of reason in the practice of theology, the rational reflection on the data of revelation contained in the Christian scriptures. This distinction provides the basis for a successful survey of his philosophy. But its integral role in his larger enterprise, the rational reflection on the data of revelation, requires some degree of reference to fields that normally fall outside the scope of philosophy as practiced today, namely, his theology, spirituality, and, on occasion, even mysticism.
Bonaventure classified philosophy as a rational “light” (lux), a gift from God, “the Father of Lights and the Giver of every good and perfect gift” (Bonaventure, De reductione artium 1). It would prove critical in Bonaventure’s overarching goal to further his reader’s knowledge and love of God, and an indispensable “handmaiden” to theology, the greater “light” and the “queen of the sciences”. It reveals intelligible truth. It inquires into the cause of being, the principles of knowledge, and the proper order of the human person’s life. It is a critical component in a Christian system of education (paideia) with its roots in the thought of Clement of Alexandria, Augustine, Boethius, and Capella, who first delineated the classical system of the seven liberal arts. But it is also important to note that it possessed a wider range of denotation than it does today: according to Bonaventure, philosophy included grammar, rhetoric, and logic; mathematics, physics, and the natural sciences; ethics, household economics, and politics; and metaphysics: in sum, rational investigation into the full extent of the created order and its Creator, independent of the data of revelation.
The light of reason also played a critical role in Bonaventure’s approach to theology (Bonaventure, De reductione artium 5). Alexander and his heirs in the Franciscan school at Paris had pioneered the transformation of theology into a rationally demonstrative science on the basis of Aristotle’s conception of scientia. Bonaventure brought those efforts to perfection in a rigorous causal analysis of the discipline (Bonaventure, 1 Sent. prol., q. 1-4). Its subject is the sum total of all things: God, the First Principle, the absolute first cause of all other things, and the full extent of God’s creation revealed in the scriptures and the long list of councils, creeds, and commentaries on the doctrine contained in its pages. Its method is the “sharp teeth” of rational inquiry, analysis, and argument (Aristotle, Physics 2.9). And its goal is the perfection of the knowledge and love of God that ends in an ecstatic union with God. But for what purpose? Why engage in a rational demonstration of the faith rather than a pious reading of the scriptures? Bonaventure listed three reasons: (1) the defense of the faith against those who oppose it, (2) the support of those who doubt it, and (3) the delight of the rational soul of those who believe it. “There is nothing,” Bonaventure explained, “which we understand with greater pleasure than those things which we already believe” (Bonaventure, 1 Sent. prol., q. 2, resp.). Nevertheless, reason was a handmaiden who knew her own mind. Bonaventure routinely admonished his readers against the “darkness of error” that diminished the light of intelligible truth and led them into the sin of pride: “Many philosophers,” he lamented, “have become fools. They boasted of their knowledge and have become like Lucifer” (Bonaventure, Hexaëmeron 4:1).
The evaluation of Bonaventure’s status as a philosopher has long been closely bound to his attitude toward Aristotle and Aristotle’s commentators. Mandonnet placed Bonaventure within a Neo-Platonic school of thought, largely Augustinian, that rejected Aristotle and his commentators and failed to develop a formal distinction between philosophy and theology (Quinn, Historical Constitution, 17-99). Van Steenberghen argued that Bonaventure relied on a wide range of sources, Platonic, Neo-Platonic, and Aristotelian, for his philosophy, but that it was an eclectic philosophy that served only to provide ad hoc support for his theological doctrines. Gilson argued that Bonaventure developed a Christian Neo-Platonic philosophy, largely Augustinian, distinct from theology but in support of it, and that he did so in a deliberate effort to distance himself from the radical Aristotelians who opposed the doctrinal positions of the Christian tradition. The debate on particular aspects of Bonaventure’s status as a philosopher and his debt to Aristotle and Aristotle’s commentators continues, but current consensus recognizes that Bonaventure developed a distinct and cohesive philosophy in support of his theology, and that he relied on a wide range of sources, Pythagorean, Platonic, Neo-Platonic, Stoic, Augustinian, and even Aristotelian, to do so.
3. The First Principle
Bonaventure began the comprehensive presentations of his philosophy and philosophical theology in the beginning (in principio), in a statement of faith that testifies to the existence of the First Principle (Primum Principium) of Genesis, the God of Abraham, Isaac, and Jacob or, more specifically, God the Father, the first person of the Christian Trinity (Bonaventure, 1 Sent. d. 2, a. 1, q. 1; Breviloquium 1.1; and Itinerarium prol. 1). But he also insisted that this Principle is the fundamental cause of each and every other thing in heaven and earth and so, through the rational reductio of each thing to its efficient, formal, and final cause, this Principle is known to the human intellect independent of divine revelation. It is also common, in some form, to the philosophical traditions of classical antiquity, Pythagorean, Platonic, Neo-Platonic, Peripatetic, and Stoic. Indeed, Bonaventure absorbed much of that heritage in his own exposition of the existence and nature of the One God.
a. The Properties of the First Principle
He also developed a philosophical description of the fundamental properties of the First Principle on the basis of that classical heritage (Bonaventure, Itinerarium 5.5). The First Principle is being itself (ipsum esse). It comes neither from nothing nor from something else and is the absolute first cause of every other thing (Bonaventure, 1 Sent. d. 28, a. 1, q. 2 ad 4). If not, it would possess some degree of potential and not be absolute being. It is also eternal, simple, actual, perfect, and one in the sense of its numerical unity and the simplicity of its internal unification. If not, it would, again, possess some degree of potential and thus not be absolute being. Bonaventure developed slightly different lists of these divine properties throughout his works, but they all share the common root in the concept of absolute being.
b. The Theory of the Forms
Bonaventure’s arguments for the existence of the First Principle depended on his revision of the Platonic theory of the forms and an analysis of truth on the basis of those formal principles. Plato had developed his theory in response to a problem Heraclitus first proposed. All things in the physical realm of being are in a constant state of change. So much so that when we claim to know them, we fail. The things we claim to know no longer exist. They have changed into something else. Our claim of knowledge, then, is at best a fleeting glimpse of the past and a fiction in the present. But Plato argued that we do, in fact, know things and we know some of them with certainty, such as mathematical principles and evaluative concepts like justice. If so, what accounts for them? Plato proposed his theory of forms to answer the question. The forms (eíde) are the paradigmatic exemplars of the things they inform. They exist in a permanent realm of being independent of the things they inform, and they persist in spite of the changes within those things. The mind grasps the forms through its recollection (anamnesis) of them or, in the testimony of Diotima in the Symposium, in an ecstatic vision of those forms in themselves.
But Plato and his successors, notably Aristotle, continued to debate particular aspects of the theory. Do the forms exist within a divine realm of being (ante rem) independent of the things they inform? Do they serve as exemplars so that the individual instantiations of those forms in the physical realm of being imitate them? Do they exist in the things they inform so that those things participate, in some way, in the forms (in re)? Do they exist in the mind that conceives them (post rem)? If so, how does the mind acquire those forms? Or, as later philosophers would argue, are they merely a linguistic expression (flatus vocis)? Or some combination of the above?
Bonaventure relied on Plato’s theory, transmitted through Augustine, and Aristotle’s criticisms of that theory to develop a robust “three-fold” solution that embraced the full spectrum of possibilities (Bonaventure, Itinerarium 1.3; Christus unus magister 18). The forms, he argued, exist eternally in the Eternal Art (ante rem), in the things they inform (in re), and in the mind that apprehends them (post rem), and this included their expression in the speech of the person who apprehends them (flatus vocis). They serve as exemplars, so that the individual instantiations of them in the physical realm of being imitate them and participate in the presence of those forms in re. Finally, the mind acquires them, Bonaventure argued, through the cooperative effort of the rational soul that abstracts them from its sensory apprehension of the things they inform and the illumination of the Eternal Art that preserves the certainty of their truth (Bonaventure, Itinerarium 2.9).
Bonaventure’s commitment to a robust theory of Platonic realism in regard to the forms has earned him the title of the last of the great Platonists of the Middle Ages, but the praise is a thinly veiled criticism. It often implies his endorsement of a dead end in philosophical metaphysics in contrast to more enlightened philosophers, such as Aquinas, Scotus, and Ockham, who would reject a robust Platonism in their anticipation of a more thoroughly rational and naturalistic metaphysics. But it is important to note that Platonism endured. Renaissance Platonists, with the benefit of the full scope of the Platonic corpus, would reinvigorate a tradition that survived and often thrived in subsequent generations and continues to do so, particularly in the field of the philosophy of mathematics, to the present day.
c. Truth and the Arguments for the First Principle
Bonaventure’s arguments for the existence of the First Principle remain impressive for their depth and breadth (Houser, 9). He classified his arguments for the existence of the First Principle on the basis of the metaphysical forms, ante rem, in re, and post rem, and a type of correspondence theory of truth mapped onto the three fundamental divisions of the Neo-Platonic concept of being (esse): the cosmological truth of physical being, the epistemological truth of intelligible being, and the ontological truth of Divine Being (Bonaventure, 1 Sent. d. 8, p. 1, a. 1, q. 1). Cosmological truth depends on the correspondence between an object and its form in the divine mind (ante rem), intelligible truth on the correspondence between an object and its intelligible form in the human mind (post rem), and ontological truth on the correspondence between an object and the form within it (in re) that renders it into a particular type of thing. But the First Principle, the absolute origin of every other thing, does not possess the material principle of potential. It is the “pure” act of being—a concept Bonaventure will develop throughout his arguments. It and It alone is the perfect instantiation of its form, so to speak, and, thus, It and It alone is necessarily true in Itself.
i. The Epistemological Argument
Bonaventure began with the epistemological argument which, he claimed, is certain, but added that the arguments in the other categories are more certain. His initial formulation of the argument asserted that the rational soul, in its self-reflection, recognizes the “impression” (impressio) of the First Principle on its higher faculties, its memory, intellect, and will, and their proper end in the knowledge and love of that First Principle (Bonaventure, Mysterio Trinitatis q.1, a.1 fund. 1-10). The argument is not as viciously circular as it appears. Bonaventure contended that the soul possesses an innate desire for knowledge and love that remains unsated in the physical and intelligible realms of being. These realms, as Bonaventure will explain in more detail in his revision of the argument, possess a degree of potential that necessarily renders them less than fully satisfying. But, as Aristotle had frequently insisted, nature does nothing in vain. Thus, per the process of elimination, the soul finds satisfaction in the knowledge and love of a Divine Being.
He revised the epistemological argument in his later works into a more sophisticated, and less circular, argument from divine illumination (Bonaventure, Scientia Christi q. 4; Christus unus magister 6-10, 18; and Itinerarium 2.9).
The rational soul (mens) possesses knowledge of certain truth.
Bonaventure presumed that the soul possesses certain truth. Paradigmatic examples include the principles of discrete quantity, the point, the instant, and the unit, and the logical axioms of the Aristotelian sciences, such as the principle of non-contradiction. He also cited Plato’s account of the young boy who possessed an innate knowledge of the principles of geometry that enabled him, without the benefit of formal education, to successfully double the area of a square.
The rational soul is fallible and the object of its knowledge mutable and thus fails to provide the proper basis for certain truth.
Bonaventure appears to have some sympathy for Heraclitus’ argument that the world is in a constant state of change. If so, our knowledge of the world at any given time is accurate for only a fleeting passage of time. But the thrust of his argument in support of this premise stems from his conception of truth in relation to his theory of the metaphysical forms that exist in the divine realm of being (ante rem), in the intelligible (post rem), and in the physical (in re). Empirical observation reveals that the physical realm of being is always in some degree of potency in relation to its participation in the metaphysical forms in re. This potential testifies to its existence ex nihilo—its creation from nothing is the root cause of its possession of some degree of potency. Its truth depends on the degree to which it participates in its metaphysical form in re and the degree to which it imitates its form ante rem. Thus, its truth necessarily falls short of its ideal. The soul, also created ex nihilo, also possesses some degree of potency in relation to its ideal. Thus, its abstraction of these imperfect forms from the physical realm of being is fallible.
Therefore, the rational soul relies on a divine “light” to render itself infallible and the object of its knowledge immutable.
Bonaventure divided the sum total of the cosmos into three realms of being without remainder: physical, intelligible, and divine. Thus, per the process of elimination, the soul relies on an “eternal reason” from the divine realm of being for its possession of certain truth.
ii. The Cosmological Argument
Bonaventure relied on Aristotle’s analysis of the concept of being (esse) into being-in-potency and being-in-act for his cosmological argument (Mysterio Trinitatis q. 1, a. 1, fund. 11-20). He began with a series of empirical observations. The physical realm of being possesses a number of conditions that reveal it is being-in-potency. It is posterior, from another, possible, and so on. Its possession of these properties indicates that it depends on something else to account for its existence, something prior, something not from another, something necessary. In fact, its being-in-potency reveals its origin ex nihilo, and it continues to possess some degree of the “nothingness” from which it came—contra Aristotle, who had argued for the eternal existence of a continuous substratum of matter on the basis of Parmenides’ axiom that nothing comes ex nihilo. Thus, it “cries out” its dependence on something prior, something not from another, something necessary. This being cannot be another instance of physical being, since each and every physical thing is posterior, from another, possible, and so on. It cannot be intelligible being, which shares the same dependencies. Thus, again per the process of elimination, it is the Divine Being.
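The cosmological argument runs on the same eliminative pattern. A schematic sketch, again with notation supplied here for illustration rather than drawn from Bonaventure’s text, might run:

\begin{align*}
&P_1: && \exists x\, \mathit{Pot}(x) && \text{empirical observation: the physical realm is being-in-potency} \\
&P_2: && \forall x\,\big(\mathit{Pot}(x) \rightarrow \exists y\,(y \neq x \wedge D(x,y))\big) && \text{whatever is in potency depends on another for its existence} \\
&P_3: && \forall x\,\big(\mathit{Phys}(x) \vee \mathit{Intel}(x) \rightarrow \mathit{Pot}(x)\big) && \text{physical and intelligible beings are posterior, from another, possible} \\
&P_4: && \text{the chain of dependence terminates in something prior, not from another, necessary} && \\
&\therefore && \text{the terminus is neither physical nor intelligible; it is the Divine Being} &&
\end{align*}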
iii. The Ontological Argument
Bonaventure was the first philosopher of the thirteenth century to possess a thorough grasp of the ontological argument and advance its formulation (Seifert, Si Deus est Deus). He did so in two significant directions. First, he provided an affirmative complement to the emphasis on the logical contradiction of the reductio ad absurdum common among traditional forms of the argument (Bonaventure, Mysterio Trinitatis q. 1, a. 1, fund. 21-29 and Itinerarium mentis in Deum 5.3). He began with the first of Aristotle’s conceptual categories of being, non-being, and argued that the concept of non-being is a privation of the concept of being and thus presupposes the concept of being. The next category, being-in-potency, refers to the potential inherent in things to change. An acorn, for example, possesses the potential to become a sapling. The final category, being-in-act, refers to the degree to which something has realized its potential. The thing that had been an acorn is now a sapling. Thus, the concept of being-in-potency depends on the concept of being-in-act. But if so, then the concept of being-in-act depends on a final conceptual category, the concept of a “pure” act of being without potential, and this final concept, Bonaventure argued, is being itself (ipsum esse).
Second, he extended the argument to include the transcendental properties of being: the one, the true, and the good. The concept of the transcendentals (transcendentia) was a distinctive innovation in the effort of medieval philosophers and theologians to reengage the sources of the syncretic philosophical systems of late antiquity, such as Porphyry’s Introduction to Aristotle’s Categories, Boethius’ On the Cycles of the Seven Days—often cited by its Latin title, De hebdomadibus—and Dionysius’ On the Divine Names. Their aim was to identify the most common notions (communissima) of the concrete existence of being (ens) that “transcended” the traditional Peripatetic division of things, or perhaps the names of things, into the categories of substance and its accidents: quantity, quality, relation, and so on. Each and every particular thing that exists in the physical realm of being is one, true, and good. But it is imperfectly one, true, and good. It is one-in-potency, truth-in-potency, and good-in-potency. It depends on one-in-act, truth-in-act, and good-in-act, and thus on the “pure” act of being the one, the true, and the good.
His final step was to locate this being itself, the one, the true, and the good, within the Neo-Platonic division of being. It does not fall within the category of the physical realm of being “which is mixed with potency”. It does exist within the intelligible realm of being, but not entirely so. If it existed in the rational soul and only in the soul, it would exist only as a concept, an intelligible fictum, and thus possess “only a minimal degree of being”. And so, per the process of elimination, being itself is the Divine Being.
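Read as a chain of conceptual dependence, the two parts of the argument can be compressed into a brief sketch; the symbol $\prec$, “conceptually presupposes”, is introduced here for convenience and is not Bonaventure’s:

\begin{align*}
&\text{non-being} \prec \text{being} && \text{privation presupposes what it negates} \\
&\text{being-in-potency} \prec \text{being-in-act} && \text{the acorn is intelligible only relative to the sapling} \\
&\text{being-in-act} \prec \text{pure act}\ (\mathit{ipsum\ esse}) && \text{imperfect act presupposes act free of all potency} \\
&\text{one-, true-, good-in-potency} \prec \text{one-, true-, good-in-act} && \text{the transcendentals repeat the pattern} \\
&\therefore\ \mathit{ipsum\ esse}\ \text{is the Divine Being} && \text{per the elimination over the three realms}
\end{align*}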
d. Epistemic Doubt
Bonaventure insisted that these arguments are self-evident (per se notum) and thus indubitable. But if so, what accounts for doubt or disbelief? Bonaventure located the root of doubt in a series of objective failures: a vague definition of terms, insufficient evidence in support of the truth of propositions, or a formal error in the logical process (Bonaventure, 1 Sent. d. 8, p. 1, a. 1, q. 2 resp.). None of these, he argued, applies in the case of his arguments for the existence of God. Nevertheless, he recognized that some people do deny them. But if so, what accounts for their denial? Either a subjective failure to grasp the terms, the propositions, or the arguments or, perversely, a willful refusal to do so.
4. The Emanation of the Created Order
Bonaventure’s account of creation depended on Plato’s Timaeus, which he read in careful counterpoint with Genesis, as well as Aristotle’s Physics and other texts in natural philosophy. It begins with a revision of Plato’s myth of the Divine Architect (Bonaventure, De reductione artium 11-14). The First Principle, God the Father, similar to Plato’s Divine Architect, carefully studied the metaphysical forms in God the Son, the Eternal Art “in whom He has disposed all things” (Wisdom 11:21). The Father then fashioned the artifacts (artificia) of the created realm of being in imitation of those formal exemplars and declared them good in terms of their utility and their beauty. Finally, He created human beings in the image of Himself, so that they would recognize the presence of the artist in the work of His hands and praise Him, serve Him, and delight in Him.
Bonaventure’s description of the Divine Architect differs from Plato’s in one crucial respect. His fidelity to the orthodox formula of the ecumenical councils compelled him to insist that the First Principle exists in one and the same substance (ousia) with the Eternal Art (Breviloquium 1.2). The distinction between them, in the phrase of the councils, is entirely personal (hypostatic). Bonaventure argued that this distinction is the result of their respective modes of origin. God the Father, the first of the divine hypostases, is the absolute First Principle without beginning, and the Son, the second divine hypostasis, comes from the Father. Philosophically, Bonaventure’s distinction between the two on the basis of origin seems too metaphysically thin to account for a real distinction between them. Nevertheless, his commitment to orthodox monotheism precluded a substantial distinction between the two.
Bonaventure also insisted that the First Principle created the world in time and out of nothing (Bonaventure, 2 Sent. d. 1, p. 1, a. 1, q. 2). Aristotle proposed the first detailed argument for the eternity of the world, and most ancient philosophers, notably Proclus, endorsed it. Jewish and Christian philosophers, notably Augustine, developed the doctrine of creation in time and ex nihilo to oppose it. But the rediscovery of Aristotle’s natural philosophy in the twelfth century revived the debate. Bonaventure was the first of the philosopher-theologians of the later Middle Ages to possess a firm grasp of the classical arguments for and against the proposition, particularly Philoponus’ reflection on the paradoxes of infinity: the impossibility of the addition, order, traversal, comprehension, and simultaneous existence of an infinite number of physical entities (Dales, 86-108). He may not have thought the argument against the eternity of the world on the basis of these paradoxes was strictly demonstrative; nevertheless, he clearly thought that the eternal existence of a created world was an impossibility.
Bonaventure relied on three closely related concepts, the Aristotelian principle of matter, the Stoic concept of causal principles, and the Neo-Platonic concept of metaphysical light, to further develop his account of creation (Bonaventure, Breviloquium 2.1-12). He relied on Aristotle’s principle of matter, distinct from matter in the sense of concrete, physical things, to account for the continuity of those things through change. The metaphysical forms within them rendered them into something particular and directed the changes within them in the course of time. The principle of matter received those forms and rendered those concrete, physical particulars into stable receptacles of change. It was the foundation (fundamentum) of the physical realm of being. But Bonaventure also argued, contra Aristotle, that the rational souls of human persons and other intelligent creatures in the intelligible realm of being, angels and even the devil and his demons, possess the potential to change and thus possess the same principle of matter. Their metaphysical forms rendered them into something intelligible, distinct from other concrete, physical things, but those forms subsist in the same material principle. Bonaventure’s doctrine of universal hylomorphism, in which both physical and intelligible creatures possess the principles of matter and form, is an essential feature in his distinction between the First Principle, who does not change (Malachi 3:6), and Its creation.
Early Jewish and Christian philosophers developed the Stoic concept of causal principles or “seeds” (rationes seminales) to account for the potential within concrete particulars to change. The metaphysical forms that informed those changes exist in potentia within them. They exist as metaphysical seeds, so to speak, that will develop into forms in re in the fullness of time in response to a secondary agent, such as the human artist who, in imitation of the Eternal Art, creates artifacts that are useful and beautiful. Bonaventure insisted on the presence of these metaphysical seeds in his rejection of the doctrine of occasionalism he associated with Ibn Sina, in which the First Principle, and the First Principle alone, is the efficient cause of each and every change in the created realm of being. Bonaventure argued, contra Ibn Sina, that the dignity of the human person, created in the image of God, demands that human persons, too, serve as efficient causes in the created order and cooperate with Him in their effort to know and love Him.
Bonaventure developed his light metaphysics in opposition to Aristotle, who had argued that light was an accidental form that rendered things bright, not a substantial form (Bonaventure, 2 Sent. d. 13). Aristotle’s approach accounted for bright things, such as the sun, the moon, and the stars, but did not allow for the existence of light in itself. Bonaventure favored the less popular but more syncretic approach of Grosseteste, who argued that a metaphysical light (lux) was the substantial form of corporeity. It bestowed extension on physical things and rendered them visible—the fundamental properties of all physical things. It also prepared them for further formation. Bonaventure was a proponent of the “most famous pair” (binarium famosissimum), the conjunction of the theses of (near) universal hylomorphism and the plurality of metaphysical forms that distinguished the Franciscan school of thought throughout the thirteenth century. He argued that all created things possess the metaphysical attributes of matter and form—his advocacy of the doctrine of divine simplicity precluded his application of the thesis to the Divinity. He also argued that a series of forms determined the precise nature of each thing. The form of light (lux) was common to all physical things, but other forms rendered them into particular types of physical things according to an Aristotelian hierarchy of forms: the nutritive form common to all living things, the sensory form common to all animals, and the rational form, or soul, which distinguishes the human person from other terrestrial creatures.
The First Principle wove these threads together in Its creation of the corporeal light (lumen) on the first day of creation. This light was a single, undifferentiated physical substance in itself, not an accidental property of something else, extended in space and, in potential, time. It possessed the inchoate forms that would guide its further development and stood, the principle of matter made manifest, as the cornerstone of the physical cosmos.
The First Principle divided this primal light into three realms of physical being (naturae), the luminous on the first day of creation, the translucent on the second, and the opaque on the third, and then filled them with their proper inhabitants on the subsequent days of creation. The luminous realm consists of the purest form of metaphysical light (lux) and corresponds to the heavens, bright with the light of its form and a modest amount of prime matter. The translucent realm consists of air and, to a lesser extent, water, and contains a less pure degree of the primordial light in its mixture with prime matter. The opaque consists of earth and contains the least pure degree of light in its mixture. He relied on Aristotelian cosmology to further divide the cosmos into the empyrean heaven, the crystalline heaven, and the firmament; the planetary spheres, Saturn, Jupiter, Mars, the Sun, Venus, Mercury, and the Moon; the elemental natures of fire, air, water, and earth; and, finally, the four qualities, the hot, the cold, the wet, and the dry, the most basic elements of Aristotelian physics. The heavenly spheres, he explained, correspond to the luminous realm. The elemental natures of air, water, and earth correspond to the sublunar realms and contain the birds of the air, the fish of the sea, and each and every thing that crawls on the earth. Fire is a special case. Although elemental, it shares much in common with the luminous, and thus consumes the air around it in its effort to rise to the heavens.
The process came to its end in the formation of the human person in the image of God. Bonaventure adopted a definition of the human person common among Jewish, Christian, and Islamic philosophers and theologians throughout the Middle Ages: the human person is a composite of a soul (anima) and body, “formed from the mire (limus) of the earth” (Breviloquium 2.10). The human soul is the metaphysical forma of its body. It perfects its body in so far as its union with its body brings the act of creation to its proper end in the formation of the human person, the sum of all creation, in the image of God. It then directs its body in the completion of its principal task, to enable the human person to recognize creation’s testimony to its Creator so that it might come to its proper end in union with its Creator.
He distinguished his definition of the human composite from those of his peers in his juxtaposition of two convictions that initially seem to oppose one another: the ontological independence of the soul as a substantial, self-subsisting entity and the degree to which he emphasized the soul’s disposition to unite with its body. Plato and his heirs who had insisted on the soul’s substantial independence tended to denigrate its relationship with the body. Plotinus’ complaint is indicative if hyperbolic: Porphyry, his biographer, tells us that he “seemed ashamed of being in the body” (Porphyry, On the Life of Plotinus 1). Bonaventure rejected this tendency. He agreed that the soul is an independent substance on the basis of his conviction that it possesses its own passive potential. It is able to live, perceive, reason, and will independently of its body in this life and the next and, after its reunion with a new, “spiritual” body, in its eternal contemplation of God. The soul, Bonaventure insisted, is something in itself (Bonaventure, 2 Sent. d. 17, a. 1, q. 2). The human spirit is a fully functioning organism with or without its corporeal body.
But he also argued that the soul is the active principle that brings existence to the human composite in its union with its body and enables it to function properly in the physical realm of being (Bonaventure, 4 Sent. d. 43, a. 1, q. 1, fund. 5). Thus, the soul possesses an innate tendency to unite with its body (unibilitas). The soul is ordered to its body, not imprisoned within it. It realizes its perfection in union with its body, not in spite of it; and with its body, it engages in the cognitive reductio that leads to its proper end in the knowledge of God and ecstatic union with God. Its relationship with its body is so intimate that it no longer functions properly at the moment of its body’s death. It yearns for its reunion with its risen body in the world to come, a clear, impassible, subtle, and agile body that furthers its access to the beatific vision.
5. The Epistemological Process
Bonaventure began his account of the epistemological process with a classical theme common throughout the Middle Ages: the human person, body and soul, is a microcosm (minor mundus) of the wider world (Bonaventure, Itinerarium 2.2). Its body consists of the perfect proportion of the fundamental elements that comprise the physical realm of being, the primordial light (lux) of the first day of creation that regulates the composition of the other four elements: the earth that renders the body into something substantial, the air that gives it breath, the bodily fluids that regulate its health, and the fire that instills the physiological basis for its passions. Its soul (anima) renders it into the most well-developed of all creatures in its capacities for nutrition, which it shares with the plant kingdom, sensation, which it shares with the animal kingdom, and reason, which belongs to rational creatures alone. But above all, it possesses the capacity to know all things throughout the full extent of the created order, the luminous, translucent, and opaque realms of the cosmos.
Bonaventure divided the epistemological process through which the rational soul (mens), the definitive aspect of the human person, comes to know all things into three distinct stages: apprehension, delight, and rational judgment.
a. Apprehension
Bonaventure’s theory of sense apprehension (apprehensio) depended on the current state of the physical sciences in the early thirteenth century: psychology, biology, physiology, neurology, and physics. He located the start of the process in the rational soul’s sensory apprehension of the physical realm of being. But Bonaventure was not an empiricist. He admitted the soul possesses the rational principles that enable it to reason in its estimation of the physical realm of being and the ability to know itself and other intelligible beings, namely, angels, the devil, demons, and the Divine Being. The internal sense is the first to engage the physical realm of being. It determines the degree of threat the physical realm poses and thus serves to protect the soul and its body from harm. The next series of senses includes the more familiar senses of sight, hearing, smell, taste, and touch. Each sense is a “door” (porta) that opens onto a particular element or combination of elements. Sight opens to the primordial light of the first day of creation. Hearing opens to air; smell to “vapors”, an admixture of air, water, and fire that is the remnant of the elemental particle of heat; taste to water; and touch to earth. Each sense also apprehends common aspects of physical things: their number, size, physical form, rest, and motion.
Bonaventure insisted that, for most things, the human person invokes each of the senses in tandem with the others. Each sense opens onto particular properties inherent within physical things, and when the rational soul applies them in conjunction with one another, they provide a comprehensive grasp of the universe in its totality. Some things, like the morning star, remain so bright, so pure, that they are accessible only to sight. But most things contain a more thorough mixture of the primordial light of creation and the more substantial elements of earth, air, fire, and water that in themselves consist of the fundamental particles of medieval physics: the hot, the cold, the wet, and the dry. The rational soul’s apprehension of the macrocosmus as a whole demands the use of all its senses.
Bonaventure’s metaphor of the door reveals his debt to an Aristotelian intromission theory of sense perception. The sense organs, the eyes, ears, nose, and so on, are passive, but the metaphysical light within the objects of the senses renders them into something active. They shine, so to speak, and impress a likeness (similitudo) of themselves onto the media that surround them, the light, fire, air, water, or in some cases, earth. Each impression is an exemplum of an exemplar, like the wax impression of a signet ring. These impressions in the media impress another exemplum of themselves onto the exterior sense organs, the eyes, ears, nose, and so on. These impress an image of themselves onto the inner sense organs of the nervous system and, finally, onto the apprehension, the outermost layer of the mind. The physical realm of being is filled with the “brightness” of its self-disclosure, “loud” with its cries, pungent, savory, and tangible. The soul cannot escape its self-disclosure. It remains in “tangible” contact with each and every thing it apprehends—albeit through a series of intermediary impressions.
These sensory species or similitudines within the soul’s apprehension are “intentions” in the sense of signs, rarefied, information-bearing objects within the soul, not merely the soul’s awareness of the objects of its apprehension. They contain information about the way things look, sound, smell, taste, and feel, information about their size, shape, and whether they are at rest or in motion, hic et nunc, the imprint of the concrete reality of physical being. But the soul’s apprehension of them is an apprehension of the impression of a series of impressions of the object, not the object in itself, and the subtle decline in the accuracy of each impression accounts for the errors of perception.
b. Delight
Bonaventure emphasized the role of delight (delectatio) in this process to a greater extent than his peers (Lang, Bonaventure’s Delight in Sensation). He identified three sources of the rational soul’s delight in its apprehension of the “abstracted” impression: the beautiful (speciositas), the agreeable (suavitas), and the good (salubritas). (It is clear from the context of the passage that the soul delights in its “abstraction” of a sensory impression at this stage of the process, not in its abstraction of a metaphysical form from those sensory impressions.) His immediate source for the innovation was the incipit of Aristotle’s Metaphysics: “All men [and women] naturally desire to know” and the further claim that the “delight” they take in their senses is evidence of that desire (Aristotle, Metaphysics 1.1). But he derived his classification of those delights from a long tradition of Pythagorean, Platonic, and Peripatetic texts on the proper objects of natural desire, distilled in Ibn Sina’s On First Philosophy.
The first of these three, the soul’s delight in its apprehension of beauty, was wholly original. It does not appear in the pertinent sources. It refers to the “proportion” (proportio) between the sensory impression and its object, the sensory impression of a sunset, for example, and the sunset itself. Thus, beauty is subject to truth. The greater the degree of similarity between the sensory impression of an object and the object, the greater the degree of its truth and thus its beauty. The second, agreeableness, was common in the pertinent sources. It refers to the proportion between the sensory impression and the media through which it passes. A pleasant light, for example, is proportional to its media. A blinding light is disproportional. The third, goodness, was also common. It refers to the proportion between a sensory impression and the needs of the recipient, like a cool glass of water on a hot day.
Bonaventure aligned particular delights with particular senses, beauty with sight, agreeableness with hearing or smell, and goodness with taste or touch, but he did so “through appropriation in the manner of speech” (appropriate loquendo), that is, to “speak” about what is “proper” to each sense, not what is exclusive to each of them. The soul’s delight in the beauty of an object is most proper to sight, not restricted to it, and so, too, for the other forms of delight and their proper senses. The soul can also delight in the beauty of the sound of well-proportioned verse, the smell of a well-proportioned perfume, the taste of well-proportioned ingredients, or even the touch of a well-proportioned body. The soul is able to access beauty through all its senses, and the loss of one or more of them does not deny it the opportunity to delight in the beauty of the world.
Finally, it is important to note that he distinguished between the beautiful, agreeable, and good properties within the soul’s apprehensions of things, not beautiful, agreeable, or good things. The same objects are, at once, beautiful, agreeable, and good.
The similarity between Bonaventure’s distinction of the soul’s delight in speciositas, suavitas, and salubritas and Kant’s seminal distinction of the beautiful (das Schöne), the agreeable (das Angenehme), and the good (das Gute) is striking (Kant, Critique of Judgment 5). But the lists are not coordinate. Bonaventure’s concept of suavitas is comparable to Kant’s concept of das Angenehme, but his conceptualization of speciositas does not correspond to Kant’s das Schöne, nor his conceptualization of salubritas to Kant’s das Gute. Bonaventure’s conceptualization of the soul’s pleasure in the apprehension of beauty, speciositas, depends on the degree of correspondence between the soul’s apprehension of the sensible species and its proper object, not on the free play of the mind’s higher cognitive faculties, the intellect and the imagination; and his conceptualization of the pleasure in salubritas depends on the wholesomeness of the object, not on the degree of esteem or approval we place upon it. Nor is there evidence of Kant’s familiarity with Bonaventure’s text. The best explanation for the similarity is that Kant relied on the common themes of antiquity, namely, the beauty of proper proportions and the pleasure in the contemplation of them, and perhaps on common texts, not on Bonaventure’s direct influence.
c. Judgment
Bonaventure brought his account of this epistemological process to completion in its third stage, rational judgment (diiudicatio). It is in this stage, and only this stage, that the soul determines the reason for its delight, and it does so in its abstraction of concepts, the metaphysical forms in re, from the sensory species in its apprehension. He developed an innovative two-part account of the rational soul’s abstraction of the metaphysical forms from the sensible species: an Aristotelian abstraction theory of concept formation and a Platonic doctrine of divine illumination that rendered its judgments on the basis of those species certain. Most other philosophers depended on one or the other, or principally one or the other, not both.
i. The Agent Intellect
Bonaventure depended on the common distinction between two fundamental powers (potentiae) of the intellect for his development of an abstraction theory of concept formation: the active power (intellectus agens) and the passive (intellectus possibilis). The active power abstracts the intelligible forms from the sensory species and impresses them on the intellect. The passive power receives the impressions of those intelligible forms. But he also insisted that the agent power subsists in one and the same substance (substantia) with the possible. He did so to distinguish his theory from those who identified the active power with a distinct substance, the Divine Being or a semi-divine intelligence. Such an identification would render the human person entirely passive in its acquisition of knowledge and reduce its dignity as a rational creature in the image of God. Thus, the phrase “intellectus agens” refers to a distinction (differentia) in the action of one and the same intellectual faculty. It is a natural “light” that “shines” on the intelligible properties of the sensible species and reveals them. It makes them “known” and then “impresses” them upon the intellect. It also depends on the potential of the intellect to do so. The agent power, in itself, cannot retain the impression of the forms, and the intellect’s potential, in itself, is unable to abstract them. The intellect requires the interdependent actions of both its active and passive powers to function properly.
ii. Divine Illumination
Bonaventure insisted that the rational soul possesses the innate ability to abstract the intelligible species from the impressions of its sensory apprehension and thus come to know the created order without the assistance of the Divine Being or other, semi-divine intelligences. But he also insisted that the soul requires the assistance of the illumination of the forms in the Eternal Art (ante rem) to do so with certitude.
Bonaventure presented his doctrine of divine illumination in the context of his epistemological argument for the existence of God. The rational soul is fallible, and the object of its knowledge in the physical realm of being is mutable. Thus, it relies on a divine “light” that is infallible and immutable to render its abstraction of the metaphysical forms from its sensory apprehension infallible and the object of its knowledge immutable. But precisely how this occurs has been the subject of a wide range of debate (Cullen, Bonaventure 77-87). Gioberti had placed Bonaventure within the tradition of Malebranche and other advocates of ontologism, who had argued that the soul has direct access to the divine forms as they exist in the Eternal Art ante rem. Portalie had placed him within the tradition of Augustine who, so Portalie insisted, had argued that the Eternal Art impresses the forms directly on the rational soul. (Note, however, that Augustine’s theory of illumination is also the subject of a wide range of debate.) Gilson argued for a formalist position in which the rational soul depends on the light of the divine forms to judge the accuracy, objectivity, and certainty of its conceptual knowledge, but denied its role in the formation of concepts.
Gendreau pioneered the current consensus that endorses an interpretative synthesis between Portalie’s reading of Bonaventure’s doctrine and Gilson’s (Gendreau, The Quest for Certainty). Bonaventure explicitly affirmed that the “light” of the forms in the Eternal Art, but not the forms as they exist in the Eternal Art ante rem, “shines” on the soul to “motivate” and “regulate” its abstraction of the intelligible forms from its sensory apprehension of the physical realm of being. But he explicitly denied that it is the “sole” principle in “its full clarity”. Furthermore, if the soul possessed direct access to the divine forms in the Eternal Art (ante rem) or if that Art impressed those forms directly onto its faculties (post rem), there would be no need for the agent power of the intellect to abstract the forms from the sensory species (in re). Bonaventure would have undermined his careful effort to delineate the subtle distinctions between the agent power of the intellect and the possible in their role in concept formation.
Thus, Bonaventure developed a cooperative epistemology in which the Eternal Art projects some type of image of the divine forms ante rem onto the rational soul’s higher cognitive faculties, its memory, intellect, and will. But this projection is not the “sole” principle of cognition. The rational soul “sees” its abstraction of the intelligible forms within itself (post rem) in the “light” of the projection of the forms in the Eternal Art, and this light enables it to overcome the imperfection of its abstraction of the intelligible forms and renders its knowledge of them certain—although it does not see the forms ante rem in themselves, that is, in their full clarity. Nevertheless, Bonaventure did not think that the projection of this light of the Eternal Art was fully determinative. The soul could and would occasionally err in its judgment of the intelligible forms within itself, whether through natural defect or willful ignorance, even in the light of the projection of the Eternal Art.
6. Moral Philosophy
The goal of Bonaventure’s moral philosophy is happiness, a state of beatitude in which the soul satisfies its fundamental desire to know and love God (Bonaventure, 4 Sent. d. 49, p. 1, a. 1, q. 2). He admitted that the human person must attend to the needs of its body—at least in this lifetime. His emphasis on the virtue of charity and the care of the lepers, the poor, and others in need, in body and soul, attests to that commitment (Bonaventure, Legenda maior 1.5-6). But while necessary, the satisfaction of these physical needs remains insufficient. The rational soul is the essential core of the human person and thus the soul and its needs set the terms for its happiness in this life and the next.
The structure of the rational soul establishes its proper end. Its rational faculties, its memory, intellect, and will, work in close cooperation with one another in its effort to know and love the full extent of the cosmos in the physical realm of being, the intelligible, and the divine until it comes “face to face” with the divine in an ecstatic union that defies rational analysis. Bonaventure readily admitted that the soul finds delight in its contemplation of the physical realm of being. Indeed, he encouraged the proper measure of the soul’s delight in “the origin, magnitude, multitude, plenitude, operation, order, and beauty” of the full extent of the physical realm of being (Bonaventure, Itinerarium 1.14). But, Bonaventure argued, the soul’s knowledge and love of the physical realm of being fails to provide full satisfaction. Even if the soul could plumb the full extent of its depths, it would still fail to satisfy. The physical realm of being comes from nothing (ex nihilo) and is therefore fundamentally nothing in itself. “Everything is vanity (vanitas), says the teacher” (Ecclesiastes 1:2). It is fleeting (vanum) and vain (vanitas) and, at most, provides a fleeting degree of satisfaction (Bonaventure, Ecclesiastae c. 1, p. 1, a. 1). So, too, the intelligible realm. Even if the soul could come to a full comprehension of itself, it comes from nothing. Thus, per the process of elimination, the soul finds its satisfaction only in its knowledge and love of the Divine Being, Being Itself, the Pure Act of Being, first, eternal, simple, actual, perfect, and unsurpassed in its unity and splendor.
Bonaventure reinforced this argument with a careful analysis of Aristotle’s conception of happiness in the first book of the Nicomachean Ethics. He pointed out that the Philosopher had defined happiness as the soul’s practice or possession of its proper excellence (areté). Thus, the human person, precisely as a rational animal, finds happiness in its rational contemplation of truth and, in particular, the highest truth of the first, eternal, immutable cause of every other thing, the contemplation of the Thought that Thinks Itself.
Bonaventure accepted much of Aristotle’s account of happiness, but relied on Augustine’s critique of eudaimonism in the City of God to point out three critical errors: (1) it lacks the permanence of immortality, (2) the contemplation of abstract truth is insufficient, and (3) the soul cannot attain its proper end in itself (Bonaventure, Breviloquium 2.9). He argued, contra Aristotle, that the soul’s perfect happiness is found in its eternal knowledge and love of the highest truth in an ecstatic union with the concrete instantiation of that truth in the Divine Being, not merely the rational contemplation of that Divine Truth. He also argued, contra Aristotle, that the soul relies on the assistance (gratia) of the Divine Being to ensure that it comes to its proper end in union with the Divine Being.
Bonaventure proposed two means to attain this end. The first was the rational reductio of the physical realm of being and, through self-reflection, the intelligible, to its fundamental causes, efficient, formal, and final, in the First Principle. The second was the practice of the virtues, the moral counterpart to the soul’s rational ascent in its effort to know and love the First Principle and come to its proper end in union with that Principle.
Bonaventure relied primarily on Aristotle to derive his definition of virtue in its proper sense: a rationally determined voluntary disposition (habitus) that consists in a mean [between extremes] (Bonaventure, 2 Sent. d. 27, dub. 3). The definition requires some explanation. First, Bonaventure insisted that the higher faculties of the rational soul, its memory, intellect, and will, work in close cooperation with one another to exercise the free decision of its will (liberum arbitrium). The process begins in the memory, the depths of the human person, in which the First Principle infused the dispositions of the virtues. The process continues in the intellect that, with the cooperation of the light of divine illumination, recognizes those dispositions and directs the will to act on them. The process comes to its end in an act of the will that freely chooses to put them into practice.
Second, Bonaventure argued that all the virtues reside in the rational faculties of the soul, its memory, intellect, and will, and not in its sensory appetites, such as its natural desire for those things that benefit its health. He provided a number of reasons to support his claim, but two are of particular importance. First, some of the virtues may be prior to others in the sense that love, discussed below, is the form of all the other virtues, but all of them are equal in the degree to which they provide a source of merit. Bonaventure had argued that the rational soul is unable to reform itself. Thus, it relies on divine grace to help it reform itself in a process of cooperative development in which the soul’s efforts merit divine assistance. Second, the rational faculties render free decision possible, and free decision, in cooperation with divine grace, is the essential criterion for merit.
Finally, Bonaventure’s insistence on the mean between extremes in the practice of virtue corrects the tendency to read him and other medieval philosophers through the lens of a dichotomy in which the pilgrim soul must choose between heaven and earth. Bonaventure, as mentioned, encouraged the rational soul to delight in the physical realm of being in its proper measure. Nevertheless, his conception of this proper measure is often rather closer to dearth than excess. He practiced a degree of asceticism that, while not as extraordinary as that of his spiritual father, St. Francis, exceeded even the standards of his own day. His conception of a middle way consisted in the minimum necessary for sustenance and the practice of one’s vocation (Bonaventure, Hexaëmeron 5.4). He provided a memorable rebuke to illustrate this standard. In response to the criticism that a person requires a modest degree of possessions to practice the mean between the extremes of dearth and excess, Bonaventure replied that having sexual intercourse with half the potential partners in the world is hardly the proper mean between having intercourse with all of them and none. One, he argued, would suffice.
Bonaventure also argued, contra Aristotle, that virtue in its most proper sense referred to an infused disposition of the soul in fidelity to the Platonic tradition passed down through Augustine and the orthodox doctrines of the Christian theological tradition. But this posed a problem. Is Aristotle correct in his claim that virtue is an acquired disposition of the soul or is Augustine correct? Bonaventure’s solution is not entirely clear. He appeared to reject Aristotle on this point and almost every other in the critical edition of his final but unfinished treatise, the Collations on the Six Days of Creation. But DeLorme has edited another reportatio of the Collations in which Bonaventure provided a more subtle critique of Aristotle and his commentators. The weight of evidence suggests that DeLorme’s edition of the Collations is more accurate. Bonaventure appears to have argued that virtue is an infused disposition of the soul, but it does not fully determine the free decision of its will. Rather, the First Principle plants the seeds (rationes seminales) of virtue in the soul, and that soul must carefully cultivate them, with the further assistance of divine grace, to bring them to fruition.
Bonaventure presented long lists of virtues, gifts of the spirit, and beatitudes in the development of his moral philosophy (Bonaventure, 3 Sent. d. 23-36; Breviloquium 5.4; Hexaëmeron 5.2-13 and 6.6-32). These include: the theological virtues, faith, hope, and love; the cardinal virtues, justice, temperance, fortitude, and prudence; the intellectual virtues of science, art, prudence, understanding, and wisdom—some virtues, such as prudence, appear in more than one category; the gifts of the spirit, fear of the Lord, piety, knowledge, fortitude, counsel, understanding, and wisdom; and the beatitudes, poverty of spirit, meekness, mourning, thirst for justice, mercy, cleanliness of heart, and peace. The authors of the secondary literature on Bonaventure’s moral philosophy tend to restrict themselves to the virtues, but the distinction between them and the other dispositions of the soul is slight. Bonaventure derived the theological virtues from the scriptures and placed them in the same broad category as the cardinal and intellectual virtues. Furthermore, all three of the categories—the virtues, gifts, and beatitudes—dispose the soul to the rational consideration of the mean, and they do so to order the soul to its proper end.
Bonaventure insisted that all of the virtues and, by extension, the gifts and beatitudes, retain the same degree of value in relation to their end. Nevertheless, some of them are more fundamental than others, namely, love, justice, humility, poverty, and peace.
Love is the first and most important of the virtues (Bonaventure, Breviloquium 5.8). It is the metaphysical form of the other virtues and common to all of them. It brings them into being and renders them effective. Without love, the other virtues exist in the rational soul in potentia, and thus fail to dispose the soul’s will to its proper end. It also provides the fundamental impetus (pondus inclinationis) that inclines the affections of the will to the First Principle, itself, others, and, finally, its body and the full extent of the physical realm of being. Bonaventure’s ethics, like Francis’, includes a substantial degree of regard for the wider world in itself and as a sign (signum) that testifies to the existence of the First Principle in its causal dependence on that Principle.
Justice is the “sum” of the virtues (Bonaventure, De reductione artium 23). It inclines the will to the good. It further refines the proper order of its inclination to the First Principle, itself, others, and the physical realm of being and, finally, it establishes the proper measure of its affection to the First Principle, itself, and others.
Humility is the “foundation” of the virtues and the principal antidote to pride (Bonaventure, Hexaëmeron 6.12). It is the soul’s recognition that the First Principle brought it into being ex nihilo and that it retains an inherent nothingness (nihilitas). It thus enables the will to overcome its inordinate love for itself and love the First Principle, itself, and others in their proper order and measure.
Bonaventure relegated poverty, the most characteristic of Francis’ virtues, to a subordinate position relative to love and the other virtues to correct the tendency of some of his Franciscan brothers and sisters to take excessive pride in their practice of poverty (Bonaventure, Perfectione evangelica q. 2, a. 1). Bonaventure encouraged the practice of poverty, but argued that it is the necessary but insufficient instrumental cause of love, humility, and all the other virtues, not an end in itself. It corrects the tendency to cupidity, the narcissistic cycle in which the soul’s regard for itself dominates its regard for other things. Indeed, it is poverty that is the mean between extremes and not, as the opponents of the mendicant orders had argued, the violation of the mean.
Peace is the disposition of the will to its final end (Bonaventure, Triplica via 7). The disorder of the soul leads to conflict between the soul and the First Principle, itself, others, and its body. The practice of love, justice, and the long list of virtues, gifts, and beatitudes restores the proper order of its will and dissolves that conflict. Peace is the result of that effort. It is the tranquility of the perfection of the rectitude of the will. It is the state of the soul’s complete satisfaction of its desires in its union with the First Principle.
Bonaventure delineated the soul’s progress in its practice of these virtues, gifts, and beatitudes in his reformulation of the Neo-Platonic process of the triple way (triplica via): the purgation, illumination, and perfection that render the soul fit for its proper end in ecstatic union with the First Principle (Bonaventure, Triplica via 1). The first stage consists in the purgation of sin, in which the practice of the virtues rids the soul of its tendencies toward vice, for example, the practice of love in opposition to greed, justice to malice, fortitude to weakness, and so on. The second stage consists in the imitation of Christ, Francis, and other moral exemplars. He authored a number of innovative spiritual treatises in which he asked his readers to contemplate the life of Christ, Francis, and others who modeled their lives on Christ, and then to imagine their participation in the life of Christ, to imagine that they, too, cared for the lepers, for example, to foster their practice of the virtues (Bonaventure, Lignum vitae, prol. 1-6; Legenda maior, prol. 1.5-6). The third and final stage consists in the perfect order of the soul in relation to the First Principle, itself, others, its body, and the full extent of the physical realm of being. It restores the soul’s status as an image of the First Principle (deiformitas) and renders it fit for union with that Principle in the perfection of its well-ordered love (Bonaventure, Breviloquium 5.1.3).
Bonaventure’s reformulation of this hierarchic process differed from its original formulation in the Neo-Platonic tradition in three significant ways. First, the original process had been primarily epistemic and referred to the rational soul’s purgation of the metaphysical forms from the physical realm of being (in re), its illumination of those forms in the intelligible realm of being (post rem), and the perfection of those forms in the divine realm (ante rem). Second, he allotted a more significant role to the imitation of Christ and other moral exemplars in the process than even his predecessors in the Christian tradition. Finally, he insisted that the soul progresses along the three ways simultaneously. The soul engages in purgation, illumination, and perfection throughout its progress in its effort to reform itself into the ever more perfect image of the First Principle.
7. The Ascent of the Soul into Ecstasy
“This is the sum of my metaphysics: It consists in its entirety in emanation, exemplarity, and consummation, the spiritual radiations that enlighten [the soul], and leads it back to the highest reality” (Bonaventure, Hexaëmeron 1.17).
Bonaventure had argued that the rational soul’s proper end is union with its Creator. But he also argued that the rational soul, created ex nihilo, possesses a limit to its intellectual capacities that prevents the application of its proper function, reason, in the full attainment of its proper end in union with God. The human mind, Bonaventure argued, cannot fully comprehend its Creator.
Plotinus and his heirs in late antiquity, principally Proclus, developed an elegant three-part formula that provided Bonaventure with the raw material to resolve the dilemma: It began with (1) the existence of the First Principle, the One (to Hen), the foundation of the Neo-Platonic cosmos, continued in (2) the emanation (exitus) of all other things from the First Principle, and ended in (3) its recapitulation (reditus) into the First Principle.
Bonaventure did not possess direct access to the formula. He relied on Augustine, Dionysius, and Dionysius’s heirs in the medieval west, Hugh, Richard, and Thomas Gallus of the School of St. Victor, to access the formula and refine it into a viable solution. He contracted the first two movements of the process into one, “emanation” (emanatio), and reformulated his contraction in two significant ways (Bonaventure, Hexaëmeron 1.17). First, Plotinus and other classical Neo-Platonists envisioned a linear exitus: the First Principle “expresses” Itself in a series of distinct hypostases, the Nous and the Psyché, and in their further expression into the intelligible and physical realms of being from eternity (ab aeterno). Bonaventure divided that exitus into two distinct movements: (1) the “emanation” of the First Principle (Principium) that exists in one substance with Its Eternal Art and Spirit ab aeterno and (2) the further “emanation” of that First Principle, in its perfect perichoresis—its reciprocal coinherence—with Its Art and Spirit, in Its creation of the intelligible and physical realms of being in time and ex nihilo.
Second, he interposed a middle term, so to speak, in the process, “exemplarity” (exemplaritas). The created realm of being exemplifies its origins in the First Principle in Its perfect perichoresis with Its Art and Spirit through a carefully graded series of resemblances (Bonaventure, 1 Sent. d. 3). The first degree of its resemblance, the shadow (umbra), exemplifies its indeterminate causal dependence on the First Principle. The second degree, the vestige (vestigium), exemplifies its determinate causal dependence, efficient, formal, and final, on the First Principle. The third degree, the image (imago), exemplifies its explicit dependence on the First Principle in Its perfect perichoresis with Its Eternal Art and Spirit. Bonaventure would abandon the first degree of resemblance, the shadow, in his later works and introduce a fourth, the moral reformation of the soul into a more perfect image, the similitude (similitudo), fit for union with the First Principle.
Bonaventure also reimagined the final stage of the Neo-Platonic process as a “consummation” (consummatio) that consisted of two movements: (1) the soul’s recognition of the carefully graded series of resemblances, the “spiritual radiations that enlighten the soul” and testify to its causal dependence on the “highest reality” of the First Principle, in Its perfect perichoresis with Its Eternal Art and Spirit, and (2) its transformation into a more perfect image (similitudo) of the First Principle that fits it for union with that Principle, Its Art, and Spirit. Thus, he explained, the process curves into itself “in the manner of an intelligible circle” and ends in principium (Bonaventure, Mysterio Trinitatis q. 8 ad 7).
Bonaventure provided a particularly rich account of his reformulation of this Neo-Platonic process in his most celebrated text, the Itinerarium mentis in Deum. It is a difficult text to categorize. It is a philosophical text, but not exclusively so: it is steeped in a Neo-Platonic Christian tradition that relies heavily on the data of revelation contained in the Christian scriptures and the spiritual practices of the thirteenth century to construct a Platonic Ladder of Love in the context of that syncretic tradition. Bonaventure’s distinction between philosophy and theology provides the means to distinguish the philosophical core of the text from its theological setting—with occasional reference to its theological dimensions to provide a comprehensive analysis of each rung of that ladder.
Bonaventure derived the initial division of the rungs of that ladder from the Neo-Platonic division of the cosmos that permeates so much of his thought: the rational soul’s contemplation of the vestige (vestigium) of the First Principle in the physical realm of being (esse), its contemplation of the image (imago) of the First Principle in the intelligible realm of being, and its contemplation of the First Principle in Itself in the divine realm of being that prepares it for union with that Principle. The path is a deft harmony of Dionysius’ contrast between the soul’s cataphatic contemplation of creation—in which it applies its intellect—and its apophatic contemplation of the divine—in which it suspends its intellect in mystical union (McGinn, “Ascension and Introversion”). The soul moves from its contemplation of the physical realm of being outside itself, to its contemplation of the intelligible realm of being within itself, and ends in the contemplation of the divine above itself.
He further subdivided each of these three stages of contemplation into two, for a total of six steps. The first step in each stage focuses on that stage’s testimony to (per) the First Principle, the second on the presence of the First Principle in that stage. This pattern dissolves in the soul’s contemplation of the First Principle in Itself in the third stage. The first step of this stage focuses on the contemplation of the First Principle as the One God of the Christian tradition. The second step focuses on the contemplation of the emanation of the One God in Three Hypostases or, more commonly, Persons. The ascent comes to its end in a seventh step in which the soul enters into an ecstatic union with the First Principle in Its perfect perichoresis with Its Eternal Art and Spirit. The philosophical core of the text is particularly apparent on steps one, three, and five, and the theological core on steps two, four, and six. The two come together on the seventh step.
It is also important to reiterate that Bonaventure insisted on the necessity of grace for the soul to achieve its goal contra Plotinus and his immediate heirs in classical antiquity, Porphyry, Iamblichus, and Proclus. Thus, Bonaventure included a series of prayers and petitions to the First Principle, the incarnation of the Eternal Art in the person of Christ, St. Francis, Bonaventure’s spiritual father, and other potential patrons to “guide the feet” of the pilgrim soul in its ascent into “that peace that surpasses all understanding” (Philippians 4:7) in its union with the First Principle.
The first step of the soul’s ascent consists in its rational reductio of the vestige of the physical realm of being to its efficient, formal, and final cause in the First Principle. Bonaventure relied on yet another reformulation of a Neo-Platonic triad to align each of these causes with particular properties of the First Principle: the power of the First Principle as the efficient cause that created the physical realm of being ex nihilo, the wisdom of the First Principle as the formal cause that formed the physical realm of being, and the goodness of that Principle as the final cause that leads it to its proper end in union with Itself. The rational soul relies on the testimony of the entire physical realm of being to achieve this union, “the origin, magnitude, multitude, plenitude, operation, order, and beauty of all things” (Bonaventure, Itinerarium 1.14), even though the rest of that realm will end in a final conflagration (Bonaventure, Breviloquium 7.4). It will have served its purpose and will persist only in the memory of rational beings.
Bonaventure paired this philosophical argument with an analogy that takes the reader into the theological dimensions of the text: the power, wisdom, and goodness of God suggest some degree of distinction within the First Principle. The power of the First Principle points to God the Father as the efficient cause of all other things, Its wisdom to the Son, the Eternal Art, as the formal cause, and Its goodness to the Spirit as the final cause, through “appropriation in the manner of speech” (appropriate loquendo). Bonaventure insisted that, properly speaking, the One God, the First Principle in Its perfect perichoresis with Its Art and Spirit, is the efficient, formal, and final cause of all things, but he also insisted that it is proper to attribute particular properties to each of the Divine Persons to distinguish them from one another. Nevertheless, he admitted that the analogical argument remained inconclusive. The rational soul, without the light of divine revelation, is able to realize that creation testifies to the power, wisdom, and goodness of the First Principle, but it is not able to realize that the power, wisdom, and goodness of that Principle testify to Its existence in Three Persons.
The second step consists in the soul’s contemplation of the epistemological process, its apprehension, delight, and judgment of the sensory species of the physical realm of being. Its contemplation of this process reveals that it depends on the presence of the “light” of the Eternal Art in its cooperative effort to discern certain truth. The epistemological argument runs as follows: the soul possesses certain truth, but it is fallible and the object of its knowledge mutable, so it must rely on the “light” of the Eternal Art to render itself infallible and the object of its knowledge immutable. But the thrust of this step is the derivation of the first of three analogies between the epistemological process and distinct types of mysticism. If the soul looks at the “light” of the Eternal Art, so to speak, rather than the intelligible forms post rem it illumines, then the epistemological process becomes the occasion for an epistemic mysticism in which the soul apprehends, delights in, and judges a Divine Species of the Eternal Art, although not the Eternal Art Itself, in its epistemological union with the Eternal Art.
The third step consists in the soul’s contemplation of itself as an image (imago) of the First Principle in its higher faculties of memory, intellect, and will. These, too, testify to the power of the First Principle as the efficient cause that created it ex nihilo, the wisdom of the First Principle as the formal cause that formed it, and the goodness of that Principle as the final cause that leads it to its proper end in its union with Itself. But the analogical argument is more prominent on this step. The rational soul is one substance (ousia) that consists of three distinct faculties, memory, intellect, and will, and this suggests that the First Principle, which is one in substance, consists of three distinct persons, Father, Son, and Spirit. But again, the analogical argument remains inconclusive without the benefit of the light of revelation.
The fourth step consists in the soul’s contemplation of its moral reformation into a more perfect image or similitude (similitudo) of the First Principle through its progress on the triplex via of purgation, illumination, and perfection in its practice of the virtues. Bonaventure insisted that its moral reformation depends on the presence of the Eternal Art in the person of Christ as a moral principle, similar to its dependence on the Eternal Art as an epistemological principle, to motivate and guide its pursuit of perfection. But the thrust of this step is the derivation of the second of three analogies between the epistemological process and distinct types of mysticism. The soul’s progress along the triplex via restores its “spiritual” senses. It is able to see, hear, smell, taste, and touch the Divine Species of the Eternal Art in the form of the mystical presence of Christ, delight in that Species, and judge the reasons for its delight in a type of nuptial mysticism, like “the Bride in the Song of Solomon,” Bonaventure explains, who “rests wholly on her beloved” (Song of Solomon 8:5)—a thinly veiled reference to the intimacy of sexual union.
The fifth step consists in the soul’s direct contemplation of the First Principle as Being Itself through a careful analysis of the propositions of Bonaventure’s reformulation of the ontological argument. The concept of being falls into three initial categories: non-being, being-in-potency, and being-in-act. The concept of non-being is a privation of being and presupposes the concept of being. The concept of being-in-potency presupposes the concept of being-in-act. In turn, the concept of being-in-act depends on the concept of a “pure” act of being without potency, and this final concept is being itself (ipsum esse). But this “pure” act of being does not fall within the category of the physical realm of being, “which is mixed with potency”. It does exist within the intelligible realm of being, but not entirely so. If it existed in the rational soul and only in the soul, it would exist only as a concept, and thus possess “only a minimal degree of being”. And so, per the process of elimination, being itself is the Divine Being.
Bonaventure extended this ontological argument on the sixth step to provide rational justification for the theological doctrine of the One God in Three Persons. He began with the Neo-Platonic concept of the One (to Hen) as the Self-Diffusive Good. He derived this definition principally from the Neo-Platonic tradition, particularly Dionysius, for whom the “Good” was the perfect and preeminent name of God, the name that subsumed all other names (Dionysius, Divine Names 3.1 and 4.1-35). But he also derived it from his notion of the transcendental properties of being that “transcended” the traditional Peripatetic division of things into the categories of substance and accident and thus applied to all beings, physical, intelligible, and divine. He identified three and only three of these “highest notions” of being: unity, truth, and goodness—although he listed others, notably beauty, as second, third, or fourth order properties of being (Aertsen, “Beauty in the Middle Ages”). So, the Divine Being Itself, the highest Being, is also the highest unity, truth, and goodness. The good is self-diffusive per definitionem and the highest good, the most self-diffusive—a proposition he inherited from the Neo-Platonists. Thus, Divine Being Itself diffuses Itself in a plurality of Divine Hypostases, God the Father, Son, and Spirit.
Bonaventure brought his account of the soul’s ascent to its proper end in a direct encounter with the First Principle in Itself, in Its perfect perichoresis with Its Eternal Art and Spirit. He stands in a long tradition of philosophers who had attempted to provide a description of that mystical experience: Plato, Plotinus, Dionysius, and Bonaventure’s immediate predecessors, Hugh, Richard, and Thomas Gallus of the School of St. Victor. All of them have fallen short—perhaps necessarily so. Bonaventure began his attempt with an analogy of the epistemological process, the apprehension, delight, and judgment of the sensory species, but he deliberately undermined his own effort. He relied on two rhetorical devices he derived from Dionysius’ Mystical Theology to do so. The first is a series of denials of the soul’s intellectual capabilities that he drew from Dionysius’ practice of negative theology: the soul sees, but it does so in a dark light; it hears, but in the silence of secrets whispered in the dark; it learns, but it learns in ignorance. The second is a series of metaphors: the fire of the affections of the will; the blindness of the intellect and its slumber; the hanging, crucifixion, and death of the soul’s cognitive faculties in its inability to comprehend the incomprehensible.
Bonaventure’s rhetoric, similar to the excess of Plato, Plotinus, and others in the same tradition, has supported a wide range of interpretations (McGinn, Flowering, 189-205). Some scholars emphasized the cognitive dimensions of the soul’s contemplation of the First Principle even if the object of its vision exceeded its cognitive capabilities: a vision, so to speak, of a light so bright that it blinded the intellect, which then seemed to see nothing. Others emphasized the affective dimensions of the experience: the soul’s contemplation of the First Principle is a type of experiential knowledge in which the affections of the will outpace the intellect in “that peace which surpasses all understanding”. Still others disengaged the rational faculties of the soul from the experience entirely.
McGinn laid the groundwork for the current consensus, which argues for a mean between these extremes. The soul’s cognitive faculties remain intact, but the object of their contemplation exceeds their capabilities. The soul knows, but its knowledge is experiential, not propositional. It may even strive to know in the proper, propositional sense of the term—after all, it possesses the inclination to do so. But it fails. It knows the First Principle in Its eternal perichoresis with Its Art and Spirit in the sense that it experiences the real presence of that Principle. But it cannot apprehend that Principle; it cannot abstract an intelligible species of that Principle; it cannot imagine, compound, divide, estimate, or remember that Principle. Nevertheless, it experiences the immediate presence of that Principle, Its Art, and Its Spirit, a presence that remains forever inexplicable—an experience that ignites its affections to an unfathomable degree of intensity. “Let it be, let it be”, Bonaventure pleaded as he brought his account of the soul’s ascent to a close. “Amen” (Bonaventure, Itinerarium 7.6).
8. References and Further Reading
a. Critical Editions
Doctoris Seraphici S. Bonaventurae Opera Omnia. 10 vols. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
This is the current standard critical edition of Bonaventure’s works. Since its publication, scholars have determined that a small portion of its contents is spurious. See A. Horowski and P. Maranesi, listed below, for recent discussions of the question.
Breviloquium. In Opuscula Varia Theologica, 199-292. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Breviloquium is a short summary of Bonaventure’s philosophical theology.
Christus unus omnium magister. In Opuscula Varia Theologica, 567-574. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
Christ the One Master of All is an academic sermon that contains a discussion of Bonaventure’s theory of the forms and divine illumination.
Collationes in Hexaëmeron. In Opuscula Varia Theologica, 327-454. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Collations on the Six Days of Creation is Bonaventure’s final text in philosophical theology and one of his most important. It remained unfinished at the time of his death. This reportatio of the Collationes contains a harsh criticism of Aristotle and the radical Aristotelians. See also Delorme’s edition below.
Collationes in Hexaëmeron et Bonaventuriana Quaedam Selecta. Edited by F. Delorme. In Bibliotheca Franciscana Scholastica Medii Aevi. Vol. 8. Quaracchi: Collegium S. Bonaventurae, 1934.
Delorme based his edition of the Collationes in Hexaëmeron on a single manuscript. It contains a less harsh criticism of Aristotle and the radical Aristotelians. Scholars remain divided on the question of which reportatio is more authentic.
Commentarius in librum Ecclesiastae. In Commentarii in Sacram Scripturam, 1-103. S. Bonaventurae Opera Omnia. Vol. 6. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
Bonaventure’s Commentary on the Book of Ecclesiastes contains a discussion of the concept of non-being and the inherent nothingness of the world.
Commentarius in I Librum Sententiarum: De Dei Unitate et Trinitate. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 1. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The First Book of the Commentary on the Sentences is Bonaventure’s most extensive discussion of his philosophy and philosophical theology of the One God, the First Principle, in Three Persons.
Commentarius in II Librum Sententiarum: De Rerum Creatione et Formatione Corporalium et Spiritualium. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 2. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Second Book of the Commentary on the Sentences contains Bonaventure’s most extensive discussion on creation.
Commentarius in IV Librum Sententiarum: De Doctrina Signorum. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 4. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Fourth Book of the Commentary on the Sentences, On the Sacraments, contains Bonaventure’s exhaustive treatise on sacramental theology, but it also includes passages on his philosophy and philosophical psychology of the human person.
Itinerarium mentis in Deum. In Opuscula Varia Theologica, 293-316. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Itinerarium is Bonaventure’s treatise on the soul’s ascent into God and his most popular work.
Lignum vitae. In Opuscula Varia ad Theologiam Mysticam, 68-87. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 8. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Tree of Life is Bonaventure’s innovative life of Christ and an often neglected source for his virtue theory.
Quaestiones disputatae de scientia Christi. In Opuscula Varia Theologica, 1-43. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Disputed Questions on the Knowledge of Christ contains information on philosophical psychology and epistemology. The fourth question is a detailed discussion of divine illumination.
Quaestiones disputatae de mysterio Ss. Trinitatis. In Opuscula Varia Theologica, 45-115. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Disputed Questions on the Mystery of the Trinity contains a detailed series of debates on the existence and nature of the First Principle. The first article of each question is philosophical, the second theological.
Quaestiones disputatae de perfectione evangelica. In Opuscula Varia Theologica, 117-198. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Disputed Questions on Evangelical Perfection is an important text in moral philosophy and philosophical theology.
Opusculum de reductione artium ad theologiam. In Opuscula Varia Theologica, 317-326. S. Bonaventurae Opera Omnia. Vol. 5. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
On the Reduction of the Arts to Theology contains a discussion of philosophy and its distinction from philosophical theology.
De triplici via. In Opuscula Varia ad Theologiam Mysticam, 3-27. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 8. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
The Triple Way is Bonaventure’s treatise on spiritual and moral reformation.
Doctoris Seraphici S. Bonaventurae Opera Theologica Selecta. 5 vols. Quaracchi: Collegium S. Bonaventurae, 1934-1965.
This is a smaller edition of the Commentary on the Sentences and three short works, the Breviloquium, the Itinerarium, and the De reductione artium ad theologiam. The text is complete, but the critical apparatus is significantly reduced.
Legenda Maior. In Analecta Franciscana 10 (1941): 555-652.
This is the revised critical edition of the Longer Life of St. Francis, and another often neglected source for Bonaventure’s virtue theory.
b. Translations into English
Bonaventure: The Soul’s Journey into God, The Tree of Life, The Life of St. Francis. Translated by E. Cousins. New York: Paulist Press, 1978.
Cousins’ translations of these short but influential works are refreshingly dynamic yet faithful.
“Christ, the One Teacher of All”. In What Manner of Man: Sermons on Christ by Bonaventure, 21-55. Translated by Z. Hayes. Chicago: Franciscan Herald Press, 1974.
Breviloquium. Edited by D. V. Monti. Works of Bonaventure. Vol. 9. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2005.
Collations on the Hexaemeron. Edited by J. M. Hammond. Works of Bonaventure. Vol. 18. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2018.
Commentary on Ecclesiastes. Edited by R. J. Karris and C. Murray. Works of Bonaventure. Vol. 7. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2005.
Commentary on the Sentences: The Philosophy of God. Edited by R. E. Houser and T. B. Noone. Works of Bonaventure. Vol. 16. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2013.
This rather large volume contains only a small selection of texts from the Commentary on the First Book of the Sentences.
Disputed Questions on Evangelical Perfection. Edited by R. J. Karris and T. Reist. Works of Bonaventure. Vol. 13. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2008.
Disputed Questions on the Knowledge of Christ. Edited by Zachary Hayes. Works of Bonaventure. Vol. 4. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1992.
Disputed Questions on the Mystery of the Trinity. Edited by Z. Hayes. Works of Bonaventure. Vol. 3. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1979.
Itinerarium Mentis in Deum. Edited by P. Boehner and Z. Hayes. Works of Bonaventure. Vol. 2. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2002.
On the Reduction of the Arts to Theology. Edited by Z. Hayes. Works of Bonaventure. Vol. 1. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1996.
The Threefold Way. In Writings on the Spiritual Life, 81-133. Edited by F. E. Coughlin. Works of Bonaventure. Vol. 10. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2006.
Works of Bonaventure. 18 vols. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1955-.
The Franciscan Institute of St. Bonaventure University, a major research center for Franciscan Studies, began to publish this series in 1955. The pace of publication has increased in recent years, but the series remains incomplete—Bonaventure authored a vast amount of material. This is the standard series of translations in English.
c. General Introductions
Bettoni, E. S. Bonaventura. Brescia: La Scuola, 1944. Translated by A. Gambatese as St. Bonaventure (Notre Dame, IN: University of Notre Dame Press, 1964).
Bettoni’s St. Bonaventure is the best short work on Bonaventure’s life and thought. Unfortunately, it is out of print.
Bougerol, G. Introduction à Saint Bonaventure. Paris: J. Vrin, 1961. Revised 1988. Translated by J. de Vinck as Introduction to the Works of St. Bonaventure (Paterson, NJ: St. Anthony Guild Press, 1963).
Bougerol’s Introduction is an insightful commentary on the literary genres of Bonaventure’s works. Note that the English translation is of the first French edition, not the second.
Cullen, C. M. Bonaventure. Oxford: Oxford University Press, 2006.
Cullen’s Bonaventure is the most recent comprehensive introduction to his life and thought.
Delio, I. Simply Bonaventure. Hyde Park: New City Press, 2018.
Delio’s Simply Bonaventure, now in its second edition, is intended for those with little or no background in medieval philosophy or theology.
Gilson, É. La philosophie de saint Bonaventure. Paris: J. Vrin, 1924. Revised 1943. Translated by I. Trethowan and F. J. Sheed as The Philosophy of St. Bonaventure (London: Sheed and Ward, 1938. Reprinted 1940, 1965).
Gilson’s Philosophy of St. Bonaventure is foundational. He was the first to insist on Bonaventure’s careful distinction between philosophy and theology and to identify Bonaventure as the principal representative of Christian Neo-Platonism in the Middle Ages. Note that the English translation is of an earlier edition.
d. Studies
Aertsen, J. A. “Beauty in the Middle Ages: A Forgotten Transcendental?” Medieval Philosophy and Theology 1 (1991): 68-97.
Aertsen, J. A. Medieval Philosophy as Transcendental Thought: From Philip the Chancellor to Francisco Suárez. Leiden: Brill, 2012.
Aertsen’s is an exhaustive study of the concept of the transcendentals with reference to Bonaventure and other philosopher-theologians of the later Middle Ages. He refutes the widespread assumption that Bonaventure listed beauty as a transcendental on a par with the one, the true, and the good.
Baldner, S. “St. Bonaventure and the Demonstrability of a Temporal Beginning: A Reply to Richard Davis.” American Catholic Philosophical Quarterly 71 (1997): 225-236.
Baldner, S. “St. Bonaventure and the Temporal Beginning of the World.” New Scholasticism 63 (1989): 206-228.
Baldner’s two pieces are among the more important recent discussions of the question. See also Dales, Davis, and Walz.
Bissen, J. M. L’exemplarisme divin selon saint Bonaventure. Paris: Vrin, 1929.
This is the foundational study of Bonaventure’s exemplarism and remains unsurpassed in breadth. See also Reynolds.
Bonnefoy, J. F. Une somme Bonaventurienne de Théologie Mystique: le De Triplici Via. Paris: Librairie Saint-François, 1934.
This is the seminal analysis of Bonaventure’s treatise on the soul’s moral reformation.
Bowman, L. “The Development of the Doctrine of the Agent Intellect in the Franciscan School of the Thirteenth Century.” The Modern Schoolman 50 (1973): 251–79.
Bowman provides one of the few extensive treatments of Bonaventure’s doctrine of the agent intellect.
Burr, D. The Spiritual Franciscans: From Protest to Persecution in the Century after Francis of Assisi. University Park, PA: The Pennsylvania State University Press, 2001.
The first chapter provides a summary of the state of the conflict between the Spiritual and Conventual Franciscans during Bonaventure’s tenure as Minister General.
Cullen, C. M. “Bonaventure’s Philosophical Method.” In A Companion to Bonaventure, 121-163. Edited by J. M. Hammond, J. A. Wayne Hellmann, and J. Goff. Leiden: Brill, 2014.
Cullen provides a precise summary of Bonaventure’s philosophical method and of his standing as a philosopher.
Dales, R. C. Medieval Discussions of the Eternity of the World. Leiden: Brill, 1990.
Dales locates Bonaventure in the larger stream of thought on this question. A companion volume includes the relevant Latin texts.
Davis, R. “Bonaventure and the Arguments for the Impossibility of an Infinite Temporal Regression.” American Catholic Philosophical Quarterly 70 (1996): 361-380.
Delio, I. A Franciscan View of Creation: Learning to Live in a Sacramental World. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2003.
Delio derives her sacramental view of creation from a careful consideration of the thought of Francis, Clare, Bonaventure, and Scotus.
Gendreau, B. “The Quest for Certainty in Bonaventure.” Franciscan Studies 21 (1961): 104-227.
Gendreau first proposed the current solution to the problem of Bonaventure’s theory of divine illumination. Compare with Speer.
Horowski, A. “Opere autentiche e spurie di San Bonaventura.” Collectanea Franciscana 86 (2016): 461-606.
This is the most recent assessment of the current state of the critical edition of Bonaventure’s works. See also Maranesi.
Houser, R. E. “Bonaventure’s Three-Fold Way to God.” In Medieval Masters: Essays in Honor of E. A. Synan, 91-145. Houston: University of St. Thomas Press, 1999.
Houser’s analysis of Bonaventure’s arguments for the existence of God emphasizes their logical structure and highlights Bonaventure’s command of the formal logic of the Aristotelian tradition.
Johnson, T. J., K. Wrisley-Shelby, and M. K. Zamora, eds. Saint Bonaventure: Friar, Teacher, Minister, Bishop: A Celebration of the Eighth Centenary of His Birth. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2020.
A collection of papers delivered at a major conference held at St. Bonaventure University to celebrate the eighth centenary of Bonaventure’s birth. It provides a thorough overview of the current state of research into Bonaventure’s philosophy, philosophical theology, and mysticism.
Lang, H. “Bonaventure’s Delight in Sensation.” New Scholasticism 60 (1986): 72-90.
Lang was the first to highlight the role of delight in Bonaventure’s account of the epistemological process.
Malebranche, N. De la recherche de la vérité. 1674-1675. Translated by T. M. Lennon and P. J. Olscamp as The Search after Truth (Cambridge: Cambridge University Press, 1997).
Malebranche presented his famous, or perhaps infamous, doctrine of the vision in God in Book 3. His interpretation of Bonaventure’s epistemology, however, was incorrect. See Gendreau.
Maranesi, P. “The Opera Omnia of St. Bonaventure: History and Present Situation.” In A Companion to Bonaventure, 61-80. Edited by J. M. Hammond, J. A. Wayne Hellmann, and J. Goff. Leiden: Brill, 2014.
This is an indispensable assessment of the current state of the critical edition of Bonaventure’s works.
McEvoy, J. “Microcosm and Macrocosm in the Writing of St. Bonaventure.” In S. Bonaventura 1274-1974, 2:309-343. Edited by F. P. Papini. Quaracchi: Collegium S. Bonaventurae, 1973.
McEvoy places this theme in its wider context.
McGinn, B. “Ascension and Introversion in the Itinerarium mentis in Deum.” In S. Bonaventura 1274-1974, 3:535-552. Edited by F. P. Papini. Quaracchi: Collegium S. Bonaventurae, 1973.
McGinn, B. The Flowering of Mysticism. New York: Crossroad, 1998.
McGinn provides a thorough introduction to the structure and content of Bonaventure’s Itinerarium in the context of the mystical practices of the later Middle Ages, with particular attention to the cognitive dimensions—or lack thereof—of the soul’s ecstatic union with the First Principle.
McKenna, T. J. Bonaventure’s Aesthetics: The Delight of the Soul in Its Ascent into God. London: Lexington Books, 2020.
This is the first comprehensive analysis of Bonaventure’s philosophy and philosophical theology of beauty since Balthasar’s Herrlichkeit (1961).
Monti, D. V. and K. W. Shelby, eds. Bonaventure Revisited: Companion to the Breviloquium. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2017.
A helpful commentary on Bonaventure’s own summary of his philosophical theology in the Breviloquium.
Noone, T. “Divine Illumination.” In The Cambridge History of Medieval Philosophy, I: 369-383. Edited by R. Pasnau. Cambridge: Cambridge University Press, 2010.
Noone provides a helpful overview of the doctrine of divine illumination in the medieval west.
Noone, T. “St. Bonaventure: Itinerarium mentis in Deum.” In Debates in Medieval Philosophy: Essential Readings and Contemporary Responses, 204-213. Edited by J. Hause. London: Routledge, 2014.
Noone provides insight into Bonaventure’s sources for his analysis of the epistemological process.
Pansters, K. “Bonaventure and Virtue.” In Saint Bonaventure Friar, Teacher, Minister, Bishop: A Celebration of the Eighth Centenary of His Birth, 209-225. Edited by T. J. Johnson, K. Wrisley Shelby, and M. K. Zamora. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2021.
Pansters provides an insightful overview of the current state of research on Bonaventure’s virtue theory.
Pegis, A. C. “The Bonaventurean Way to God.” Medieval Studies 29 (1967): 206-242.
Pegis, an expert on Thomas Aquinas, was one of the first to recognize and clearly distinguish Bonaventure’s approach from Aquinas’.
Quinn, J. F. The Historical Constitution of St. Bonaventure’s Philosophy. Toronto: Pontifical Institute of Medieval Studies, 1973.
Quinn’s Historical Constitution includes a detailed historiographical essay on early approaches to Bonaventure’s thought. But in spite of its title, he devotes most of the volume to an extensive, if somewhat controversial, analysis of Bonaventure’s epistemology.
Reynolds, P. L. “Bonaventure’s Theory of Resemblance.” Traditio 49 (2003): 219-255.
Reynolds’ is an analytic approach to Bonaventure’s theory of exemplarity that highlights Bonaventure’s command of formal logic.
Schaeffer, A. “Corrigenda: The Position and Function of Man in the Created World According to Bonaventure.” Franciscan Studies 22 (1962): 1.
Schaeffer, A. “The Position and Function of Man in the Created World According to Bonaventure.” Franciscan Studies 20 (1960): 261-316 and 21 (1961): 233-382.
Schaeffer’s remains one of the most detailed analyses of Bonaventure’s philosophy and philosophical psychology of the human person.
Schlosser, M. “Bonaventure: Life and Works.” In A Companion to Bonaventure, 7-59. Edited by J. M. Hammond, J. A. Wayne Hellmann, and J. Goff. Leiden: Brill, 2014.
Schlosser considers the current state of research on Bonaventure’s biography.
Seifert, J. “Si Deus est Deus, Deus est: Reflections on St. Bonaventure’s Interpretation of St. Anselm’s Ontological Argument.” Franciscan Studies 52 (1992): 215-231.
Seifert was the first to recognize the full force of Bonaventure’s version of the argument.
Speer, A. “Bonaventure and the Question of a Medieval Philosophy.” Medieval Philosophy and Theology 6 (1997): 25-46.
Speer provides a candid discussion of the question. See also Cullen on Bonaventure’s philosophical method.
Speer, A. “Illumination and Certitude: The Foundation of Knowledge in Bonaventure.” American Catholic Philosophical Quarterly 85 (2011): 127–141.
Speer provides further insight into Bonaventure’s doctrine of divine illumination. See also Gendreau.
Tillich, P. Systematic Theology. 3 vols. Chicago: The University of Chicago Press, 1973.
Tillich acknowledges his debt to the mystical aspect of Bonaventure’s doctrine of divine illumination in the introduction to the first volume of the series.
Walz, M. D. “Theological and Philosophical Dependencies in St. Bonaventure’s Argument against an Eternal World and a Brief Thomistic Reply.” American Catholic Philosophical Quarterly 72 (1998): 75-98.
Animism is a religious and ontological perspective common to many indigenous cultures across the globe. According to an oft-quoted definition from the Victorian anthropologist E. B. Tylor, animists believe in the “animation of all nature”, and are characterized as having “a sense of spiritual beings…inhabiting trees and rocks and waterfalls”. More recently, ethnographers and anthropologists have moved beyond Tylor’s initial definition, and have sought to understand the ways in which indigenous communities, in particular, enact social relations between humans and non-human others in a way which apparently challenges secular, Western views of what is thought to constitute the social world. (This new approach in anthropology is sometimes called the “new animism”.) At a minimum, animists accept that some features of the natural environment such as trees, lakes, mountains, thunderstorms, and animals are non-human persons with whom we may maintain and develop social relationships. Additionally, many animist traditions regard features of the environment to be non-human relatives or ancestors from whom members of the community are descended.
Animism, in some form or other, has been the dominant religious tradition across all human societies since our ancestors first left Africa. Despite the near ubiquity of animistic beliefs and practices among indigenous peoples of every continent, and despite the crucial role of animism in the early emergence and development of human religious thought, contemporary academic philosophy of religion is virtually silent on the subject. This article outlines some key ideas and positions in the current philosophical and social scientific discourse on animism.
1. What is Animism?
Animist religious traditions have been particularly prevalent among hunter-gatherer societies worldwide. A variety of different and conflicting religious traditions across the globe have been labeled “animist”. So, animism is not a single religious tradition, but is instead a category to which various differing traditions appear to belong. Just as “theism” is a term that extends to cover any belief system committed to the existence of a god, “animism” is a term that extends to cover any belief system satisfying the appropriate definition (such as the classical Tylorian definition given in the introduction to this article). Note that the terms “theism” and “animism” are not mutually exclusive: an animist may or may not also be a theist. There is some dispute, particularly among anthropologists, as to whether there is a single definition that works to draw the wide variety of traditions typically considered as animist under a single umbrella.
Contemporary social scientific discussion of animism has witnessed a renaissance since the late twentieth century, and this has led different authors to consider a range of alternative ways in which we might conceive of the characteristic qualities of animist thought. Some noteworthy recent contributors to this debate are Nurit Bird-David, Philippe Descola, Tim Ingold, Graham Harvey, and Stewart Guthrie. Before surveying a few of the conceptions currently discussed in this literature, it is worthwhile to be clear on what is not meant by the term “animism”.
a. Hylozoism, Panpsychism, and Vitalism
When attempting to define “animism”, it is important to first disentangle the concept from three closely related philosophical doctrines: Hylozoism, Panpsychism, and Vitalism. Animism is often conflated with these three doctrines as scholarly concepts of animism have traditionally drawn from the work of Tylor, and particularly from his conception of animism as a belief in the “animation of all nature,” a doctrine which he also labels “universal vitality”. Phrases such as these, with their allusions to a “world consciousness”, have given rise to the mistaken impression that animism is a doctrine about the entire universe being fundamentally alive or sentient or filled with soul.
Hylozoism is the view that the universe is itself a living organism. It is a doctrine often attributed (although erroneously; see Fortenbaugh 2011, 63) to the third director of Aristotle’s Lyceum, Strato of Lampsacus, who argued that motion in the universe was explicable by internal, unconscious, naturalistic mechanisms, without any need for an Aristotelian prime mover (ibid., 61). This characterization of the universe’s motion as sustained by internal, unconscious mechanisms is seen as analogous to the biological mechanisms and processes sustaining life. However, religious animists typically reject the claim that all things are living, and they also reject the claim that the universe as a whole is a living being. Typically, the animist takes particular features of the natural world to be endowed with personhood or some form of interiority, often having their own cultural lives and communities akin to those of human beings.
Panpsychism is the view that “mentality is fundamental and ubiquitous in the natural world” (Goff and others 2017, §2.1). Mind is, on this view, a building block of the universe. Unlike the animist, the panpsychist does not take features of the natural world to have a fully-fledged interior or cultural life akin to that of human beings. Additionally, it is not characteristic of animism to take mental properties to be fundamental to the universe or to be distributed in all systems or objects of a given type. For example, the animist need not accept that all rocks have an interior life, but only that some particular rocks do (perhaps a rock with unusual features, or one which moves spontaneously or unpredictably).
Vitalism is the out-of-favour scientific view that biological phenomena cannot be explained in purely mechanical terms and that a complete explanation of such phenomena will require an appeal to spiritual substances or forces. Proponents of the view included Francis Glisson (1597-1677), Xavier Bichat (1771-1802), and Alessandro Volta (1745-1827). Vitalists hold that all living things share in common a spiritual quality or fluid (famously dubbed the “élan vital” by Henri Bergson). With this élan vital in hand, it was thought that phenomena that appeared recalcitrant to purely mechanical explanation (for example, the blossoming of flowers, the reproduction of worms, the musings of humans, the growth of algae, and so forth) could be explained in other, more spiritual terms. Animists, unlike vitalists, need not be committed to the existence of any sort of metaphysically special spirit or soul phenomena. Additionally, animists very often take non-biological phenomena (rivers, winds, and the like) to be animate.
b. Modernist Animism
In his Natural History of Religion, David Hume speaks of a tendency for primitive human beings “to conceive all beings like themselves.” Natural phenomena are attributed to “invisible powers, possessed of sentiment and intelligence”, a tendency that can be “corrected by experience and reflection”. For Tylor, “mythic personification” drives the primitive animist to posit souls inhabiting inanimate bodies. In a similar vein, Sigmund Freud writes that the animist views animals, plants, and objects as having souls “constructed on the analogy of human souls” (1950, 76). James Frazer’s Golden Bough is a particularly good example of this modernist conception, referring to animism as “savage dogma” (1900, 171) and describing rites and rituals relating to animism as “mistaken applications” of basic principles of analogy between the human world and the natural world (62). For these writers, animism is understood as a kind of promiscuous dualism and stray anthropomorphism. The animist is committed to a superstitious belief in anthropomorphic spirits, which reside within non-human animals or altogether inanimate objects. It is, on this view, an erroneous position.
Although this position has faced criticism for presenting the animist’s worldview as a kind of mistake (see, for example, Bird-David 1999), similar modernist conceptions of animism persist, particularly in the evolutionary psychology of religion. A notable modern proponent is Stewart Guthrie, who takes animist belief as a problem requiring an explanation. The problem to be explained is why people are so disposed to ascribe agency and personhood to non-agents and non-persons. Setting the problem in this light, Guthrie rejects post-modern and relativist tendencies in the contemporary anthropological literature, which seek to investigate animism as an ontology that differs from, but is not inferior to, naturalistic scientific understandings of the world. This post-modernist approach, Guthrie argues, makes “local imagination the arbiter of what exists,” and thereby abandons many important realist commitments inherent in the scientific project (2000, 107).
Guthrie’s own view is that animistic thinking is the result of an evolutionarily adaptive survival strategy. Animistic interpretations of nature are “failures of a generally good strategy” to perceive agents in non-agentive phenomena. The strategy is generally good since, as he puts it, “it is better for a hiker [for example] to perceive a boulder as a bear than to mistake a bear for a boulder” (2015, 6). If we are mistaken in seeing agents everywhere, the price to pay is small. Whereas if we are not mistaken, the payoff is high. This idea has been developed further by other cognitive scientists of religion, such as Justin Barrett, who accounts for this propensity as resulting from what he calls a hyperactive agency detection device (usually abbreviated to HADD): an innate and adaptive module of human cognition.
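Guthrie’s cost asymmetry can be put in simple decision-theoretic terms (an illustrative gloss, not Guthrie’s own formalism; the symbols below are introduced here only for exposition). Let p_fa and c_fa be the probability and cost of a false alarm (flinching at a boulder), and p_miss and c_miss the probability and cost of a miss (overlooking a bear). Selection favours the agency-biased strategy whenever the expected cost of false alarms is lower than the expected cost of misses:

\[ p_{\mathrm{fa}} \, c_{\mathrm{fa}} < p_{\mathrm{miss}} \, c_{\mathrm{miss}} \]

Since the cost of a wasted flinch is negligible next to the cost of an overlooked predator, the inequality can hold even when false alarms vastly outnumber genuine encounters, which is why the strategy remains “generally good” despite its frequent failures.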
This modernist or positivist view of animism can be contrasted with several post-modernist views, which are surveyed below.
c. Enactivist Animism
Another approach to animism takes it as a kind of non-propositional, experiential state. Tim Ingold characterizes animism as a lived practice of active listening. Animism, he says, is “a condition of being alive to the world, characterized by a heightened sensitivity and responsiveness, in perception and action, to an environment that is always in flux, never the same from one moment to the next” (2006, 10). Borrowing a phrase from Merleau-Ponty, Ingold characterizes the lived experience of animism as “the sense of wonder that comes from riding the crest of the world’s continued birth”. The animist does not so much believe of the world that it contains spooky “nature spirits”; rather, she participates in a natural world, which is responsive and communicative. Animism is not a system to which animists relate, but rather it is immanent in their ways of relating.
On this enactivist account, the crucial thread of animist thinking is not characterized by a belief in spirits or a belief in the intentionality of non-intentional objects. Animist thinking is instead construed as a kind of experience—the living of a particular form of life, one which is responsive and communicative with the local environment, and one which engages with the natural environment as subject, not object. Thus, there is a distinctive and characteristically interpersonal quality of animist phenomenology. The animist’s claim—say, that whales are persons—is not a belief to which she assents, nor is it a hypothesis which she might aim to demonstrate or falsify. On the contrary, the animist does not know that whales are persons, but rather knows how to get along with whales.
This understanding of animism echoes the philosophy of religion of Ludwig Wittgenstein, who famously rejected the notion that the substantive empirical claims of religion should be understood as attempts at objective description or explanation. Instead, religion should be understood as a frame of experience or “world picture”. A similarly permissive approach to animism and indigenous religion has recently been championed by Mikel Burley, who stresses the importance of evaluating competing religious world-pictures according to their own internal criteria and the effects that such world pictures have on the lived experience of their adherents (see Burley [2018] and [2019]).
A similar view can also be found in work outside of the analytic philosophical tradition. Continental philosophers such as Max Horkheimer and Theodor Adorno, for example, argue that the modern scientistic worldview alienates us from our environment and is the cause of widespread disenchantment by way of the “extirpation of animism” (2002, 2). Thus, it is our experience of the world around us which is diminished on the scientistic frame, and this disenchantment can be cured by taking up an animistic frame of experience. Martin Buber is another philosopher who stresses the fundamentally spiritual nature of what he calls the “I-Thou” aspect of experience (a subject-subject relation), which can be contrasted with the “I-It” aspect (a subject-object relation). The pragmatist philosopher William James uses the very same terms in his expression of what is characteristic of a religious perception of the universe: “The universe is no longer a mere It to us, but a Thou” (2000 [1896], 240).
It is through the animist’s experience of the world as fundamentally grounded in interpersonal relations that her experience is characterized as distinct from the Western, naturalistic world picture, in which interpersonal encounters are austerely restricted to encounters between human beings.
d. Animism as Ontology
For many post-modern anthropologists, the purpose of research is understood to be a mediation between different but equally valid constructions of reality or “ontologies”. This modern shift of emphasis is sometimes labeled the “ontological turn” in anthropology. According to theorists in this school, animism should be understood as consisting in a distinct ontology with distinct commitments. Philippe Descola is one writer who characterizes animism as just one competing way among several in which the culturally universal notions of “interiority” (subjective or private states of experience) and “exteriority” (physical states) can be carved up. For Descola, the animist views elements of the external world as sharing a common interiority while differing in external features. This can be contrasted with the naturalist’s worldview, which holds that the world contains beings which are similar in their physicality (being made of the same or similar substances), yet which differ in their interiority. Thus, for the animist, while humans and trees, for example, differ in exteriority, they nevertheless share in common the possession of similar interior states. In ‘animic’ systems, humans and non-humans possess the same type of interiority. Since this interiority is understood to be common to both humans and non-humans alike, it follows that non-humans are understood as having social characteristics, such as respecting kinship rules and ethical codes.
On such an account, the animist takes the interiority of any given creature to differ from human interiority only to the extent that it is grounded in different cognitive and perceptual instruments. A tree, for example, cannot change location at will, and so has an interior life very different from that of a human being or a raven. Nevertheless, trees, humans, and ravens share in common the quality of interiority.
A more radical view in the same vein, dubbed “perspectivism”, is described by Viveiros de Castro, who notes that among various Amerindian indigenous religions, a common interiority is understood to consist in a common cultural life, and it is this common culture which is cloaked in diverse exterior appearances. This view turns the traditional, Western, naturalistic notion of the unity of nature and the plurality of culture on its head. Instead, the unity of culture is a fundamental feature of the animists’ world. In normal conditions, humans see humans as humans and animals as animals; animals, however, do not see themselves as animals. On the contrary, animals may see humans as possessing the exteriority of animals, while viewing themselves as humans. Jaguars, for example, see the blood they drink as beer, while vultures see maggots as grilled fish. They see their fur, feathers, claws, beaks, and so forth as cloaks and jewellery. In addition, they have their own social systems, organized in the same way as human institutions (Viveiros de Castro 1998, 470).
Although a particularly interesting position in its own right, perspectivism seems to apply only to the limited number of Amerindian cultures that are the objects of Viveiros de Castro’s studies, and so it may not serve as an inclusive account that could act as an umbrella for the wide range of traditions which are, on their face, animistic.
Both Descola’s and Viveiros de Castro’s accounts assume that the animist ascribes interiority to non-human animals as well as to non-living things. However, it is unclear whether all or even most putatively animist communities share this view. At least some communities regarded as animist appear to enact social relationships with non-human persons, yet do not appear to be committed to any dualist ontological view according to which non-human persons are actually sentient or have their own unique interior states or souls (for a discussion with reference to indigenous Australian animism, see Peterson 2011, 177).
e. Social-Relational Animism
An increasingly popular view understands animism not as depending upon some abstract notion of interiority or soul, but rather as fundamentally a matter of relationships between human and other-than-human persons. Irving Hallowell, for example, emphasizes an ontology of social relations that holds between the world’s persons, only some of whom are human (1960, 22). Thus, what is fundamental to the animist’s worldview is a commitment to the existence of a broad set of non-human persons. This approach has been championed by Graham Harvey, who summarizes the animist’s belief as the position that “the world is full of persons, only some of whom are human, and that life is always lived in relationship with others” (2005, xi). That is not to say that animists have no concept of objecthood as divorced from personhood, but rather that animist traditions seriously challenge traditional Western views of what sorts of things can count as persons.
A version of this view has been championed by Nurit Bird-David (1999), who takes animism to be a “relational epistemology” in which social relations between humans and non-humans are fundamental to animist ontology. What is fundamental to the animist’s worldview is the subject-subject relation, in contrast to the subject-object relation taken up in a naturalist’s understanding of the world. This in no way hinges on a metaphysical dualism that makes any distinction between spirits/souls and bodies/objects. Rather, this account hinges on a particular conception of the world as coming to be known principally via socialization. The animist does not hypothesize that some particular tree is a person and then socialize accordingly. Instead, one personifies some particular tree as, when, and because one socializes with it. Thus, the commitment to the idea of non-human personhood is a commitment that develops across time and through social interaction.
The animist’s common adoption of kinship terms (such as “grandfather”) for animals and other natural phenomena may also be elucidated on this picture. In earlier writing, Hallowell (1926) describes the extent to which “bear cults” of the circumpolar region carefully avoid general terms, such as “bear” or “animal”, when addressing bears both pre- and post-mortem. Instead, kinship terms are regularly adopted. This can be explained on the assumption that the social role of the kinship term is being invoked (“grandfather”, for example, refers to one who is wise, who deserves to be listened to, who has authority within the social life of the community, and so on). Indeed, in more recent writing, Bird-David asks whether an understanding of animist belief as fundamentally built on the notion of relatives rather than persons may more accurately account for the sense in which the animist relates with non-human others (2017). For the animist, on this revised account, the world does not so much consist in a variety of human and non-human persons, who differ in their species-specific and special forms of community life; rather, the world is composed of a network of human and non-human relatives, and what is fundamental to the animist’s worldview is this network as well as the maintenance of good relations within it.
2. The Neglect of Animism
Unlike theism, animism has seldom been the focus of any sustained critical or philosophical discourse. Perhaps unsurprisingly, where such traditions of critical discourse have flourished, a realist interpretation of animist belief has been received negatively. Seventeenth-century Japanese commentaries on Shinto provide a rare example of such a critical tradition. Writers such as Fukansai Habian, Arai Hakuseki, and Ando Shoeki critically engaged with the mythological and animistic aspects of Shinto, while also illuminating their historical and political subtexts. Interestingly, in his philosophical discourse On Shinto, Habian produces several naturalistic debunking arguments against animism, among which is the argument that the Japanese have developed their peculiar ontology in the same way as other island peoples, who all developed similar mythologies pertaining to a familial relationship with the unique piece of land on which they find themselves. He goes on to argue that the Japanese cannot truly be descendants of the Sun, since in the ordinary course of events, humans beget humans, dogs beget dogs, and so on. Thus, human beings could not actually be born of the Sun, and it follows that the Japanese race could not be the descendants of Amaterasu, the Sun kami. Another critical argument from Habian runs that the heavenly bodies cannot be animate beings, since their movements are too linear and predictable. Were the moon truly animate, he argues, we should witness it zig-zagging as does an ant (Baskind and Bowring 2015, 147). Such debates, naturally, had little impact in the West, where they were largely inaccessible (and in any case considered irrelevant). After the voyages of discovery and during the age of empire, the steady conversion of colonized peoples to the proselytizing theistic traditions of colonial powers added credence to the notion that primitive animist religions had indeed been discarded for more sophisticated religious rivals (particularly Christianity and Islam).
The modernist view (championed by the likes of Hume, Tylor, and Frazer) according to which animism is an unsophisticated, primitive, and superstitious belief was carried over wholesale into twentieth-century analytic philosophy of religion. One might expect that as religious exclusivism waned in popularity in philosophy and popular culture, animism would come to be appreciated as one valid religious perspective among many other contenders. Yet even permissive religious pluralists, such as the philosopher John Hick, denied that primitive animistic traditions count as genuine transformative encounters with transcendental ultimacy or “the Real” (1989, 278). One recent attempt to reconcile religious pluralism and indigenous religious traditions can be found in the work of Mikel Burley, although it remains to be seen what impact this approach will have on the field.
Other philosophers of religion (such as Kevin Schilbrack [2014] and Graham Oppy [2014]) argue that the philosophy of religion needs to critically engage with a greater diversity of viewpoints and traditions, which would include animism, as well as ancestor worship, shamanism, and the like. Of course, it is not incumbent on these philosophers to celebrate animism. But it is important that the field dubbed “philosophy of religion” engage with religion as a broad and varied human phenomenon. The cursory dismissal that animism receives within the discipline is, apparently, little more than a hangover of colonial biases.
The view that animist traditions fail to compete with the “great world religions” remains surprisingly pervasive in mainstream philosophy of religion. A reason for this may have to do with a prevailing conception of animist traditions as having no transformative or transcendental aspects. They are immanentist religions, which seldom speak of any notion of salvation or liberation as a central religious aim. Instead, there is a focus on the immediate needs of the community and on the good working relationships which stand between human persons and the environment. Because animists are immanentists, their traditions are seen as failing to lead believers to the ultimate religious goal: salvation. However, it is clear enough that there are no non-circular grounds on which to base this appraisal. Why would we judge transcendentalist religions superior to, or more efficacious than, immanentist ones, unless we were already committed to the view that the ultimate goal of religion is salvation?
3. Public Arguments for Animism
Some philosophical arguments can be mounted in support of animism. Some of these arguments hinge on evidence which is publicly available (call them “public arguments”). Others may hinge on what is ultimately private or person-relative evidence (call them “private arguments”). Two closely related public arguments may be proffered in support of animism:
a. Argument from Innateness
Within the field of psychology, it has been observed that children have a “tendency to regard objects as living and endowed with will” (Piaget 1929, 170). Young children are more inclined to ascribe agency to what most adults regard as inanimate objects. Evidence suggests that this tendency to attribute agency decreases markedly between three and five years of age (Bullock 1985, 222), and it is not the result of training by caregivers. Implicit in much of the psychological research is the idea that the child’s perception of widespread agency is naive and is corrected on the path of development to adulthood, an argument already stated by the Scottish Enlightenment philosopher Thomas Reid in the eighteenth century. Yet it is unclear on what grounds, apart from our pre-existing naturalistic commitments, we might base this appraisal.
Against the view that childhood animism is corrected by experience, David Kennedy writes that the shift towards naturalist modernism has left the child’s animist commitments in a state of abandonment. Kennedy asks somewhat rhetorically: “Do young children, because of their different situation, have some insight into nature that adults do not? Does their ‘folly’ actually represent a form of wisdom, or at least a philosophical openness lost to adults, who have learned, before they knew it, to read soul out of nature?” (Kennedy 1989). The idea that childhood animism is corrected by experience is the natural consequence of a commitment to a modernist conception of animism, but it would be a harder position to maintain according to the alternative conceptions surveyed above.
b. Argument from Common Consent
The traditional argument from common consent runs that because the majority of religious believers are theists, theism is probably true. A revised common consent argument may be launched, however, which runs that since separate and isolated religious communities independently agree to the proposition that features of the natural world (such as rocks, rivers, and whales) are persons, animism is probably true (Smith 2019). This argument relies on the social epistemic claim that independent agreement is prima facie evidence for the truth of a proposition. A similar argument supporting ancestor worship can be found in a recent article by Thomas Reuter (2014). Moreover, it is argued that since the widespread distribution of theists has been brought about by the relatively recent proselytization of politically disempowered peoples, such widespread agreement is not compelling evidence for the truth of theism. Indeed, even contemporary defenders of the common consent argument for the existence of God accept that independent agreement is stronger evidence for the truth of some proposition than agreement generated by some other means, such as “word of mouth” or indoctrination (see, for example, Zagzebski [2012, 343]). It would seem, then, that even on their own terms, contemporary proponents of the common consent argument for the existence of God ought to consider animism as a serious rival to theism.
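The epistemic weight of independent agreement can also be given a simple probabilistic gloss (a sketch offered for illustration only, not Smith’s own formulation; the symbols are introduced here for exposition). Where A is the animist thesis and T_1, …, T_n are the affirmations of n communities that are conditionally independent given the truth or falsity of A, Bayes’ theorem yields

\[ P(A \mid T_1, \ldots, T_n) = \frac{P(A) \prod_{i=1}^{n} P(T_i \mid A)}{P(A) \prod_{i=1}^{n} P(T_i \mid A) + P(\neg A) \prod_{i=1}^{n} P(T_i \mid \neg A)} \]

If each community is even slightly more likely to affirm A when A is true than when it is false, the posterior probability of A rises with every independent affirmation. Agreement produced by a common cause, such as proselytization or indoctrination, violates the independence condition, so additional affirmations add little; this is precisely the asymmetry the revised argument exploits against theism.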
4. Private Arguments for Animism
Recently, it has been popular to move beyond public defenses of religious belief and toward private or person-relative defenses. Such defenses typically charge that believers are warranted to accept their religious beliefs, even if they lack compelling discursive arguments or public evidence that their views are reasonable to believe or probably true. Alvin Plantinga’s Warranted Christian Belief is the best-known work in this vein.
If such defenses are not inherently epistemologically suspicious, then it remains open for the animist to argue that while there may be no overwhelmingly convincing arguments for animism, animist beliefs are nevertheless internally vindicated according to the standards that animists themselves hold, whatever those standards might be. It could be argued that animist belief is properly basic in the same way that Plantinga takes theistic belief to be (Plantinga 1981). In addition, it may be argued that animist beliefs are not defeated by any external challenges (Smith [2019], for example, gives rebuttals to evolutionary debunking arguments of animism). Thus, it might be argued that animism is vindicated not by external or discursive arguments according to which animism can be shown to be probably true, but by epistemic features internal to the relevant animistic belief system. The animist may argue that although animist belief is thereby justified in a circular manner, this is in no way inferior to the justification afforded to other fundamental beliefs (beliefs about perception, for example), since epistemic circularity is a feature of many of our most fundamental beliefs (William Alston [1986] defends Christian mystical practices by appeal to this kind of tu quoque argument).
5. Pragmatic Arguments for Animism
It is the nature of pragmatic arguments to present some aim as worthwhile, and to recommend some policy conducive to the achievement of that aim. Animist belief has been recommended by some writers as conducive to achieving the following three aims.
a. Environmentalism
An understanding of the environment as rich with persons clearly has implications for conservation, resource management, and sustainability. The scope of human moral decision-making, which may affect the well-being of other persons, is broadened beyond a concern only for human persons. It is on these grounds that philosophers and environmental theorists have argued that a shift towards animism is conducive to the success of worldwide conservation and sustainability efforts. Val Plumwood, for example, argues that an appreciation of non-human others is nothing short of a “basic survival project”. She writes that “reversing our drive towards destroying our planetary habitat” may require “a thorough and open rethink which has the courage to question our most basic cultural narratives” (2010, 47). The argument runs that within a positivist, scientistic paradigm, reverence and appreciation for the natural world is replaced by disregard or even antipathy, according to which the natural world is understood as a mere resource for human consumption.
Much of the appeal of this view appears to hinge on the popular belief that many indigenous societies lived in harmony with nature and that this harmony is a direct result of their understanding of the outside world as an extension of their own society and culture. Against this ecological “noble savage” view, some scholars have charged that this romanticized picture of the animist is unrealistic, as there seems to be at best a tenuous causal connection between traditional animist belief systems and enhanced conservation practices (Tiedje 2008, 97). Any link between animism and environmentalism will also hinge importantly on precisely which natural phenomena are understood to be persons, and whether such persons require much or any respect at all. A tradition that views a fire as a subject and a grassland as a mere object is unlikely to be concerned when the former consumes the latter.
b. Feminism
It has been argued that the liberation of women is a project which cannot be disentangled from the liberation of (and political recognition of) the environment. The objectification of nature is seen as an aspect of patriarchy, which may be undone by the acceptance of an ethics of care which acknowledges the existence of non-human persons. The frame of thinking in which patriarchy flourishes depends upon a system of binary opposition, according to which “nature” is contrasted with “reason”, and according to which anything considered to fall within the sphere of the former (women, indigenous peoples, animals, and so forth) is devalued and systematically disempowered (Mathews 2008, 319). Thus, animism, in so far as it rejects the traditional binary, is perceived as an ally of (a thoroughly intersectional) feminism.
Moreover, the argument is made that animism, as an epistemological world picture, itself constitutes a feminist challenge to patriarchal epistemologies and the conclusions drawn from them. So, whereas monotheistic religious traditions are taken to be grounded in abstract reasoning about ultimate causes and ultimate justice (supposedly “masculine” reasoning), animism is taken to be grounded in intuition and a concern for the maintenance of interpersonal relationships. Likewise, while an austere philosophical naturalism views the external world as fundamentally composed of unconscious, mechanistic, and deterministic causal objects whose real natures are grasped by sense perception and abstract reasoning, an animist epistemology is sensitive to the fundamentality of knowing others, and so shares common cause with feminist epistemological approaches (see, for example, Stuckey [2010, 190]).
c. Nationalism and Sovereignty
Given the intimate connection that the animist draws between communities and their local environment, animism has been endorsed in promoting nationalistic political agendas as well as in reasserting indigenous sovereignty over contested ancestral lands. In New Zealand, for example, legal personhood has been granted to both a mountain range (Te Urewera Act 2014) and a river (Te Awa Tupua (Whanganui River Claims Settlement) Act 2017). In both cases, legal personhood was granted in accordance with the traditional animist commitments of local Māori, and the acts were thereby seen as reasserting indigenous sovereignty over these lands.
Nationalist political movements have also appealed to animism and neo-paganism, particularly in aggressive projects of expansion. Since animist traditions draw strong connections between environment and culture, land and relatedness, there is fertile ground for such traditions to invoke exclusive rights to the use and habitation of the environment. The promotion of Völkisch neo-paganism, for example, was used to motivate Nazi arguments for German Lebensraum, or living space—the expansion into “ancestral” German lands (Kurlander 2017, 3-32). Similarly, Shinto was instituted as the state religion in Japan in 1868 to consolidate the nation after the Meiji Restoration, and it was further invoked to defend notions of racial superiority up to the Second World War. The Japanese people, as purported direct descendants of Amaterasu (the sun kami), were held to have a claim to racial superiority, particularly over other Asian peoples. This claim to Japanese racial supremacy, itself a consequence of animist aspects of Shinto mythology, was often used in defense of the expansion of the Japanese empire throughout the Asia-Pacific region (Holtom 1947, 16).
6. References and Further Reading
Alston, W. (1986) “Epistemic Circularity” Philosophy and Phenomenological Research. 47 (1): pp. 1-30.
Baskind, J. and Bowring, R. (2015) The Myotei Dialogues: A Japanese Christian Critique of Native Traditions. Boston: Brill.
Bird-David, N. (1999) “‘Animism’ Revisited: Personhood, Environment, and Relational Epistemology” Current Anthropology. 40 (S1): pp. S67-S91.
Bird-David, N. (2017) Us, Relatives: Scaling and Plural Life in a Forager World. Oakland: University of California Press.
Buber, M. (1970) I and Thou. Kaufmann, W. (trans.). New York: Charles Scribner’s Sons.
Bullock, M. (1985) “Animism in Childhood Thinking: A New Look at an Old Question” Developmental Psychology. 21 (2): pp. 217-225.
Burley, M. (2019) A Radical Pluralist Philosophy of Religion. New York: Bloomsbury Academic.
Descola, P. (2013) Beyond Nature and Culture. Lloyd, J. (trans.). Chicago: University of Chicago Press.
Fortenbaugh, W. (2011) Strato of Lampsacus: Text, Translation and Discussion. New York: Routledge.
Frazer, J. (1900) The Golden Bough: A Study in Magic and Religion. New York: Macmillan and Co.
Freud, S. (1950) Totem and Taboo. London: Routledge and Kegan Paul.
Guthrie, S. (2015) Faces in the Clouds: A New Theory of Religion. Oxford: Oxford University Press.
Guthrie, S. (2000) “On Animism” Current Anthropology. 41 (1): pp. 106-107.
Hallowell, I. (1926) “Bear Ceremonialism in the Northern Hemisphere” American Anthropologist. 28 (1): pp. 1-175.
Hallowell, I. (1960) “Ojibwa Ontology, Behavior and World View” in Harvey, G. (ed.) (2002) Readings in Indigenous Religions. London: Continuum. pp. 18-49.
Harvey, G. (2005) Animism: Respecting the Living World. New York: Columbia University Press.
Hick, J. (1989) An Interpretation of Religion. New Haven: Yale University Press.
Holtom, D. C. (1963) Modern Japan and Shinto Nationalism (3rd ed.). New York: Paragon Book Reprint Corp. (reprinted by arrangement with University of Chicago Press).
Horkheimer, M. and Adorno, T. (2002) Dialectic of Enlightenment. Stanford: Stanford University Press.
Hume, D. (2008) Dialogues Concerning Natural Religion, and The Natural History of Religion. Gaskin, J. C. A. (ed.). Oxford: Oxford University Press.
Ingold, T. (2006) “Rethinking the Animate, Re-Animating Thought” Ethnos. 71 (1): pp. 9-20.
James, W. (1896) “The Will to Believe” in Stuhr, J. (ed.) (2007) Pragmatism and Classical American Philosophy: Essential Readings and Interpretive Essays (2nd ed.). Oxford: Oxford University Press. pp. 230-243.
Kennedy, D. (1989) “Fools, Young Children, Animism, and the Scientific World Picture” Philosophy Today. 33 (4): pp. 374-381.
Kurlander, E. (2017) Hitler’s Monsters. New Haven: Yale University Press.
Mathews, F. (2008) “Vale Val: In Memory of Val Plumwood” Environmental Values. 17 (3): pp. 317-321.
Oppy, G. (2014) Reinventing Philosophy of Religion: An Opinionated Introduction. New York: Palgrave Macmillan.
Peoples, H., Duda, P. and Marlowe, F. (2016) “Hunter-Gatherers and the Origins of Religion” Human Nature. 27: pp. 261-282.
Peterson, N. (2011) “Is the Aboriginal Landscape Sentient? Animism, the New Animism and the Warlpiri” Oceania. 81 (2): pp. 167-179.
Piaget, J. (1929) The Child’s Conception of the World. New York: Harcourt Brace.
Plantinga, A. (1981) “Is Belief in God Properly Basic?” Noûs. 15 (1): pp. 41-51.
Plumwood, V. (2010) “Nature in the Active Voice” in Irwin, R. (ed.) Climate Change and Philosophy: Transformational Possibilities. London: Continuum. pp. 32-47.
Reid, T. (1975) Inquiries and Essays. Lehrer, K. and Beanblossom, R. E. (eds.). Indianapolis: Bobbs-Merrill.
Reuter, T. (2014) “Is Ancestor Veneration the Most Universal of All World Religions? A Critique of Modernist Cosmological Bias” Wacana. 15 (2): pp. 223-253.
Schilbrack, K. (2014) Philosophy and the Study of Religions: A Manifesto. Oxford: Wiley Blackwell.
Skrbina, D. (2018) “Panpsychism” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/panpsych/ (Accessed 25-May-2018).
Smith, T. (2019) “The Common Consent Argument for the Existence of Nature Spirits” Australasian Journal of Philosophy.
Stuckey, P. (2010) “Being Known by a Birch Tree: Animist Refigurings of Western Epistemology” Journal for the Study of Religion, Nature and Culture. 4 (3): pp. 182-205.
Tiedje, K. (2008) “Situating the Corn Child: Articulating Animism and Conservation from a Nahua Perspective” Journal for the Study of Religion, Nature and Culture. 2 (1): pp. 93-115.
Tylor, E. B. (1929) Primitive Culture: Researches into the Development of Mythology, Philosophy, Religion, Language, Art and Custom. Vol. 1. London: John Murray.
Viveiros de Castro, E. (1998) “Cosmological Deixis and Amerindian Perspectivism” Journal of the Royal Anthropological Institute. 4 (3): pp. 469-488.
Zagzebski, L. (2012) Epistemic Authority: A Theory of Trust, Authority, and Autonomy in Belief. Oxford: Oxford University Press.
In the fifth century B.C.E., Zeno offered arguments that led to conclusions contradicting what we all know from our physical experience—that runners run, that arrows fly, and that there are many different things in the world. The arguments were paradoxes for the ancient Greek philosophers. Because many of Zeno’s arguments turn crucially on the notion that space and time are infinitely divisible, he was the first person to show that the concept of infinity is problematical.
In his Achilles Paradox, Achilles races to catch a slower runner—for example, a tortoise that is crawling in a straight line away from him. The tortoise has a head start, so if Achilles hopes to overtake it, he must run at least to the place where the tortoise presently is, reasons Zeno, but by the time he arrives there, it will have crawled to a new place, so then Achilles must run at least to this new place, and so forth. According to this reasoning, Achilles will never catch the tortoise, says Zeno. Whether Zeno and Parmenides themselves denied motion is very controversial, but subsequent scholars over the centuries assumed this, so it has been the majority position. One minority position is that they were not denying motion, but only showing that their opponents were committed to this.
We cannot escape the Achilles paradox by jumping up from our seat and chasing down a tortoise, nor by saying Zeno’s opponents should have constructed a new argument in which Achilles takes better aim and runs toward a place ahead of the tortoise. Because Zeno was correct in saying Achilles needs to run at least to all those places where the tortoise once was, what is required is an analysis of Zeno’s own argument.
This article explains his ten known paradoxes and considers the treatments that have been offered. In the Achilles Paradox, Zeno assumed distances and durations are infinitely divisible in the sense of having an actual infinity of parts, and he assumed there are too many of these parts for the runner to complete. Aristotle’s treatment says Zeno should have assumed instead that there are only a potential infinity of places to run to, so that at any time the hypothetical division into parts produces only a finite number of parts, and the runner has time to complete all these parts. Aristotle’s treatment was generally accepted until the late 19th century. The current standard treatment, the so-called “Standard Solution,” implies Achilles’s path contains an actual infinity of parts, but Zeno was mistaken to assume this is too many parts for a runner to complete. This treatment employs the mathematical apparatus of calculus, which has proved its indispensability for the development of modern science. The article ends by exploring newer treatments of the paradoxes—and related paradoxes such as Thomson’s Lamp Paradox—that have been developed since the 1950s.
1. Zeno of Elea
a. His Life
Zeno was born in about 490 B.C.E. in the city-state of Elea, now Velia, on the west coast of southern Italy; and he died in about 430 B.C.E. He was a friend and student of Parmenides, who was twenty-five years older and also from Elea. He was not a mathematician.
There is little additional, reliable information about Zeno’s life. Plato remarked (in Parmenides 127b) that Parmenides took Zeno to Athens with him where he encountered Socrates, who was about twenty years younger than Zeno, but today’s scholars consider this encounter to have been invented by Plato to improve the story line. Zeno is reported to have been arrested for taking weapons to rebels opposed to the tyrant who ruled Elea. When asked about his accomplices, Zeno said he wished to whisper something privately to the tyrant. But when the tyrant came near, Zeno bit him, and would not let go until he was stabbed. Diogenes Laërtius reported this apocryphal story seven hundred years after Zeno’s death.
b. His Book
According to Plato’s commentary in his Parmenides (127a to 128e), Zeno brought a treatise with him when he visited Athens. It was said to be a book of paradoxes defending the philosophy of Parmenides. Plato and Aristotle may have had access to the book, but Plato did not state any of the arguments, and Aristotle’s presentations of the arguments are very compressed. The Greek philosophers Proclus and Simplicius commented on the book and its arguments. They had access to some of the book, perhaps to all of it, but it has not survived. Proclus is the first person to tell us that the book contained forty arguments. This number is confirmed by the sixth century commentator Elias, who is regarded as an independent source because he does not mention Proclus. Unfortunately, we know of no specific dates for when Zeno composed any of his paradoxes, and we know very little of how Zeno stated his own paradoxes. We do have a direct quotation via Simplicius of the Paradox of Denseness and a partial quotation via Simplicius of the Large and Small Paradox. In total we know of fewer than two hundred words that can be attributed to Zeno. Our knowledge of these two paradoxes and the other seven discussed in this article comes to us indirectly through paraphrases of them, and comments on them, primarily by his opponents Aristotle (384-322 B.C.E.), Plato (427-347 B.C.E.), Proclus (410-485 C.E.), and Simplicius (490-560 C.E.). The names of the paradoxes were created by later commentators, not by Zeno. A thousand years after Zeno, one comment by Hesychius suggested that there were perhaps three more books by Zeno than the one mentioned by Plato, but scholars do not generally accept this claim because at least three of the titles mentioned by Hesychius are believed to refer to a single book.
c. His Goals
In the early fifth century B.C.E., Parmenides emphasized the distinction between appearance and reality. Reality, he said, is a seamless unity that is unchanging and cannot be destroyed, so appearances of reality are deceptive. Our ordinary observation reports are false; they do not report what is real. This metaphysical theory is the opposite of Heraclitus’ theory, but evidently it was supported by Zeno. Although we do not know from Zeno himself whether he accepted his own paradoxical arguments or exactly what point he was making with them, or exactly what the relationship was between Parmenides’ views and Zeno’s, the historically most influential position is Plato’s. Plato said the paradoxes were designed to provide detailed, supporting arguments for Parmenides’ beliefs by demonstrating that the Greek common-sense confidence in the reality of motion, change, and ontological plurality (that is, that there exist many things) involves absurdities. Plato’s classical interpretation of Zeno was accepted by Aristotle and by most other commentators throughout the intervening centuries. On Plato’s interpretation, it could reasonably be said that Zeno’s goal was to show that his Dichotomy and Achilles paradoxes demonstrate that any continuous process takes an infinite amount of time, which is paradoxical, while Zeno’s Arrow and Stadium paradoxes demonstrate that the concept of discontinuous change is paradoxical. Because both continuous and discontinuous change are paradoxical, so is any change.
This reading is essentially Gregory Vlastos’ position regarding Zeno’s goals. Eudemus, a student of Aristotle, offered another interpretation of Zeno’s goals. He suggested that Zeno was challenging both pluralism and Parmenides’ monism, which would imply that Zeno was a nihilist. Paul Tannery in 1885 and Wallace Matson in 2001 offer a third interpretation of Zeno’s goals regarding the paradoxes of motion. On their view, Plato and Aristotle understood neither Zeno’s arguments nor his purpose. Zeno was actually challenging the Pythagoreans and their particular brand of pluralism, not Greek common sense. Tannery and Matson suggest Zeno himself did not believe the conclusions of his own paradoxes. The controversial issue of interpreting Zeno’s true goals and purposes is not pursued further in this article. Instead, Plato’s classical interpretation is assumed because it is the one that was so influential throughout history and because the paradoxes as classically interpreted need to be countered even if Matson and Tannery are correct about Zeno’s own position.
Aristotle believed Zeno’s Paradoxes were trivial and easily resolved, but later philosophers have not agreed on the triviality.
d. His Method
Before Zeno, Greek thinkers favored presenting their philosophical views by writing poetry. Zeno began the grand shift away from poetry toward a prose that contained explicit premises and conclusions. And he employed the method of indirect proof in his paradoxes by temporarily assuming some thesis that he opposed and then attempting to deduce an absurd conclusion or a contradiction, thereby undermining the temporary assumption. This method of indirect proof or reductio ad absurdum probably originated with Greek mathematicians, but Zeno used it more systematically and self-consciously.
2. The Standard Solution to the Paradoxes
Any paradox can be treated by abandoning enough of its crucial assumptions. In examining Zeno’s paradoxes, it is very interesting to consider which assumptions to abandon, and why those. A paradox is an argument that reaches a contradiction by apparently legitimate steps from apparently reasonable assumptions, while the experts at the time cannot agree on the way out of the paradox, that is, agree on its resolution. It is this latter point about disagreement among the experts that distinguishes a paradox from a mere puzzle in the ordinary sense of that term. Zeno’s paradoxes are now generally considered to be puzzles because of the wide agreement among today’s experts that there is at least one acceptable resolution of the paradoxes.
This resolution is called the Standard Solution. It points out that, although Zeno was correct in saying that at any point or instant before reaching the goal there is always some as yet uncompleted path to cover, this does not imply that the goal is never reached. More specifically, the Standard Solution says that for the runners in the Achilles Paradox and the Dichotomy Paradox, the runner’s path is a physical continuum that is completed by using a positive, finite speed. The details presuppose differential calculus and classical mechanics. The Standard Solution treats speed as the derivative of distance with respect to time, and it assumes that physical processes are sets of point-events. It implies that durations, distances and line segments are all linear continua composed of indivisible points; it then uses these ideas to challenge various assumptions made, and inference steps taken, by Zeno. To be very brief and anachronistic, Zeno’s mistake (and Aristotle’s mistake) was to fail to use calculus. More specifically, in the case of the paradoxes of motion such as the Achilles and the Dichotomy, Zeno’s mistake was not his assuming there is a completed infinity of places for the runner to go, which was what Aristotle said was Zeno’s mistake. Instead, Zeno’s and Aristotle’s mistake was in assuming that this is too many places (for the runner to go to in a finite time).
A key background assumption of the Standard Solution is that this resolution is not simply employing some concepts that will undermine Zeno’s reasoning—Aristotle’s reasoning does that, too, at least for most of the paradoxes—but that it is employing concepts which have been shown to be appropriate for the development of a coherent and fruitful system of mathematics and physical science. Aristotle’s treatment of the paradoxes does not employ these fruitful concepts of mathematical physics. Aristotle did not believe that the use of mathematics was needed to understand the world. The Standard Solution is much more complicated than Aristotle’s treatment, and no single person can be credited with creating it.
The Standard Solution allows us to speak of one event happening pi seconds after another, and of one event happening the square root of three seconds after another. In ordinary discourse outside of science we would never need this kind of precision, but it is needed in mathematical physics and its calculus. The need for this precision has led to requiring time to be a linear continuum, very much like a segment of the real number line. Here “real number” is a technical term; it does not mean a number that is actual rather than imaginary, but a number expressible as a (possibly infinite) decimal expansion.
Calculus was invented in the late 1600s by Newton and Leibniz. Their calculus is a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. After the acceptance of calculus, almost all mathematicians and physicists believed that continuous motion should be modeled by a function which takes real numbers representing time as its argument and which gives real numbers representing spatial position as its value. This position function should be continuous or gap-free. In addition, the position function should be differentiable in order to make sense of speed, which is treated as the rate of change of position. By the early 20th century most mathematicians had come to believe that, to make rigorous sense of motion, mathematics needs a fully developed set theory that rigorously defines the key concepts of real number, continuity and differentiability. Doing this requires a well-defined concept of the continuum. Unfortunately, Newton and Leibniz did not have a good definition of the continuum, and finding a good one required over two hundred years of work.
The continuum is a very special set; it is the standard model of the real numbers. Intuitively, a continuum is a continuous entity; it is a whole thing that has no gaps. Some examples of a continuum are the path of a runner’s center of mass, the time elapsed during this motion, ocean salinity, and the temperature along a metal rod. Distances and durations are normally considered to be real physical continua whereas treating the ocean salinity and the rod’s temperature as continua is a very useful approximation for many calculations in physics even though we know that at the atomic level the approximation breaks down.
The distinction between “a” continuum and “the” continuum is that “the” continuum is the paradigm of “a” continuum. The continuum is the mathematical line, the line of geometry, which is standardly understood to have the same structure as the real numbers in their natural order. Real numbers and points on the continuum can be put into a one-to-one order-preserving correspondence. There are not enough rational numbers for this correspondence even though the rational numbers are dense, too (in the sense that between any two rational numbers there is another rational number).
For Zeno’s paradoxes, standard analysis assumes that length should be defined in terms of measure, and motion should be defined in terms of the derivative. These definitions are given in terms of the linear continuum. The most important features of any linear continuum are that (a) it is composed of indivisible points, (b) it is an actually infinite set, that is, a transfinite set, and not merely a potentially infinite set that gets bigger over time, (c) it is undivided yet infinitely divisible (that is, it is gap-free), (d) the points are so close together that no point can have a point immediately next to it, (e) between any two points there are other points, (f) the measure (such as length) of a continuum is not a matter of adding up the measures of its points nor adding up the number of its points, (g) any connected part of a continuum is also a continuum, and (h) between any two points there are as many points as in the entire continuum, namely 2^ℵ₀ of them (which is the cardinal number aleph-one if the continuum hypothesis is assumed).
Physical space is not a linear continuum because it is three-dimensional and not linear; but it has one-dimensional subspaces such as paths of runners and orbits of planets; and these are linear continua if we use the path created by only one point on the runner and the orbit created by only one point on the planet. Regarding time, each (point) instant is assigned a real number as its time, and each instant is assigned a duration of zero. The time taken by Achilles to catch the tortoise is a temporal interval, a linear continuum of instants, according to the Standard Solution (but not according to Zeno or Aristotle). The Standard Solution says that the sequence of Achilles’ goals (the goals of reaching the point where the tortoise is) should be abstracted from a pre-existing transfinite set, namely a linear continuum of point places along the tortoise’s path. Aristotle’s treatment does not do this. The next section of this article presents the details of how the concepts of the Standard Solution are used to resolve each of Zeno’s Paradoxes.
Of the ten known paradoxes, The Achilles attracted the most attention over the centuries. Aristotle’s treatment of the paradox involved accusing Zeno of using the concept of an actual or completed infinity instead of the concept of a potential infinity, and accusing Zeno of failing to appreciate that a line cannot be composed of indivisible points. Aristotle’s treatment is described in detail below. It was generally accepted until the 19th century, but slowly lost ground to the Standard Solution. Some historians say Aristotle had no solution but only a verbal quibble. This article takes no side on this dispute and speaks of Aristotle’s “treatment.”
The development of calculus was the most important step in the Standard Solution of Zeno’s paradoxes, so why did it take so long for the Standard Solution to be accepted after Newton and Leibniz developed their calculus? The period lasted about two hundred years. There are four reasons. (1) It took time for calculus and the rest of real analysis to prove its applicability and fruitfulness in physics, especially during the eighteenth century. (2) It took time for the relative shallowness of Aristotle’s treatment of Zeno’s paradoxes to be recognized. (3) It took time for philosophers of science to appreciate that each theoretical concept used in a physical theory need not have its own correlate in our experience. (4) It took time for certain problems in the foundations of mathematics to be resolved, such as finding a better definition of the continuum and avoiding the paradoxes of Cantor’s naive set theory.
Point (3) is about the time it took for philosophers of science to reject the demand, favored by Ernst Mach and most Logical Positivists, that each meaningful term in science must have “empirical meaning.” This was the demand that each physical concept be separately definable with observation terms. It was thought that, because our experience is finite, the term “actual infinite” or “completed infinity” could not have empirical meaning, but “potential infinity” could. Today, most philosophers would not restrict meaning to empirical meaning.
Point (1) is about the time it took for classical mechanics to develop to the point where it was accepted as giving correct solutions to problems involving motion. Point (1) was, and still is, challenged in the metaphysical literature on the grounds that the abstract account of continuity in real analysis does not truly describe either time, space or concrete physical reality. This challenge is discussed in later sections.
Point (4) arises because the standard of rigorous proof and rigorous definition of concepts has increased over the years. As a consequence, the difficulties in the foundations of real analysis, which began with George Berkeley’s criticism of inconsistencies in the use of infinitesimals in the calculus, were not satisfactorily resolved until the early 20th century with the development of Zermelo-Fraenkel set theory. The key idea was to work out the necessary and sufficient conditions for being a continuum. To achieve the goal, the conditions for being a mathematical continuum had to be strictly arithmetical and not dependent on our intuitions about space, time and motion. The idea was to revise or “tweak” the definition until it would not create new paradoxes and would still give useful theorems. When this revision was completed, it could be declared that the set of real numbers is an actual infinity, not a potential infinity, and that not only is any interval of real numbers a linear continuum, but so are the spatial paths, the temporal durations, and the motions that are mentioned in Zeno’s paradoxes. In addition, it was important to clarify how to compute the sum of an infinite series (such as 1/2 + 1/4 + 1/8 + …) without requiring any person to manually add or otherwise perform some action that requires an infinite amount of time. The clarification is to say the infinite series sums to a finite value if the partial sums (of adding the first two terms, then the first three, and so on) get closer and closer to that finite value. Finally, mathematicians needed to define motion in terms of the derivative. This new mathematical system required many new well-defined concepts such as compact set, connected set, continuity, continuous function, convergence-to-a-limit of an infinite sequence, curvature at a point, cut, derivative, dimension, function, integral, limit, measure, reference frame, set, and size of a set. Just as for those new mathematical concepts, rigor was added to the definitions of these physical concepts: place, instant, duration, distance, and instantaneous speed. The relevant revisions were made by Euler in the 18th century and by Bolzano, Cantor, Cauchy, Dedekind, Frege, Hilbert, Lebesgue, Peano, Russell, Weierstrass, and Whitehead, among others, during the 19th and early 20th centuries.
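The partial-sums idea can be stated compactly (a textbook formulation, included here for illustration, using the very series mentioned above):

\[ s_N \;=\; \sum_{n=1}^{N} \frac{1}{2^n} \;=\; 1 - \frac{1}{2^N}, \qquad \sum_{n=1}^{\infty} \frac{1}{2^n} \;=\; \lim_{N \to \infty} s_N \;=\; 1. \]

No one performs infinitely many additions; the infinite sum is defined as the limit of the finite partial sums.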
What happened over these centuries to Leibniz’s infinitesimals and Newton’s fluxions? Let’s stick with infinitesimals, since fluxions have the same problems and same resolution. In 1734, Berkeley had properly criticized the use of infinitesimals as being “ghosts of departed quantities” that are used inconsistently in calculus. Earlier, Newton had defined instantaneous speed as the ratio of an infinitesimally small distance and an infinitesimally small duration, and he and Leibniz produced a system of calculating variable speeds that was very fruitful. But nobody in that century or the next could adequately explain what an infinitesimal was. Newton had called them “evanescent divisible quantities,” whatever that meant. Leibniz called them “vanishingly small,” but that was just as vague.
The practical use of infinitesimals was unsystematic. For example, the infinitesimal dx is treated as being equal to zero when it is declared that x + dx = x, but is treated as not being zero when used in the denominator of the fraction [f(x + dx) – f(x)]/dx which is used in the derivative of the function f. In addition, consider the seemingly obvious Archimedean property of pairs of positive numbers: given any two positive numbers A and B, if you add enough copies of A, then you can produce a sum greater than B. This property fails if A is an infinitesimal. Finally, mathematicians gave up on answering Berkeley’s charges (and thus re-defined what we mean by standard analysis) because, in 1821, Cauchy showed how to achieve the same useful theorems of calculus by using the idea of a limit instead of an infinitesimal. Later in the 19th century, Weierstrass resolved some of the inconsistencies in Cauchy’s account and satisfactorily showed how to define continuity in terms of limits (his epsilon-delta method). As J. O. Wisdom points out (1953, p. 23), “At the same time it became clear that [Leibniz’s and] Newton’s theory, with suitable amendments and additions, could be soundly based” provided Leibniz’s infinitesimals and Newton’s fluxions were removed. In an effort to provide this sound basis according to the latest, heightened standard of what counts as “sound,” Peano, Frege, Hilbert, and Russell attempted to properly axiomatize real analysis. Unfortunately, this led in 1901 to Russell’s paradox and the fruitful controversy about how to provide a foundation to all of mathematics. That controversy still exists, but the majority view is that axiomatic Zermelo-Fraenkel set theory with the axiom of choice blocks all the paradoxes, legitimizes Cantor’s theory of transfinite sets, and provides the proper foundation for real analysis and other areas of mathematics, and indirectly resolves Zeno’s paradoxes. This standard real analysis lacks infinitesimals, thanks to Cauchy and Weierstrass. Standard real analysis is the mathematics that the Standard Solution applies to Zeno’s Paradoxes.
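The inconsistency Berkeley complained of can be made vivid with a standard example (the function f(x) = x² is chosen here merely for illustration). The old infinitesimal method computes

\[ \frac{f(x+dx) - f(x)}{dx} \;=\; \frac{x^2 + 2x\,dx + (dx)^2 - x^2}{dx} \;=\; 2x + dx, \]

dividing by dx on the assumption that dx ≠ 0, and then sets dx = 0 to report the answer 2x. Cauchy’s limit concept removes the double-dealing: the derivative is the limit of the difference quotient as dx tends to zero, so dx is never treated as both zero and non-zero.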
In standard real analysis, the rational numbers are not continuous although they are infinitely numerous and infinitely dense. To come up with a foundation for calculus there had to be a good definition of the continuity of the real numbers. But this required having a good definition of irrational numbers. There wasn’t one before 1872. Dedekind’s definition in 1872 defines the mysterious irrationals in terms of the familiar rationals. The result is a clear and useful definition of real numbers. The usefulness of Dedekind’s definition of real numbers, and the lack of any better definition, convinced many mathematicians to be more open to accepting the real numbers and actually-infinite sets.
Let’s take a short interlude and introduce Dedekind’s key, new idea that he discovered in the 1870s about the reals and their relationship to the rationals. He envisioned how to define a real number to be a cut of the rational numbers, where a cut is a certain ordered pair of actually-infinite sets of rational numbers.
A Dedekind cut (A,B) is defined to be a partition or cutting of the standardly-ordered set of all the rational numbers into a left part A and a right part B. A and B are non-empty, and they partition all the rationals so that the numbers in A are less than all those in B, and also A contains no greatest number. Every real number is a unique Dedekind cut. The cut can be made at a rational number or at an irrational number. Here are examples of each:
Dedekind’s real number 1/2 is ({x : x < 1/2}, {x : x ≥ 1/2}).
Dedekind’s positive real number √2 is ({x : x < 0 or x² < 2}, {x : x² ≥ 2}).
The variable x ranges over the rational numbers only. For any cut (A,B), if B has a smallest number, then the real number for that cut corresponds to this smallest number, as in the definition of 1/2 above. Otherwise, the cut defines an irrational number which, loosely speaking, fills the gap between A and B, as in the definition of the square root of 2 above. By defining reals in terms of rationals this way, Dedekind gave a foundation to the reals, and legitimized them by showing they are as acceptable as actually-infinite sets of rationals.
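One can verify that the left part A of the √2 cut has no greatest member (a standard verification; the particular witness below is one of several that work). If a ∈ A and a ≤ 0, then 1 is a larger member of A. If a > 0 with a² < 2, set

\[ a' = \frac{2a+2}{a+2}; \qquad \text{then} \qquad a' - a = \frac{2-a^2}{a+2} > 0 \quad \text{and} \quad a'^2 - 2 = \frac{2(a^2-2)}{(a+2)^2} < 0, \]

so a′ is a larger rational still in A. Since no rational number squares to exactly 2, B has no smallest member either, and the cut determines an irrational number.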
But what exactly is an actually-infinite (or transfinite) set, and does this idea lead to contradictions? This question needs an answer if there is to be a good theory of continuity and of real numbers. In the 1870s, Cantor clarified what an actually-infinite set is and made a convincing case that the concept does not lead to inconsistencies. These accomplishments by Cantor are why he (along with Dedekind and Weierstrass) is said by Russell to have “solved Zeno’s Paradoxes.”
That solution recommends using very different concepts and theories than those used by Zeno. The argument that this is the correct solution was presented by many people, but it was especially influenced by the work of Bertrand Russell (1914, lecture 6) and the more detailed work of Adolf Grünbaum (1967). In brief, the argument for the Standard Solution is that we have solid grounds for believing our best scientific theories, but the theories of mathematics such as calculus and Zermelo-Fraenkel set theory are indispensable to these theories, so we have solid grounds for believing in them, too. The scientific theories require a resolution of Zeno’s paradoxes and the other paradoxes; and the Standard Solution to Zeno’s Paradoxes that uses standard calculus and Zermelo-Fraenkel set theory is indispensable to this resolution or at least is the best resolution, or, if not, then we can be fairly sure there is no better solution, or, if not that either, then we can be confident that the solution is good enough (for our purposes). Aristotle’s treatment, on the other hand, uses concepts that hamper the growth of mathematics and science. Therefore, we should accept the Standard Solution.
In the next section, this solution will be applied to each of Zeno’s ten paradoxes.
To be optimistic, the Standard Solution represents a counterexample to the claim that philosophical problems never get solved. To be less optimistic, the Standard Solution has its drawbacks and its alternatives, and these have generated new and interesting philosophical controversies beginning in the last half of the 20th century, as will be seen in later sections. The primary alternatives contain different treatments of calculus from that developed at the end of the 19th century. Whether this implies that Zeno’s paradoxes have multiple solutions or only one is still an open question.
Did Zeno make mistakes? And was he superficial or profound? These questions are a matter of dispute in the philosophical literature. The majority position is as follows. If we give his paradoxes a sympathetic reconstruction, he correctly demonstrated that some important, classical Greek concepts are logically inconsistent, and he did not make a mistake in doing this, except in the Moving Rows Paradox, the Paradox of Alike and Unlike and the Grain of Millet Paradox, his weakest paradoxes. Zeno did assume that the classical Greek concepts were the correct concepts to use in reasoning about his paradoxes, and now we prefer revised concepts, though it would be unfair to say he blundered for not foreseeing later developments in mathematics and physics.
3. The Ten Paradoxes
Zeno probably created forty paradoxes, of which only the following ten are known. Only the first four have standard names, and the first two have received the most attention. The ten are of uneven quality. Zeno and his ancient interpreters usually stated his paradoxes badly, so it has taken some clever reconstruction over the years to reveal their full force. Below, the paradoxes are reconstructed sympathetically, and then the Standard Solution is applied to them. These reconstructions use just one of several reasonable schemes for presenting the paradoxes, but the present article does not explore the historical research about the variety of interpretive schemes and their relative plausibility.
a. Paradoxes of Motion
i. The Achilles
Achilles, whom we can assume is the fastest runner of antiquity, is racing to catch the tortoise that is slowly crawling away from him. Both are moving along a linear path at constant speeds. In order to catch the tortoise, Achilles will have to reach the place where the tortoise presently is. However, by the time Achilles gets there, the tortoise will have crawled to a new location. Achilles will then have to reach this new location. By the time Achilles reaches that location, the tortoise will have moved on to yet another location, and so on forever. Zeno claims Achilles will never catch the tortoise. This argument shows, he believes, that anyone who believes Achilles will succeed in catching the tortoise and who believes more generally that motion is physically possible is the victim of illusion. The claim that motion is an illusion was advanced by Zeno’s mentor Parmenides.
The source for all of Zeno’s arguments is the writings of his opponents. The Achilles Paradox is reconstructed from Aristotle (Physics Book VI, Chapter 8, 239b14-16) and some passages from Simplicius in the sixth century C.E. There is no evidence that Zeno used a tortoise rather than a slow human. The tortoise is a later commentator’s addition. Aristotle spoke simply of “the runner” who competes with Achilles.
It won’t do to react and say the solution to the paradox is that there are biological limitations on how small a step Achilles can take. Achilles’ feet are not obligated to stop and start again at each of the locations described above, so there is no limit to how close one of those locations can be to another. A stronger version of his paradox would ask us to consider the movement of Achilles’ center of mass. It is best to think of Achilles’ change from one location to another as a continuous movement rather than as incremental steps requiring halting and starting again. Zeno is assuming that space and time are infinitely divisible; they are not discrete or atomistic. If they were, this Paradox’s argument would not work.
One common complaint with Zeno’s reasoning is that he is setting up a straw man because it is obvious that Achilles cannot catch the tortoise if he continually takes a bad aim toward the place where the tortoise is; he should aim farther ahead. The mistake in this complaint is that even if Achilles took some sort of better aim, it is still true that he is required to go to every one of those locations that are the goals of the so-called “bad aims,” so remarking about a bad aim is not a way to successfully treat Zeno’s argument.
The treatment called the “Standard Solution” to the Achilles Paradox uses calculus and other parts of real analysis to describe the situation. It implies that Zeno is assuming Achilles cannot achieve his goal because
(1) there is too far to run, or
(2) there is not enough time, or
(3) there are too many places to go, or
(4) there is no final step, or
(5) there are too many tasks.
The historical record does not tell us which of these was Zeno’s real assumption, but they are all false assumptions, according to the Standard Solution.
Let’s consider assumption (1). Presumably Zeno would defend that assumption by remarking that there are an infinity of sub-distances involved in Achilles’ run, and the sum of the sub-distances is an actual infinity, which is too much distance even for Achilles. However, the advocate of the Standard Solution will remark, “How does Zeno know what the sum of this infinite series is, since in Zeno’s day the mathematicians could make sense of a sum of a series of terms only if there were a finite number of terms in the series? Maybe he is just guessing that the sum of an infinite number of terms could somehow be well-defined and be infinite.” According to the Standard Solution the sum is finite, as the following description of Achilles’ chase shows.
For ease of understanding, Achilles and the tortoise are assumed to be point masses or infinitesimal particles, each moving at a constant velocity (that is, a constant speed in one direction). Achilles’ path is a linear continuum and so is composed of an actual infinity of points. (An actual infinity is also called a “completed infinity” or “transfinite infinity.” The word “actual” does not mean “real” as opposed to “imaginary.”) Zeno’s failure to treat Achilles’ path as a linear continuum is a fatal step in his argument, according to the Standard Solution, which requires that the reasoner use the concepts of contemporary mathematical physics.
Achilles travels a distance d1 in reaching the point x1 where the tortoise starts, but by the time Achilles reaches x1, the tortoise has moved on to a new point x2. When Achilles reaches x2, having gone an additional distance d2, the tortoise has moved on to point x3, requiring Achilles to cover an additional distance d3, and so forth. This sequence of non-overlapping distances (or intervals or sub-paths) is an actual infinity, but happily the geometric series converges. The sum of its terms d1 + d2 + d3 +… is a finite distance that Achilles can readily complete while moving at a constant speed.
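A worked instance may help (the speeds and head start are illustrative assumptions, not Zeno’s). Suppose Achilles runs at constant speed v_A, the tortoise crawls at constant speed v_T < v_A, and the head start is h. Then d₁ = h, and each gap is the previous gap scaled by r = v_T/v_A, so

\[ d_{n+1} = r\,d_n, \qquad \sum_{n=1}^{\infty} d_n \;=\; h\,(1 + r + r^2 + \cdots) \;=\; \frac{h}{1-r} \;=\; \frac{h\,v_A}{v_A - v_T}. \]

With h = 100 m, v_A = 10 m/s and v_T = 1 m/s, Achilles overtakes the tortoise after 1000/9 ≈ 111.1 m and 100/9 ≈ 11.1 s, exactly the meeting point that elementary kinematics gives.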
Similar reasoning would apply if Zeno were to have made assumptions (2) or (3) above about there not being enough time for Achilles or there being too many places for him to run. Regarding assumption (4), Zeno’s requirement that there be a final step or final sub-path in Achilles’ run is simply mistaken—according to the Standard Solution. (More will be said about assumption (5) in Section 5c when we discuss supertasks.)
In Zeno’s day, since the mathematicians could make sense only of the sum of a finite number of distances, it was Aristotle’s genius to claim that Achilles covered only a potential infinity of distances, not an actual infinity, since the sum of a potential infinity is a finite number at any time; thus Achilles can in that sense achieve an infinity of tasks while covering a finite distance in a finite duration. When Aristotle made this claim and used it to treat Zeno’s paradoxes, there was no better solution to the Achilles Paradox, and a better solution would not be discovered for many more centuries. In Zeno’s day, no person had a clear notion of continuous space, nor of the limit of an actually infinite series, nor even of zero.
The Achilles Argument, if strengthened and not left as vague as it was in Zeno’s day, presumes that space and time are continuous or infinitely divisible. So, Zeno’s conclusion might have more cautiously asserted that Achilles cannot catch the tortoise if space and time are infinitely divisible in the sense of actual infinity. Perhaps, as some commentators have speculated, Zeno used or should have used the Achilles Paradox only to attack continuous space, and he used or should have used his other paradoxes such as the “Arrow” and the “The Moving Rows” to attack discrete space.
ii. The Dichotomy (The Racetrack)
As Aristotle realized, the Dichotomy Paradox is just the Achilles Paradox in which Achilles stands still ahead of the tortoise. In his Progressive Dichotomy Paradox, Zeno argued that a runner will never reach the stationary goal line on a straight racetrack. The reason is that the runner must first reach half the distance to the goal, but when there he must still cross half the remaining distance to the goal, but having done that the runner must cover half of the new remainder, and so on. If the goal is one meter away, the runner must cover a distance of 1/2 meter, then 1/4 meter, then 1/8 meter, and so on ad infinitum. The runner cannot reach the final goal, says Zeno. Why not? There are few traces of Zeno’s reasoning here, but for reconstructions that give the strongest reasoning, we may say that the runner will not reach the final goal because there is too far to run, the sum is actually infinite. The Standard Solution argues instead that the sum of this infinite geometric series is one, not infinity.
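The same computation can be run on durations rather than distances (the track length and speed are illustrative assumptions). If the runner covers a one-meter track at constant speed v, the nth sub-distance 1/2ⁿ takes time 1/(v·2ⁿ), and

\[ \sum_{n=1}^{\infty} \frac{1}{v\,2^n} \;=\; \frac{1}{v} \sum_{n=1}^{\infty} \frac{1}{2^n} \;=\; \frac{1}{v}, \]

which is just the finite time that uniform motion across one meter requires. The infinitely many sub-tasks fit into a finite duration because the durations shrink geometrically.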
The problem of the runner reaching the goal can be viewed from a different perspective. According to the Regressive version of the Dichotomy Paradox, the runner cannot even take a first step. Here is why. Any step may be divided conceptually into a first half and a second half. Before taking a full step, the runner must take a 1/2 step, but before that he must take a 1/4 step, but before that a 1/8 step, and so forth ad infinitum, so the runner will never get going. Like the Achilles Paradox, this paradox also concludes that any motion is impossible.
The Dichotomy paradox, in either its Progressive version or its Regressive version, assumes here for the sake of simplicity and strength of argumentation that the runner’s positions are point places. Actual runners take up some larger volume, but the assumption of point places is not a controversial assumption because Zeno could have reconstructed his paradox by speaking of the point places occupied by, say, the tip of the runner’s nose or the center of his mass, and this assumption makes for a clearer and stronger paradox.
In the Dichotomy Paradox, the runner reaches the points 1/2 and 3/4 and 7/8 and so forth on the way to his goal, but under the influence of Bolzano and Dedekind and Cantor, who developed the first theory of sets, the set of those points is no longer considered to be potentially infinite. It is an actually infinite set of points abstracted from a continuum of points, in which the word “continuum” is used in the late 19th century sense that is at the heart of calculus. And any ancient idea that the sum of the actually infinite series of path lengths or segments 1/2 + 1/4 + 1/8 + … is infinite now has to be rejected in favor of the theory that the sum converges to 1. This is key to solving the Dichotomy Paradox according to the Standard Solution. It is basically the same treatment as that given to the Achilles. The Dichotomy Paradox has been called “The Stadium” by some commentators, but that name is also commonly used for the Paradox of the Moving Rows, so readers need to be on the alert for ambiguity in the literature.
Aristotle, in Physics Z9, said of the Dichotomy that it is possible for a runner to come in contact with a potentially infinite number of things in a finite time provided the time intervals become shorter and shorter. Aristotle said Zeno assumed this is impossible, and that is one of his errors in the Dichotomy. However, Aristotle merely asserted this and could give no detailed theory that enables the computation of the finite amount of time. So, Aristotle could not really defend his diagnosis of Zeno’s error. Today the calculus is used to provide the Standard Solution with that detailed theory.
There is another detail of the Dichotomy that needs resolution. How does Zeno’s runner complete the trip if there is no final step or last member of the infinite sequence of steps (intervals and goals)? Don’t trips need last steps? The Standard Solution answers “no” and says the intuitive answer “yes” is one of many intuitions held by Zeno and Aristotle and the average person today that must be rejected when embracing the Standard Solution.
iii. The Arrow
Zeno’s Arrow Paradox takes a different approach to challenging the coherence of our common sense concepts of time and motion. Think of how you would distinguish an arrow that is stationary in space from one that is flying through space, given that you look only at a snapshot (an instantaneous photo) of them. Would there be any difference? No, and since at any instant of any time period the arrow makes no progress, it never makes progress.
As Aristotle explains, from Zeno’s “assumption that time is composed of moments,” a moving arrow must occupy a space equal to itself during any moment. That is, during any indivisible moment or instant it is at the place where it is. But places do not move. So, if in each moment, the arrow is occupying a space equal to itself, then the arrow is not moving in that moment. The reason it is not moving is that it has no time in which to move; it is simply there at the place. It cannot move during the moment because there is not enough time for any motion, and the moment is indivisible. The same reasoning holds for any other moment during the so-called “flight” of the arrow. So, the arrow is never moving. By a similar argument, Zeno can establish that nothing else moves. The source for Zeno’s argument is Aristotle (Physics, Book VI, chapter 5, 239b5-32).
The Standard Solution to the Arrow Paradox requires appeal to our contemporary theory of speed from calculus. This theory defines instantaneous motion, that is, motion at an instant, without defining motion during an instant. This new treatment of motion originated with Newton and Leibniz in the seventeenth century, and it employs what is called the “at-at” theory of motion, which says motion is being at different places at different times. Motion is not some feature that reveals itself only within a moment. The modern difference between rest and motion, as opposed to the difference in antiquity, has to do with what is happening at nearby moments and—contra Zeno—has nothing to do with what is happening during a moment.
Some researchers have speculated that the Arrow Paradox was designed by Zeno to attack discrete time and space rather than continuous time and space. This is not clear, and the Standard Solution works for both. That is, regardless of whether time is continuous and Zeno’s instant has no finite duration, or time is discrete and Zeno’s instant lasts for, say, 10⁻⁴⁴ seconds, there is insufficient time for the arrow to move during the instant. Yet regardless of how long the instant lasts, there still can be instantaneous motion, namely motion at that instant provided the object is in a different place at some other instant.
To re-emphasize this crucial point, note that both Zeno and 21st century mathematical physicists agree that the arrow cannot be in motion within or during an instant (an instantaneous time), but the physicists will point out that the arrow can be in motion at an instant in the sense of having a positive speed at that instant (its so-called instantaneous speed), provided the arrow occupies different positions at times before or after that instant so that the instant is part of a period in which the arrow is continuously in motion. If we do not pay attention to what happens at nearby instants, it is impossible to distinguish instantaneous motion from instantaneous rest, but distinguishing the two is the way out of the Arrow Paradox. Zeno would have balked at the idea of motion at an instant, and Aristotle explicitly denied it.
The Arrow Paradox is refuted by the Standard Solution with its new at-at theory of motion, but the paradox seems especially strong to someone who would prefer instead to say that motion is an intrinsic property of an instant, being some propensity or disposition to be elsewhere.
Let’s reconsider the details of the Standard Solution assuming continuous motion rather than discrete motion. In calculus, the speed of an object at an instant (its instantaneous speed) is the time derivative of the object’s position; this means the object’s speed is the limit of its series of average speeds during smaller and smaller intervals of time containing the instant. We make essentially the same point when we say the object’s speed is the limit of its average speed over an interval as the length of the interval tends to zero. The derivative of the arrow’s position x with respect to time t, namely dx/dt, is the arrow’s instantaneous speed, and it has non-zero values at specific places at specific instants during the arrow’s flight, contra Zeno and Aristotle. The speed during an instant or in an instant, which is what Zeno is calling for, would be 0/0 and is undefined. But the speed at an instant is well defined. If we require the use of these modern concepts, then Zeno cannot successfully produce a contradiction as he tries to do by assuming that in each moment the speed of the arrow is zero—because it is not zero. Therefore, advocates of the Standard Solution conclude that Zeno’s Arrow Paradox has a false, but crucial, assumption and so is unsound.
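To make the limit idea concrete, here is a minimal numerical sketch; the position function x(t) = 5t and the instant t0 = 2 are illustrative assumptions, not part of Zeno’s or Aristotle’s presentation.

```python
# A minimal sketch: instantaneous speed as the limit of average speeds
# over ever smaller intervals containing the instant t0.
# The position function x(t) = 5t (a uniform 5 m/s flight) is an assumption.

def x(t):
    return 5.0 * t

t0 = 2.0  # the instant at which we ask for the arrow's speed
for dt in [1.0, 0.1, 0.001, 1e-6]:
    avg_speed = (x(t0 + dt) - x(t0 - dt)) / (2 * dt)  # average over [t0-dt, t0+dt]
    print(dt, avg_speed)  # every average is 5.0; the limit, dx/dt at t0, is 5.0
```

At no step does the computation divide zero by zero; that is the sense in which speed at an instant is well defined even though speed during an instant is not.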
Independently of Zeno, the Arrow Paradox was discovered by the Chinese dialectician Kung-sun Lung (Gongsun Long, ca. 325–250 B.C.E.). A lingering philosophical question about the arrow paradox is whether there is a way to properly refute Zeno’s argument that motion is impossible without using the apparatus of calculus.
iv. The Moving Rows (The Stadium)
According to Aristotle (Physics, Book VI, chapter 9, 239b33-240a18), Zeno tried to create a paradox by considering bodies (that is, physical objects) of equal length aligned along three parallel rows within a stadium. One track contains A bodies (say, three of them); another contains B bodies; and a third contains C bodies. Each body is the same distance from its neighbors along its track. The A bodies are stationary. The Bs are moving to the right, and the Cs are moving with the same speed to the left. Imagine two snapshots of the situation, one before and one after, taken one instant apart.
Zeno points out that, in the time between the before-snapshot and the after-snapshot, the leftmost C passes two Bs but only one A, contradicting his (very controversial) assumption that the C should take longer to pass two Bs than one A. The usual way out of this paradox is to reject that controversial assumption.
Aristotle argues that how long it takes to pass a body depends on the speed of the body; for example, if the body is coming towards you, then you can pass it in less time than if it is stationary. Today’s analysts agree with Aristotle’s diagnosis, and historically this paradox of motion has seemed weaker than the previous three. This paradox has been called “The Stadium,” but occasionally so has the Dichotomy Paradox.
Some analysts, for example Tannery (1887), believe Zeno may have intended the paradox to assume that both space and time are discrete (quantized, atomized) as opposed to continuous, so that his argument would challenge the coherence of the idea of discrete space and time.
Well, the paradox could be interpreted this way. If so, assume the three objects A, B, and C are adjacent to each other in their tracks, and each A, B, and C body occupies a space that is one atom long. Then, if all motion is occurring at the rate of one atom of space in one atom of time, the leftmost C would pass two atoms of B-space in the time it passed one atom of A-space, which is a contradiction to our assumption about rates. There is another paradoxical consequence. Look at the space occupied by the leftmost C object. During the instant of movement, it passes the middle B object, yet there is no time at which they are adjacent, which is odd.
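The discrete reading can be made vivid with a small sketch; the coordinates and unit choices below are illustrative assumptions.

```python
# A minimal sketch of the discrete reading: positions in atoms of space,
# one tick = one atom of time. A is stationary; B moves right and C moves
# left, each at one atom of space per atom of time.

a, b, c = 0, 0, 0   # one body from each track, aligned at the start
b += 1              # after one tick, B is one atom to the right
c -= 1              # and C is one atom to the left

print(abs(c - a))   # 1: C passed one atom of A-space in one atom of time
print(abs(c - b))   # 2: yet C passed two atoms of B-space in that same tick,
                    #    so no tick exists at which C was alongside the
                    #    intervening atom of B-space
```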
So, Zeno’s argument can be interpreted as producing a challenge to the idea that space and time are discrete. However, most commentators suspect Zeno himself did not interpret his paradox this way.
b. Paradoxes of Plurality
Zeno’s paradoxes of motion are attacks on the commonly held belief that motion is real, but because motion is a kind of plurality, namely a process along a plurality of places in a plurality of times, they are also attacks on this kind of plurality. Zeno offered more direct attacks on all kinds of plurality. The first is his Paradox of Alike and Unlike.
i. Alike and Unlike
According to Plato in Parmenides 127-9, Zeno argued that the assumption of plurality–the assumption that there are many things–leads to a contradiction. He quotes Zeno as saying: “If things are many, . . . they must be both like and unlike. But that is impossible; unlike things cannot be like, nor like things unlike” (Hamilton and Cairns (1961), 922).
Zeno’s point is this. Consider a plurality of things, such as some people and some mountains. These things have in common the property of being heavy. But if they all have this property in common, then they really are all the same kind of thing, and so are not a plurality. They are a one. By this reasoning, Zeno believes it has been shown that the plurality is one (or the many is not many), which is a contradiction. Therefore, by reductio ad absurdum, there is no plurality, as Parmenides has always claimed.
Plato immediately accuses Zeno of equivocating. A thing can be like some other thing in one respect while not being like it in a different respect. Your having a property in common with some other thing does not make you identical with that other thing. Consider again our plurality of people and mountains. People and mountains are all alike in being heavy, but are unlike in intelligence. And they are unlike in being mountains; the mountains are mountains, but the people are not. As Plato says, when Zeno tries to conclude “that the same thing is many and one, we shall [instead] say that what he is proving is that something is many and one [in different respects], not that unity is many or that plurality is one….” [129d] So, there is no contradiction, and the paradox is solved by Plato. This paradox is generally considered to be one of Zeno’s weakest paradoxes, and it is now rarely discussed. [See Rescher (2001), pp. 94-6 for some discussion.]
ii. Limited and Unlimited
This paradox is also called the Paradox of Denseness. Suppose there exist many things rather than, as Parmenides would say, just one thing. Then there will be a definite or fixed number of those many things, and so they will be “limited.” But if there are many things, say two things, then they must be distinct, and to keep them distinct there must be a third thing separating them. So, there are three things. But between these, …. In other words, things are dense and there is no definite or fixed number of them, so they will be “unlimited.” This is a contradiction, because the plurality would be both limited and unlimited. Therefore, there are no pluralities; there exists only one thing, not many things. This argument is reconstructed from Zeno’s own words, as quoted by Simplicius in his commentary on Book 1 of Aristotle’s Physics.
According to the Standard Solution to this paradox, the weakness of Zeno’s argument can be said to lie in the assumption that “to keep them distinct, there must be a third thing separating them.” Zeno would have been correct to say that between any two physical objects that are separated in space, there is a place between them, because space is dense, but he is mistaken to claim that there must be a third physical object there between them. Two objects can be distinct at a time simply by one having a property the other does not have.
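The density invoked here can be stated in one line of modern notation: for any two distinct coordinates along a line,

$$a < b \;\Rightarrow\; a < \frac{a+b}{2} < b,$$

so between any two places there is always a third place, though, contra Zeno, not necessarily a third physical object.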
iii. Large and Small
Suppose there exist many things rather than, as Parmenides says, just one thing. Then every part of any plurality is both so small as to have no size but also so large as to be infinite, says Zeno. His reasoning for why they have no size has been lost, but many commentators suggest that he would reason as follows. If there is a plurality, then it must be composed of parts which are not themselves pluralities. Yet things that are not pluralities cannot have a size, or else they would be divisible into parts and thus be pluralities themselves.
Now, why are the parts of pluralities so large as to be infinite? Well, the parts cannot be so small as to have no size since adding such things together would never contribute anything to the whole so far as size is concerned. So, the parts have some non-zero size. If so, then each of these parts will have two spatially distinct sub-parts, one in front of the other. Each of these sub-parts also will have a size. The front part, being a thing, will have its own two spatially distinct sub-parts, one in front of the other; and these two sub-parts will have sizes. Ditto for the back part. And so on without end. A sum of all these sub-parts would be infinite. Therefore, each part of a plurality will be so large as to be infinite.
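Modern analysis blocks the final step of this reasoning: infinitely many parts of non-zero size need not sum to an infinite total, because the sizes can shrink fast enough for the series to converge. As a minimal illustration (the choice of a unit length is an assumption), the non-overlapping parts produced by repeatedly halving what remains of a unit segment have sizes summing to exactly one:

$$\sum_{n=1}^{\infty} \frac{1}{2^n} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 1.$$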
This sympathetic reconstruction of the argument is based on Simplicius’ On Aristotle’s Physics, where Simplicius quotes Zeno’s own words for part of the paradox, although he does not say what he is quoting from.
There are many errors here in Zeno’s reasoning, according to the Standard Solution. He is mistaken at the beginning when he says, “If there is a plurality, then it must be composed of parts which are not themselves pluralities.” A university is an illustrative counterexample. A university is a plurality of students, but we need not rule out the possibility that a student is a plurality. What’s a whole and what’s a plurality depends on our purposes. When we consider a university to be a plurality of students, we consider the students to be wholes without parts. But for another purpose we might want to say that a student is a plurality of biological cells. Zeno is confused about this notion of relativity, and about part-whole reasoning; and as commentators began to appreciate this they lost interest in Zeno as a player in the great metaphysical debate between pluralism and monism.
A second error occurs in arguing that each part of a plurality must have a non-zero size. The contemporary notion of measure (developed in the 20th century by Borel, Lebesgue, and others) showed how to properly define the measure function so that a line segment has nonzero measure even though (the singleton set of) any point has a zero measure. The measure of the line segment [a, b] is b – a; the measure of a cube with side a is a³. This theory of measure is now properly used by our civilization for length, volume, duration, mass, voltage, brightness, and other continuous magnitudes.
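Stated minimally, with μ denoting this measure:

$$\mu([a,b]) = b - a, \qquad \mu(\{x\}) = 0,$$

and additivity, $\mu\left(\bigcup_n E_n\right) = \sum_n \mu(E_n)$ for pairwise disjoint sets $E_n$, is required to hold only for countably many pieces. A segment of positive length composed of uncountably many measure-zero points therefore yields no contradiction.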
Thanks to Aristotle’s support, Zeno’s Paradoxes of Large and Small and of Infinite Divisibility (to be discussed below) were generally considered to have shown that a continuous magnitude cannot be composed of points. Interest was rekindled in this topic in the 18th century. The physical objects in Newton’s classical mechanics of 1726 were interpreted by R. J. Boscovich in 1763 as being collections of point masses. Each point mass is a movable point carrying a fixed mass. This idealization of continuous bodies as if they were compositions of point particles was very fruitful; it could be used to easily solve otherwise very difficult problems in physics. This success led scientists, mathematicians, and philosophers to recognize that the strength of Zeno’s Paradoxes of Large and Small and of Infinite Divisibility had been overestimated; they did not prevent a continuous magnitude from being composed of points.
iv. Infinite Divisibility
This is the most challenging of all the paradoxes of plurality. Consider the difficulties that arise if we assume that an object theoretically can be divided into a plurality of parts. According to Zeno, there is a reassembly problem. Imagine cutting the object into two non-overlapping parts, then similarly cutting these parts into parts, and so on until the process of repeated division is complete. Assuming the hypothetical division is “exhaustive” or does come to an end, then at the end we reach what Zeno calls “the elements.” Here there is a problem about reassembly. There are three possibilities. (1) The elements are nothing. In that case the original object will be a composite of nothing, and so the whole object will be a mere appearance, which is absurd. (2) The elements are something, but they have zero size. So, the original object is composed of elements of zero size. Adding an infinity of zeros yields a zero sum, so the original object had no size, which is absurd. (3) The elements are something, but they do not have zero size. If so, these can be further divided, and the process of division was not complete after all, which contradicts our assumption that the process was already complete. In summary, there were three possibilities, but all three possibilities lead to absurdity. So, objects are not divisible into a plurality of parts.
Simplicius says this argument is due to Zeno even though it is in Aristotle (On Generation and Corruption, 316a15-34, 316b34 and 325a8-12) and is not attributed there to Zeno, which is odd. Aristotle says the argument convinced the atomists to reject infinite divisibility. The argument has been called the Paradox of Parts and Wholes, but it has no traditional name.
The Standard Solution says we first should ask Zeno to be clearer about what he is dividing. Is it concrete or abstract? When dividing a concrete, material stick into its components, we reach ultimate constituents of matter such as quarks and electrons that cannot be further divided. These have a size, a zero size (according to quantum electrodynamics), but it is incorrect to conclude that the whole stick has no size if its constituents have zero size. [Due to the forces involved, point particles have finite “cross sections,” and configurations of those particles, such as atoms, do have finite size.] So, Zeno is wrong here. On the other hand, is Zeno dividing an abstract path or trajectory? Let’s assume he is, since this produces a more challenging paradox. If so, then choice (2) above is the one to think about. It’s the one that talks about addition of zeroes. Let’s assume the object is one-dimensional, like a path. According to the Standard Solution, this “object” that gets divided should be considered to be a continuum with its elements arranged into the order type of the linear continuum, and we should use the contemporary notion of measure to find the size of the object. The size (length, measure) of a point-element is zero, but Zeno is mistaken in saying the total size (length, measure) of all the zero-size elements is zero. The size of the object is determined instead by the difference in coordinate numbers assigned to the end points of the object. An object extending along a straight line that has one of its end points at one meter from the origin and the other end point at three meters from the origin has a size of two meters and not zero meters. So, there is no reassembly problem, and a crucial step in Zeno’s argument breaks down.
c. Other Paradoxes
i. The Grain of Millet
There are two common interpretations of this paradox. According to the first, which is the standard interpretation, when a bushel of millet (or wheat) grains falls out of its container and crashes to the floor, it makes a sound. Since the bushel is composed of individual grains, each individual grain also makes a sound, as should each thousandth part of the grain, and so on to its ultimate parts. But this result contradicts the fact that we actually hear no sound for portions like a thousandth part of a grain, and so we surely would hear no sound for an ultimate part of a grain. Yet, how can the bushel make a sound if none of its ultimate parts make a sound? The original source of this argument is Aristotle (Physics, Book VII, chapter 4, 250a19-21). There seems to be an appeal to the iterative rule that if a millet or millet part makes a sound, then so should a next smaller part.
We do not have Zeno’s words on what conclusion we are supposed to draw from this. Perhaps he would conclude it is a mistake to suppose that whole bushels of millet have millet parts. This is an attack on plurality.
The Standard Solution to this interpretation of the paradox accuses Zeno of mistakenly assuming that there is no lower bound on the size of something that can make a sound. There is no problem, we now say, with parts having very different properties from the wholes that they constitute. The iterative rule is initially plausible but ultimately not trustworthy, and Zeno is committing both the fallacy of division and the fallacy of composition.
Some analysts interpret Zeno’s paradox a second way, as challenging our trust in our sense of hearing, as follows. When a bushel of millet grains crashes to the floor, it makes a sound. The bushel is composed of individual grains, so they, too, make an audible sound. But if you drop an individual millet grain or a small part of one or an even smaller part, then eventually your hearing detects no sound, even though there is one. Therefore, you cannot trust your sense of hearing.
This reasoning about our not detecting low amplitude sounds is similar to making the mistake of arguing that you cannot trust your thermometer because there are some ranges of temperature that it is not sensitive to. So, on this second interpretation, the paradox is also easy to solve. One reason given in the literature for believing that this second interpretation is not the one that Zeno had in mind is that Aristotle’s criticism given below applies to the first interpretation and not the second, and it is unlikely that Aristotle would have misinterpreted the paradox.
ii. Against Place
Given an object, we may assume that there is a single, correct answer to the question, “What is its place?” Everything that exists has a place; but place itself exists, so it also must have a place, and that place must have a place, and so on forever. That’s too many places, so there is a contradiction. The original source is Aristotle’s Physics (209a23-25 and 210b22-24).
The standard response to Zeno’s Paradox Against Place is to deny that places have places, and to point out that the notion of place should be relative to reference frame. But Zeno’s assumption that places have places was common in ancient Greece at the time, and Zeno is to be praised for showing that it is a faulty assumption.
4. Aristotle’s Treatment of the Paradoxes
Aristotle’s views about Zeno’s paradoxes can be found in his Physics, book 4, chapter 2, and book 6, chapters 2 and 9. Regarding the Dichotomy Paradox, Aristotle is to be applauded for his insight that Achilles has time to reach his goal because during the run ever shorter paths take correspondingly ever shorter times.
Aristotle had several criticisms of Zeno. Regarding the paradoxes of motion, he complained that Zeno should not suppose the runner’s path is dependent on its parts; instead, the path is there first, and the parts are constructed by the analyst. His second complaint was that Zeno should not suppose that lines contain indivisible points. Aristotle’s third and most influential, critical idea involves a complaint about potential infinity. On this point, in remarking about the Achilles Paradox, Aristotle said, “Zeno’s argument makes a false assumption in asserting that it is impossible for a thing to pass over…infinite things in a finite time.” Aristotle believed it is impossible for a thing to pass over an actually infinite number of things in a finite time, but he believed that it is possible for a thing to pass over a potentially infinite number of things in a finite time. Here is how Aristotle expressed the point:
For motion…, although what is continuous contains an infinite number of halves, they are not actual but potential halves. (Physics 263a25-27). …Therefore to the question whether it is possible to pass through an infinite number of units either of time or of distance we must reply that in a sense it is and in a sense it is not. If the units are actual, it is not possible: if they are potential, it is possible. (Physics 263b2-5).
Aristotle denied the existence of the actual infinite both in the physical world and in mathematics, but he accepted potential infinities there. By calling them potential infinities he did not mean they have the potential to become actually infinite; potential infinity is a technical term that suggests a process that has not been completed. The term actual infinite does not imply being actual or real. It implies being complete, with no dependency on some process in time.
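A rough computational analogy, offered only as an illustration and not as Aristotle’s own terms: a potential infinity behaves like a process that can always be asked for one more value, while an actual infinity is the completed totality of all the values at once.

```python
# A rough analogy (an illustration, not Aristotle's terminology):
# a potential infinity is a process that is never finished.

def naturals():
    """Yield 0, 1, 2, ...; at any moment only finitely many have appeared."""
    n = 0
    while True:
        yield n
        n += 1

gen = naturals()
print(next(gen), next(gen), next(gen))  # 0 1 2 -- the process can always continue
# An actual infinity would instead be the completed set {0, 1, 2, ...},
# which no run of this process ever produces.
```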
A potential infinity is an unlimited iteration of some operation—unlimited in time. Aristotle claimed correctly that if Zeno had not used the concepts of actual infinity and of indivisible point, then the paradoxes of motion such as the Achilles Paradox (and the Dichotomy Paradox) could not be created.
Here is why doing so is a way out of these paradoxes. Zeno said that to go from the start to the finish line, the runner Achilles must reach the place that is halfway-there, then after arriving at this place he still must reach the place that is half of that remaining distance, and after arriving there he must again reach the new place that is now halfway to the goal, and so on. These are too many places to reach. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really does not need completing and cannot be completed; the finitely long path from start to finish exists undivided for the runner, and it is the mathematician who is demanding the completion of such a process. Without using that concept of a completed infinity there is no paradox. Aristotle is correct about this being a treatment that avoids paradox.
Aristotle and Zeno disagree about the nature of division of a runner’s path. Aristotle’s complaint can be expressed succinctly this way: Zeno was correct to suppose that at any time a runner’s path can be divided anywhere, but incorrect to suppose the path can be divided everywhere at the same time.
Today’s standard treatment of the Achilles paradox disagrees with Aristotle’s way out of the paradox and says Zeno was correct to use the concept of a completed infinity and correct to imply that the runner must go to an actual infinity of places in a finite time and correct to suppose the runner’s path can be divided everywhere at the same time.
From what Aristotle says, one can read between the lines and infer that he believes there is another reason to reject actual infinities: doing so is the only way out of these paradoxes of motion. Today we know better. There is another way out, namely, the Standard Solution that uses actual infinities, which are analyzable in terms of Cantor’s transfinite sets.
Aristotle’s treatment, disallowing actual infinity while allowing potential infinity, was clever, and it satisfied nearly all scholars for 1,500 years, being buttressed during that time by the Church’s doctrine that only God is actually infinite. George Berkeley, Immanuel Kant, Carl Friedrich Gauss, and Henri Poincaré were influential defenders of potential infinity. Leibniz accepted actual infinitesimals, but other mathematicians and physicists in European universities during these centuries were careful to distinguish between actual and potential infinities and to avoid using actual infinities.
Given 1,500 years of opposition to actual infinities, the burden of proof was on anyone advocating them. Bernard Bolzano and Georg Cantor accepted this burden in the 19th century. The key idea is to see a potentially infinite set as a variable quantity that is dependent on being abstracted from a pre-existing actually infinite set. Bolzano argued that the natural numbers should be conceived of as a set, a determinate set, not one with a variable number of elements. Cantor argued that any potential infinity must be interpreted as varying over a predefined fixed set of possible values, a set that is actually infinite. He put it this way:
In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite. (Cantor 1887)
From this standpoint, Dedekind’s 1872 axiom of continuity and his definition of real numbers as certain infinite subsets of rational numbers suggested to Cantor and then to many other mathematicians that arbitrarily large sets of rational numbers are most naturally seen to be subsets of an actually infinite set of rational numbers. The same can be said for sets of real numbers. An actually infinite set is what we today call a “transfinite set.” Cantor’s idea is then to treat a potentially infinite set as being a sequence of definite subsets of a transfinite set. Aristotle had said mathematicians need only the concept of a finite straight line that may be produced as far as they wish, or divided as finely as they wish, but Cantor would say that this way of thinking presupposes a completed infinite continuum from which that finite line is abstracted at any particular time.
[When Cantor says the mathematical concept of potential infinity presupposes the mathematical concept of actual infinity, this does not imply that, if future time were to be potentially infinite, then future time also would be actually infinite.]
Dedekind’s primary contribution to our topic was to give the first rigorous definition of infinite set—an actual infinity—showing that the notion is useful and not self-contradictory. Cantor provided the missing ingredient—that the mathematical line can fruitfully be treated as a dense linear ordering of uncountably many points, and he went on to develop set theory and to give the continuum a set-theoretic basis which convinced mathematicians that the concept was rigorously defined.
These ideas now form the basis of modern real analysis. The implication for the Achilles and Dichotomy paradoxes is that, once the rigorous definition of a linear continuum is in place, and once we have Cauchy’s rigorous theory of how to assess the value of an infinite series, then we can point to the successful use of calculus in physical science, especially in the treatment of time and of motion through space, and say that the sequence of intervals or paths described by Zeno is most properly treated as a sequence of subsets of an actually infinite set [that is, Aristotle’s potential infinity of places that Achilles reaches are really a variable subset of an already existing actually infinite set of point places], and we can be confident that Aristotle’s treatment of the paradoxes is inferior to the Standard Solution’s.
Zeno said Achilles cannot achieve his goal in a finite time, but there is no record of the details of how he defended this conclusion. He might have said the reason is (i) that there is no last goal in the sequence of sub-goals, or, perhaps (ii) that it would take too long to achieve all the sub-goals, or perhaps (iii) that covering all the sub-paths is too great a distance to run. Zeno might have offered all these defenses. In attacking justification (ii), Aristotle objects that, if Zeno were to confine his notion of infinity to a potential infinity and were to reject the idea of zero-length sub-paths, then Achilles achieves his goal in a finite time, so this is a way out of the paradox. However, an advocate of the Standard Solution says Achilles achieves his goal by covering an actual infinity of paths in a finite time, and this is the way out of the paradox. (Whether Achilles can properly be described as completing an actual infinity of tasks, rather than goals, is considered in Section 5c.) Aristotle’s treatment of the paradoxes is basically criticized for being inconsistent with current standard real analysis, which is based upon Zermelo-Fraenkel set theory and its actually infinite sets. To summarize the errors of Zeno and Aristotle in the Achilles Paradox and in the Dichotomy Paradox: they both made the mistake of thinking that if a runner has to cover an actually infinite number of sub-paths to reach his goal, then he will never reach it. Calculus shows how Achilles can do this and reach his goal in a finite time, and the fruitfulness of the tools of calculus implies that the Standard Solution is a better treatment than Aristotle’s.
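A minimal numeric sketch of the calculus point; the speeds and head start here (Achilles at 10 m/s, the tortoise at 1 m/s, a 100-meter lead) are illustrative assumptions.

```python
# A minimal sketch: Achilles' sub-goal times form a convergent geometric series.

head_start, v_achilles, v_tortoise = 100.0, 10.0, 1.0
ratio = v_tortoise / v_achilles      # each sub-path is one tenth of the previous one

total_time, sub_path = 0.0, head_start
for _ in range(60):                  # partial sums of the infinite series of sub-goal times
    total_time += sub_path / v_achilles
    sub_path *= ratio

print(total_time)                               # ~11.111... seconds: the series' finite limit
print(head_start / (v_achilles - v_tortoise))   # closed form: 100/9 = 11.111... seconds
```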
Let’s turn to the other paradoxes. In proposing his treatment of the Paradox of the Large and Small and of the Paradox of Infinite Divisibility, Aristotle said that
…a line cannot be composed of points, the line being continuous and the point indivisible. (Physics, 231a25)
In modern real analysis, a continuum is composed of points, but Aristotle, ever the advocate of common sense reasoning, claimed that a continuum cannot be composed of points. Aristotle believed a line can be composed only of smaller, indefinitely divisible lines and not of points without magnitude. Similarly a distance cannot be composed of point places and a duration cannot be composed of instants. This is one of Aristotle’s key errors, according to advocates of the Standard Solution, because by maintaining this common sense view he created an obstacle to the fruitful development of real analysis. In addition to complaining about points, Aristotelians object to the idea of an actual infinite number of them.
In his analysis of the Arrow Paradox, Aristotle said Zeno mistakenly assumes time is composed of indivisible moments, but “This is false, for time is not composed of indivisible moments any more than any other magnitude is composed of indivisibles.” (Physics, 239b8-9) Zeno needs those instantaneous moments; that way Zeno can say the arrow does not move during the moment. Aristotle recommends not allowing Zeno to appeal to instantaneous moments and restricting him to saying that motion can be divided only into a potential infinity of intervals. That restriction implies the arrow’s path can be divided only into finitely many intervals at any time. So, at any time, there is a finite interval during which the arrow can exhibit motion by changing location. So the arrow flies, after all. That is, Aristotle declares Zeno’s argument is based on false assumptions without which there is no problem with the arrow’s motion. However, the Standard Solution agrees with Zeno that time can be composed of indivisible moments or instants, and it implies that Aristotle has mis-diagnosed where the error lies in the Arrow Paradox. Advocates of the Standard Solution would add that allowing a duration to be composed of indivisible moments is what is needed for having a fruitful calculus, and Aristotle’s recommendation is an obstacle to the development of calculus.
Aristotle’s treatment of The Paradox of the Moving Rows is basically in agreement with the Standard Solution to that paradox–that Zeno did not appreciate the difference between speed and relative speed.
Regarding the Paradox of the Grain of Millet, Aristotle said that parts need not have all the properties of the whole, and so grains need not make sounds just because bushels of grains do. (Physics, 250a22) And if the parts make no sounds, we should not conclude that the whole can make no sound. It would have been helpful for Aristotle to have said more about what are today called the Fallacies of Division and Composition that Zeno is committing. However, Aristotle’s response to the Grain of Millet is brief but accurate by today’s standards.
In conclusion, are there two adequate but different solutions to Zeno’s paradoxes, Aristotle’s Solution and the Standard Solution? No. Aristotle’s treatment does not stand up to criticism in a manner that most scholars deem adequate. The Standard Solution uses contemporary concepts that have proved to be more valuable for solving and resolving so many other problems in mathematics and physics. Replacing Aristotle’s common sense concepts with the new concepts from real analysis and classical mechanics has been a key ingredient in the successful development of mathematics and science, and for this reason the vast majority of scientists, mathematicians, and philosophers reject Aristotle’s treatment. Nevertheless, there is a significant minority in the philosophical community who do not agree, as we shall see in the sections that follow.
See Wallace (2003) for a deeper treatment of Aristotle and how the development of the concept of infinity led to the standard solution to Zeno’s Paradoxes.
5. Other Issues Involving the Paradoxes
a. Consequences of Accepting the Standard Solution
There is a price to pay for accepting the Standard Solution to Zeno’s Paradoxes. The following–once presumably safe–intuitions or assumptions must be rejected:
1. A continuum is too smooth to be composed of indivisible points.
2. Runners do not have time to go to an actual infinity of places in a finite time.
3. The sum of an infinite series of positive terms is always infinite.
4. For each instant there is a next instant, and for each place along a line there is a next place.
5. A finite distance along a line cannot contain an actually infinite number of points.
6. The more points there are on a line, the longer the line is.
7. It is absurd for there to be numbers that are bigger than every integer.
8. A one-dimensional curve cannot fill a two-dimensional area, nor can an infinitely long curve enclose a finite area.
9. A whole is always greater than any of its parts.
Item (8) was undermined when it was discovered that the continuum implies the existence of fractal curves. However, the loss of intuition (1) has caused the greatest stir because so many philosophers object to a continuum being constructed from points. Aristotle had said, “Nothing continuous can be composed of things having no parts” (Physics VI.3, 234a7-8). The Austrian philosopher Franz Brentano believed with Aristotle that scientific theories should be literal descriptions of reality, as opposed to today’s more popular view that theories are idealizations or approximations of reality. Continuity is something given in perception, said Brentano, and not in a mathematical construction; therefore, mathematics misrepresents. In a 1905 letter to Husserl, he said, “I regard it as absurd to interpret a continuum as a set of points.”
But the Standard Solution needs to be thought of as a package to be evaluated in terms of all of its costs and benefits. From this perspective the Standard Solution’s point-set analysis of continua has withstood the criticism and demonstrated its value in mathematics and mathematical physics. As a consequence, advocates of the Standard Solution say we must live with rejecting the nine intuitions listed above, and accept the counterintuitive implications such as there being divisible continua, infinite sets of different sizes, and space-filling curves. They agree with the philosopher W. V. O. Quine who demands that we be conservative when revising the system of claims that we believe and who recommends “minimum mutilation.” Advocates of the Standard Solution say no less mutilation will work satisfactorily.
b. Criticisms of the Standard Solution
Balking at having to reject so many of our intuitions, Henri-Louis Bergson, Max Black, Franz Brentano, L. E. J. Brouwer, Solomon Feferman, William James, Charles S. Peirce, James Thomson, Alfred North Whitehead, and Hermann Weyl argued in different ways that the standard mathematical account of continuity does not apply to physical processes, or is improper for describing those processes. Here are their main reasons: (1) the actual infinite cannot be encountered in experience and thus is unreal, (2) human intelligence is not capable of understanding motion, (3) the sequence of tasks that Achilles performs is finite and the illusion that it is infinite is due to mathematicians who confuse their mathematical representations with what is represented, (4) motion is unitary or “smooth” even though its spatial trajectory is infinitely divisible, (5) treating time as being made of instants is to treat time as static rather than as the dynamic aspect of consciousness that it truly is, (6) actual infinities and the contemporary continuum are not indispensable to solving the paradoxes, and (7) the Standard Solution’s implicit assumption of the primacy of the coherence of the sciences is unjustified because what is really primary is coherence with a priori knowledge and common sense.
See Salmon (1970, Introduction) and Feferman (1998) for a discussion of the controversy about the quality of Zeno’s arguments, and an introduction to its vast literature. This controversy is much less actively pursued in today’s mathematical literature, and hardly at all in today’s scientific literature. A minority of philosophers are actively involved in attempting to retain one or more of the nine intuitions listed in the previous section. An important philosophical issue is whether the paradoxes should be solved by the Standard Solution or instead by assuming that a line is not composed of points but of intervals, and whether use of infinitesimals is essential to a proper understanding of the paradoxes. For an example of how to solve Zeno’s Paradoxes without using the continuum and with retaining Democritus’ intuition that there is a lower limit to the divisibility of space, see “Atoms of Space” in Rovelli’s theory of loop quantum gravity (Rovelli 2017, pp. 169-171).
c. Supertasks and Infinity Machines
In Zeno’s Achilles Paradox, Achilles does not cover an infinite distance, but he does cover an infinite number of distances. In doing so, does he need to complete an infinite sequence of tasks or actions? In other words, assuming Achilles does complete the task of reaching the tortoise, does he thereby complete a supertask, a transfinite number of tasks in a finite time?
Bertrand Russell said “yes.” He argued that it is possible to perform a task in one-half minute, then perform another task in the next quarter-minute, and so on, for a full minute. At the end of the minute, an infinite number of tasks would have been performed. In fact, Achilles does this in catching the tortoise, Russell said. In the mid-twentieth century, Hermann Weyl, Max Black, James Thomson, and others objected, and thus began an ongoing controversy about the number of tasks that can be completed in a finite time.
That controversy has sparked a related discussion about whether there could be a machine that can perform an infinite number of tasks in a finite time. A machine that can is called an infinity machine. In 1954, in an effort to undermine Russell’s argument, the philosopher James Thomson described a lamp that is intended to be a typical infinity machine. Let the machine switch the lamp on for a half-minute; then switch it off for a quarter-minute; then on for an eighth-minute; off for a sixteenth-minute; and so on. Would the lamp be lit or dark at the end of minute? Thomson argued that it must be one or the other, but it cannot be either because every period in which it is off is followed by a period in which it is on, and vice versa, so there can be no such lamp, and the specific mistake in the reasoning was to suppose that it is logically possible to perform a supertask. The implication for Zeno’s paradoxes is that Thomson is denying Russell’s description of Achilles’ task as a supertask, as being the completion of an infinite number of sub-tasks in a finite time.
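The switching schedule can be written down directly. In the minimal sketch below (the on/off encoding and the function name are assumptions for illustration), the description fixes the lamp’s state at every time before the minute is up, yet fixes nothing at the minute itself.

```python
# A minimal sketch of Thomson's lamp; time is measured in minutes.
# ON during [0, 1/2), OFF during [1/2, 3/4), ON during [3/4, 7/8), ...

def lamp_state(t):
    """Return the state at time t for 0 <= t < 1; t = 1 is not covered."""
    if not 0 <= t < 1:
        raise ValueError("the switching description fixes no state at t >= 1")
    on, start, length = True, 0.0, 0.5
    while True:
        if t < start + length:    # t falls inside the current switching period
            return "on" if on else "off"
        start += length           # move on to the next, half-as-long period
        length /= 2
        on = not on

print(lamp_state(0.7))   # 'off': 0.7 lies in [1/2, 3/4)
print(lamp_state(0.99))  # determined, since every t < 1 falls in some finite period
# lamp_state(1.0) raises an error: the infinite sequence of periods, by itself,
# settles nothing about the state at the limit.
```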
Paul Benacerraf (1962) complains that Thomson’s reasoning is faulty because it fails to notice that the initial description of the lamp determines the state of the lamp at each period in the sequence of switching, but it determines nothing about the state of the lamp at the limit of the sequence, namely at the end of the minute. The lamp could be either on or off at the limit. The limit of the infinite converging sequence is not in the sequence. So, Thomson has not established the logical impossibility of completing this supertask, but only that the setup’s description is not as complete as he had hoped.
Could some other argument establish this impossibility? Benacerraf suggests that an answer depends on what we ordinarily mean by the term “completing a task.” If the meaning does not require that tasks have minimum times for their completion, then maybe Russell is right that some supertasks can be completed, he says; but if a minimum time is always required, then Russell is mistaken because an infinite time would be required. What is needed is a better account of the meaning of the term “task.” Grünbaum objects to Benacerraf’s reliance on ordinary meaning. “We need to heed the commitments of ordinary language,” says Grünbaum, “only to the extent of guarding against being victimized or stultified by them.”
The Thomson Lamp Argument has generated a great literature in philosophy. Here are some of the issues. What is the proper definition of “task”? For example, does it require a minimum amount of time in the physicists’ technical sense of that term? Even if it is physically impossible to flip the switch in Thomson’s lamp because the speed of flipping the toggle switch will exceed the speed of light, suppose physics were different and there were no limit on speed; what then? Is the lamp logically impossible or physically impossible? Is the lamp metaphysically impossible? Was it proper of Thomson to suppose that the question of whether the lamp is lit or dark at the end of the minute must have a determinate answer? Does Thomson’s question have no answer, given the initial description of the situation, or does it have an answer which we are unable to compute? Should we conclude that it makes no sense to divide a finite task into an infinite number of ever shorter sub-tasks? Is there an important difference between completing a countable infinity of tasks and completing an uncountable infinity of tasks? Interesting issues arise when we bring in Einstein’s theory of relativity and consider a bifurcated supertask. This is an infinite sequence of tasks in a finite interval of an external observer’s proper time, but not in the machine’s own proper time. See Earman and Norton (1996) for an introduction to the extensive literature on these topics. Unfortunately, there is no agreement in the philosophical community on most of the questions we’ve just entertained.
d. Constructivism
The spirit of Aristotle’s opposition to actual infinities persists today in the philosophy of mathematics called constructivism. Constructivism is not a precisely defined position, but it implies that acceptable mathematical objects and procedures have to be founded on constructions and not, say, on assuming the object does not exist, then deducing a contradiction from that assumption. Most constructivists believe acceptable constructions must be performable ideally by humans independently of practical limitations of time or money. So they would say potential infinities, recursive functions, mathematical induction, and Cantor’s diagonal argument are constructive, but the following are not: the axiom of choice, the law of excluded middle, the law of double negation, completed infinities, and the classical continuum of the Standard Solution. The implication is that Zeno’s Paradoxes were not solved correctly by using the methods of the Standard Solution. More conservative constructivists, the finitists, would go even further and reject potential infinities because of the human being’s finite computational resources, but this conservative sub-group of constructivists is very much out of favor.
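A standard textbook illustration, not drawn from this article, of the kind of proof constructivists reject: to show there are irrational numbers $a$ and $b$ with $a^b$ rational, argue by the law of excluded middle on whether $\sqrt{2}^{\sqrt{2}}$ is rational.

$$\text{If } \sqrt{2}^{\sqrt{2}} \in \mathbb{Q}, \text{ take } a = b = \sqrt{2}; \text{ otherwise take } a = \sqrt{2}^{\sqrt{2}},\ b = \sqrt{2}, \text{ so that } a^b = \sqrt{2}^{2} = 2.$$

Either way such a pair exists, yet the proof never determines which pair witnesses the claim, and that is precisely the kind of non-constructive existence claim constructivists disallow.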
L. E. J. Brouwer’s intuitionism was the leading constructivist theory of the early 20th century. In response to suspicions raised by the discovery of Russell’s Paradox and the introduction into set theory of the controversial non-constructive axiom of choice, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from, and thus grounded in, an ideal mathematician’s vivid temporal intuitions, which are a priori intuitions of time.
Brouwer’s intuitionistic continuum has the Aristotelian property of unsplitability. What this means is that, unlike the Standard Solution’s set-theoretic composition of the continuum which allows, say, the closed interval of real numbers from zero to one to be split or cut into (that is, be the union of sets of) those numbers in the interval that are less than one-half and those numbers in the interval that are greater than or equal to one-half, the corresponding closed interval of the intuitionistic continuum cannot be split this way into two disjoint sets. This unsplitability or inseparability agrees in spirit with Aristotle’s idea of the continuity of a real continuum, but disagrees in spirit with Aristotle’s idea of not allowing the continuum to be composed of points. [For more on this topic, see Posy (2005) pp. 346-7.]
Although everyone agrees that any legitimate mathematical proof must use only a finite number of steps and be constructive in that sense, the majority of mathematicians in the first half of the twentieth century claimed that constructive mathematics could not produce an adequate theory of the continuum because essential theorems would no longer be theorems, and constructivist principles and procedures are too awkward to use successfully. In 1927, David Hilbert exemplified this attitude when he objected that Brouwer’s restrictions on allowable mathematics—such as rejecting proof by contradiction—were like taking the telescope away from the astronomer.
But thanks in large part to the later development of constructive mathematics by Errett Bishop and Douglas Bridges in the second half of the 20th century, most contemporary philosophers of mathematics believe the question of whether constructivism could be successful in the sense of producing an adequate theory of the continuum is still open [see Wolf (2005) p. 346, and McCarty (2005) p. 382], and to that extent so is the question of whether the Standard Solution to Zeno’s Paradoxes needs to be rejected or perhaps revised to embrace constructivism. Frank Arntzenius (2000), Michael Dummett (2000), and Solomon Feferman (1998) have done important philosophical work to promote the constructivist tradition. Nevertheless, the vast majority of today’s practicing mathematicians routinely use nonconstructive mathematics.
e. Nonstandard Analysis
Although Zeno and Aristotle had the concept of small, they did not have the concept of infinitesimally small, which is the informal concept that was used by Leibniz (and Newton) in the development of calculus. In the 19th century, infinitesimals were eliminated from the standard development of calculus due to the work of Cauchy and Weierstrass on defining a derivative in terms of limits using the epsilon-delta method. But in 1881, C. S. Peirce advocated restoring infinitesimals because of their intuitive appeal. Unfortunately, he was unable to work out the details, as were all mathematicians—until 1960 when Abraham Robinson produced his nonstandard analysis. At this point in time it was no longer reasonable to say that banishing infinitesimals from analysis was an intellectual advance. What Robinson did was to extend the standard real numbers to include infinitesimals, using this definition: h is infinitesimal if and only if its absolute value is less than 1/n, for every positive standard number n. Robinson went on to create a nonstandard model of analysis using hyperreal numbers. The class of hyperreal numbers contains counterparts of the reals, but in addition it contains any number that is the sum, or difference, of both a standard real number and an infinitesimal number, such as 3 + h and 3 – 4h². The reciprocal of an infinitesimal is an infinite hyperreal number. These hyperreals obey the usual rules of real numbers except for the Archimedean axiom. Infinitesimal distances between distinct points are allowed, unlike with standard real analysis. The derivative is defined in terms of the ratio of infinitesimals, in the style of Leibniz, rather than in terms of a limit as in standard real analysis in the style of Weierstrass.
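As a minimal illustration of this Leibniz-style definition (the choice of function is an assumption): for $f(x) = x^2$ and a non-zero infinitesimal $h$,

$$\frac{f(x+h) - f(x)}{h} = \frac{2xh + h^2}{h} = 2x + h,$$

and the derivative is the standard part of this hyperreal ratio, $f'(x) = \operatorname{st}(2x + h) = 2x$; discarding the infinitesimal remainder plays the role that Weierstrass’s limit plays in standard analysis.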
Nonstandard analysis is called “nonstandard” because it was inspired by Thoralf Skolem’s demonstration in 1933 of the existence of models of first-order arithmetic that are not isomorphic to the standard model of arithmetic. What makes them nonstandard is especially that they contain infinitely large (hyper)integers. For nonstandard calculus one needs nonstandard models of real analysis rather than just of arithmetic. An important feature demonstrating the usefulness of nonstandard analysis is that it achieves essentially the same theorems as those in classical calculus. The treatment of Zeno’s paradoxes is interesting from this perspective. See McLaughlin (1994) for how Zeno’s paradoxes may be treated using infinitesimals. McLaughlin believes this approach to the paradoxes is the only successful one, but commentators generally do not agree with that conclusion, and consider it merely to be an alternative solution. See Dainton (2010) pp. 306-9 for some discussion of this.
f. Smooth Infinitesimal Analysis
Abraham Robinson in the 1960s resurrected the infinitesimal as an infinitesimal number, but F. W. Lawvere in the 1970s resurrected the infinitesimal as an infinitesimal magnitude. His work is called “smooth infinitesimal analysis” and is part of “synthetic differential geometry.” In smooth infinitesimal analysis, a curved line is composed of infinitesimal tangent vectors. One significant difference from a nonstandard analysis, such as Robinson’s above, is that all smooth curves are straight over infinitesimal distances, whereas Robinson’s can curve over infinitesimal distances. In smooth infinitesimal analysis, Zeno’s arrow does not have time to change its speed during an infinitesimal interval. Smooth infinitesimal analysis retains the intuition that a continuum should be smoother than the continuum of the Standard Solution. Unlike both standard analysis and nonstandard analysis whose real number systems are set-theoretical entities and are based on classical logic, the real number system of smooth infinitesimal analysis is not a set-theoretic entity but rather an object in a topos of category theory, and its logic is intuitionist (Harrison, 1996, p. 283). Like Robinson’s nonstandard analysis, Lawvere’s smooth infinitesimal analysis may also be a promising approach to a foundation for real analysis and thus to solving Zeno’s paradoxes, but there is no consensus that Zeno’s Paradoxes need to be solved this way. For more discussion see note 11 in Dainton (2010) pp. 420-1.
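The claim that smooth curves are straight over infinitesimal distances can be stated precisely. In smooth infinitesimal analysis the nilsquare infinitesimals, those $\varepsilon$ with $\varepsilon^2 = 0$, satisfy, for every function $f$,

$$f(x + \varepsilon) = f(x) + \varepsilon f'(x),$$

exactly rather than approximately (the Kock-Lawvere axiom), which is why Zeno’s arrow has no room to change its speed during an infinitesimal interval.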
6. The Legacy and Current Significance of the Paradoxes
What influence has Zeno had? He had none in the East, but in the West there has been continued influence and interest up to today.
Let’s begin with his influence on the ancient Greeks. Before Zeno, philosophers expressed their philosophy in poetry, and he was the first philosopher to use prose arguments. This new method of presentation was destined to shape almost all later philosophy, mathematics, and science. Zeno drew new attention to the idea that the way the world appears to us is not how it is in reality. Zeno probably also influenced the Greek atomists to accept atoms. Aristotle was influenced by Zeno to use the distinction between actual and potential infinity as a way out of the paradoxes, and careful attention to this distinction has influenced mathematicians ever since. The proofs in Euclid’s Elements, for example, used only potentially infinite procedures. Awareness of Zeno’s paradoxes made Greek and all later Western intellectuals more aware that mistakes can be made when thinking about infinity, continuity, and the structure of space and time, and it made them wary of any claim that a continuous magnitude could be made of discrete parts. “Zeno’s arguments, in some form, have afforded grounds for almost all theories of space and time and infinity which have been constructed from his time to our own,” said Bertrand Russell in the twentieth century.
There is controversy in 20th and 21st century literature about whether Zeno developed any specific, new mathematical techniques. Most scholars say the conscious use of the method of indirect argumentation arose in both mathematics and Zeno’s philosophy independently of each other. See Hintikka (1978) for a discussion of this controversy about origins. Everyone agrees the method was Greek and not Babylonian, as was the method of proving something by deducing it from explicitly stated assumptions. G. E. L. Owen (Owen 1958, p. 222) argued that Zeno influenced Aristotle’s concept of there being no motion at an instant, which implies there is no instant when a body begins to move, nor an instant when a body changes its speed. Consequently, says Owen, Aristotle’s conception is an obstacle to a Newton-style concept of acceleration, and this hindrance is “Zeno’s major influence on the mathematics of science.” Other commentators consider Owen’s remark slightly harsh regarding Zeno; they ask whether, if Zeno had not been born, Aristotle would have been likely to develop any other concept of motion.
Zeno’s paradoxes have received some explicit attention from scholars throughout later centuries. Pierre Gassendi in the early 17th century mentioned Zeno’s paradoxes as the reason to claim that the world’s atoms must not be infinitely divisible. Pierre Bayle’s 1696 article on Zeno drew the skeptical conclusion that, for the reasons given by Zeno, the concept of space is contradictory. In the early 19th century, Hegel suggested that Zeno’s paradoxes supported his view that reality is inherently contradictory.
Zeno’s paradoxes caused mistrust of infinities, and this mistrust has influenced the contemporary movements of constructivism, finitism, and nonstandard analysis, all of which affect the treatment of Zeno’s paradoxes. Dialetheism, the acceptance of true contradictions via a paraconsistent formal logic, provides a newer, although unpopular, response to Zeno’s paradoxes, but dialetheism was not created specifically in response to worries about Zeno’s paradoxes. With the introduction in the 20th century of thought experiments about supertasks, interesting philosophical research has been directed towards understanding what it means to complete a task.
Zeno’s paradoxes are often pointed to as a case study in how a philosophical problem has been solved, even though the solution took over two thousand years to materialize.
So, Zeno’s paradoxes have had a wide variety of impacts upon subsequent research. Little research today is involved directly in how to solve the paradoxes themselves, especially in the fields of mathematics and science, although discussion continues in philosophy, primarily on whether a continuous magnitude should be composed of discrete magnitudes, such as whether a line should be composed of points. If there are alternative treatments of Zeno’s paradoxes, then this raises the issue of whether there is a single solution to the paradoxes or several solutions or one best solution. The answer to whether the Standard Solution is the correct solution to Zeno’s paradoxes may also depend on whether the best physics of the future that reconciles the theories of quantum mechanics and general relativity will require us to assume spacetime is composed at its most basic level of points, or, instead, of regions or loops or something else.
From the perspective of the Standard Solution, the most significant lesson learned by researchers who have tried to solve Zeno’s paradoxes is that the way out requires revising many of our old theories and their concepts. We have to be willing to rank the virtues of preserving logical consistency and promoting scientific fruitfulness above the virtue of preserving our intuitions. Zeno played a significant role in causing this progressive trend.
7. References and Further Reading
Arntzenius, Frank. (2000) “Are there Really Instantaneous Velocities?”, The Monist 83, pp. 187-208.
Examines the possibility that a duration does not consist of points, that every part of time has a non-zero size, that real numbers cannot be used as coordinates of times, and that there are no instantaneous velocities at a point.
Barnes, J. (1982). The Presocratic Philosophers, Routledge & Kegan Paul: Boston.
A well respected survey of the philosophical contributions of the Pre-Socratics.
Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless, Pantheon Books, New York.
A popular book in science and mathematics introducing Zeno’s Paradoxes and other paradoxes regarding infinity.
Benacerraf, Paul (1962). “Tasks, Super-Tasks, and the Modern Eleatics,” The Journal of Philosophy, 59, pp. 765-784.
An original analysis of Thomson’s Lamp and supertasks.
Bergson, Henri (1946). Creative Mind, translated by M. L. Andison. Philosophical Library: New York.
Bergson demands the primacy of intuition in place of the objects of mathematical physics.
Black, Max (1950-1951). “Achilles and the Tortoise,” Analysis 11, pp. 91-101.
A challenge to the Standard Solution to Zeno’s paradoxes. Black argues that Achilles did not need to complete an infinite number of sub-tasks in order to catch the tortoise.
Cajori, Florian (1920). “The Purpose of Zeno’s Arguments on Motion,” Isis, vol. 3, no. 1, pp. 7-20.
An analysis of the debate regarding the point Zeno is making with his paradoxes of motion.
Cantor, Georg (1887). “Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen.” Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
A very early description of set theory and its relationship to old ideas about infinity.
Chihara, Charles S. (1965). “On the Possibility of Completing an Infinite Process,” Philosophical Review 74, no. 1, pp. 74-87.
An analysis of what we mean by “task.”
Copleston, Frederick, S.J. (1962). “The Dialectic of Zeno,” chapter 7 of A History of Philosophy, Volume I, Greece and Rome, Part I, Image Books: Garden City.
Copleston says Zeno’s goal is to challenge the Pythagoreans who denied empty space and accepted pluralism.
Dainton, Barry. (2010). Time and Space, Second Edition, McGill-Queen’s University Press: Ithaca.
Chapters 16 and 17 discuss Zeno’s Paradoxes.
Dauben, J. (1990). Georg Cantor, Princeton University Press: Princeton.
Contains Kronecker’s threat to write an article showing that Cantor’s set theory has “no real significance.” Ludwig Wittgenstein was another vocal opponent of set theory.
De Boer, Jesse (1953). “A Critique of Continuity, Infinity, and Allied Concepts in the Natural Philosophy of Bergson and Russell,” in Return to Reason: Essays in Realistic Philosophy, John Wild, ed., Henry Regnery Company: Chicago, pp. 92-124.
A philosophical defense of Aristotle’s treatment of Zeno’s paradoxes.
Diels, Hermann and W. Kranz (1951). Die Fragmente der Vorsokratiker, sixth ed., Weidmannsche Buchhandlung: Berlin.
A standard edition of the pre-Socratic texts.
Dummett, Michael (2000). “Is Time a Continuum of Instants?,” Philosophy, 2000, Cambridge University Press: Cambridge, pp. 497-515.
Promoting a constructive foundation for mathematics, Dummett’s formalism implies there are no durationless instants, so times must have rational values rather than real values. Times have only the values that they can in principle be measured to have; and all measurements produce rational numbers within a margin of error.
Earman J. and J. D. Norton (1996). “Infinite Pains: The Trouble with Supertasks,” in Paul Benacerraf: the Philosopher and His Critics, A. Morton and S. Stich (eds.), Blackwell: Cambridge, MA, pp. 231-261.
A criticism of Thomson’s interpretation of his infinity machines and the supertasks involved, plus an introduction to the literature on the topic.
Feferman, Solomon (1998). In the Light of Logic, Oxford University Press, New York.
A discussion of the foundations of mathematics and an argument for semi-constructivism in the tradition of Kronecker and Weyl, that the mathematics used in physical science needs only the lowest level of infinity, the infinity that characterizes the whole numbers. Presupposes considerable knowledge of mathematical logic.
Freeman, Kathleen (1948). Ancilla to the Pre-Socratic Philosophers, Harvard University Press: Cambridge, MA. Reprinted in paperback in 1983.
One of the best sources in English of primary material on the Pre-Socratics.
Grünbaum, Adolf (1967). Modern Science and Zeno’s Paradoxes, Wesleyan University Press: Middletown, Connecticut.
A detailed defense of the Standard Solution to the paradoxes.
Grünbaum, Adolf (1970). “Modern Science and Zeno’s Paradoxes of Motion,” in (Salmon, 1970), pp. 200-250.
An analysis of arguments by Thomson, Chihara, Benacerraf and others regarding the Thomson Lamp and other infinity machines.
Hamilton, Edith and Huntington Cairns (1961). The Collected Dialogues of Plato Including the Letters, Princeton University Press: Princeton.
Harrison, Craig (1996). “The Three Arrows of Zeno: Cantorian and Non-Cantorian Concepts of the Continuum and of Motion,” Synthese, Volume 107, Number 2, pp. 271-292.
Considers smooth infinitesimal analysis as an alternative to the classical Cantorian real analysis of the Standard Solution.
Heath, T. L. (1921). A History of Greek Mathematics, Vol. I, Clarendon Press: Oxford. Reprinted 1981.
Promotes the minority viewpoint that Zeno had a direct influence on Greek mathematics, for example by eliminating the use of infinitesimals.
Hintikka, Jaakko, David Gruender and Evandro Agazzi, eds. (1978). Theory Change, Ancient Axiomatics, and Galileo’s Methodology, D. Reidel Publishing Company: Dordrecht.
A collection of articles that discuss, among other issues, whether Zeno’s methods influenced the mathematicians of the time or whether the influence went in the other direction. See especially the articles by Karel Berka and Wilbur Knorr.
Kirk, G. S., J. E. Raven, and M. Schofield, eds. (1983). The Presocratic Philosophers: A Critical History with a Selection of Texts, Second Edition, Cambridge University Press: Cambridge.
A good source in English of primary material on the Pre-Socratics with detailed commentary on the controversies about how to interpret various passages.
Maddy, Penelope (1992). “Indispensability and Practice,” Journal of Philosophy 89, pp. 275-289.
Explores the implication of arguing that theories of mathematics are indispensable to good science, and that we are justified in believing in the mathematical entities used in those theories.
Matson, Wallace I (2001). “Zeno Moves!” pp. 87-108 in Essays in Ancient Greek Philosophy VI: Before Plato, ed. by Anthony Preus, State University of New York Press: Albany.
Matson supports Tannery’s non-classical and minority interpretation that Zeno’s purpose was to show only that the opponents of Parmenides are committed to absurdly denying motion, and that Zeno himself never denied motion, nor did Parmenides.
McCarty, D.C. (2005). “Intuitionism in Mathematics,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 356-86.
Argues that a declaration of death of the program of founding mathematics on an intuitionistic basis is premature.
McLaughlin, William I. (1994). “Resolving Zeno’s Paradoxes,” Scientific American, vol. 271, no. 5, Nov., pp. 84-90.
How Zeno’s paradoxes may be explained using a contemporary theory of Leibniz’s infinitesimals.
Owen, G.E.L. (1958). “Zeno and the Mathematicians,” Proceedings of the Aristotelian Society, New Series, vol. LVIII, pp. 199-222.
Argues that Zeno and Aristotle negatively influenced the development of the Renaissance concept of acceleration that was used so fruitfully in calculus.
Posy, Carl. (2005). “Intuitionism and Philosophy,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 318-54.
Contains a discussion of how the unsplitability of Brouwer’s intuitionistic continuum makes precise Aristotle’s notion that “you can’t cut a continuous medium without some of it clinging to the knife,” on pages 345-7.
Proclus (1987). Proclus’ Commentary on Plato’s Parmenides, translated by Glenn R. Morrow and John M. Dillon, Princeton University Press: Princeton.
A detailed list of every comment made by Proclus about Zeno is available with discussion starting on p. xxxix of the Introduction by John M. Dillon. Dillon focuses on Proclus’ comments which are not clearly derivable from Plato’s Parmenides, and concludes that Proclus had access to other sources for Zeno’s comments, most probably Zeno’s original book or some derivative of it. William Moerbeke’s overly literal translation in 1285 from Greek to Latin of Proclus’ earlier, but now lost, translation of Plato’s Parmenides is the key to figuring out the original Greek. (see p. xliv)
Rescher, Nicholas (2001). Paradoxes: Their Roots, Range, and Resolution, Carus Publishing Company: Chicago.
Pages 94-102 apply the Standard Solution to all of Zeno’s paradoxes. Rescher calls the Paradox of Alike and Unlike the “Paradox of Differentiation.”
Rovelli, Carlo (2017). Reality is Not What It Seems: The Journey to Quantum Gravity, Riverhead Books: New York.
Rovelli’s chapter 6 explains how the theory of loop quantum gravity provides a new solution to Zeno’s Paradoxes that is more in tune with the intuitions of Democritus because it rejects the assumption that a bit of space can always be subdivided.
Russell, Bertrand (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy, Open Court Publishing Co.: Chicago.
Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes.
Salmon, Wesley C., ed. (1970). Zeno’s Paradoxes, The Bobbs-Merrill Company, Inc.: Indianapolis and New York. Reprinted in paperback in 2001.
A collection of the most influential articles about Zeno’s Paradoxes from 1911 to 1965. Salmon provides an excellent annotated bibliography of further readings.
Szabo, Arpad (1978). The Beginnings of Greek Mathematics, D. Reidel Publishing Co.: Dordrecht.
Contains the argument that Parmenides discovered the method of indirect proof by using it against Anaximenes’ cosmogony, although it was better developed in prose by Zeno. Also argues that Greek mathematicians did not originate the idea but learned of it from Parmenides and Zeno (pp. 244-250). These arguments are challenged in Hintikka (1978).
Tannery, Paul (1885). “Le Concept Scientifique du continu: Zénon d’Élée et Georg Cantor,” pp. 385-410 of Revue Philosophique de la France et de l’Étranger, vol. 20, Les Presses Universitaires de France: Paris.
This mathematician gives the first argument that Zeno’s purpose was not to deny motion but rather to show only that the opponents of Parmenides are committed to denying motion.
Tannery, Paul (1887). Pour l’Histoire de la Science Hellène: de Thalès à Empédocle, Alcan: Paris. 2nd ed. 1930.
More development of the challenge to the classical interpretation of what Zeno’s purposes were in creating his paradoxes.
Thomson, James (1954-1955). “Tasks and Super-Tasks,” Analysis, XV, pp. 1-13.
A criticism of supertasks. The Thomson Lamp thought-experiment is used to challenge Russell’s characterization of Achilles as being able to complete an infinite number of tasks in a finite time.
Tiles, Mary (1989). The Philosophy of Set Theory: An Introduction to Cantor’s Paradise, Basil Blackwell: Oxford.
A philosophically oriented introduction to the foundations of real analysis and its impact on Zeno’s paradoxes.
Vlastos, Gregory (1967). “Zeno of Elea,” in The Encyclopedia of Philosophy, Paul Edwards (ed.), The Macmillan Company and The Free Press: New York.
A clear, detailed presentation of the paradoxes. Vlastos comments that Aristotle does not consider any other treatment of Zeno’s paradoxes than by recommending replacing Zeno’s actual infinities with potential infinities, so we are entitled to assert that Aristotle probably believed denying actual infinities is the only route to a coherent treatment of infinity. Vlastos also comments that “there is nothing in our sources that states or implies that any development in Greek mathematics (as distinct from philosophical opinions about mathematics) was due to Zeno’s influence.”
Wallace, David Foster. (2003). A Compact History of ∞, W. W. Norton and Company: New York.
A clear and sophisticated treatment of how a deeper understanding of infinity led to the solution to Zeno’s Paradoxes. Highly recommended.
White, M. J. (1992). The Continuous and the Discrete: Ancient Physical Theories from a Contemporary Perspective, Clarendon Press: Oxford.
A presentation of various attempts to defend finitism, neo-Aristotelian potential infinities, and the replacement of the infinite real number field with a finite field.
Wisdom, J. O. (1953). “Berkeley’s Criticism of the Infinitesimal,” The British Journal for the Philosophy of Science, Vol. 4, No. 13, pp. 22-25.
Wisdom clarifies the issue behind George Berkeley’s criticism (in 1734 in The Analyst) of the use of the infinitesimal (fluxion) by Newton and Leibniz. See also the references there to Wisdom’s other three articles on this topic in the journal Hermathena in 1939, 1941 and 1942.
Wolf, Robert S. (2005). A Tour Through Mathematical Logic, The Mathematical Association of America: Washington, DC.
Chapter 7 surveys nonstandard analysis, and Chapter 8 surveys constructive mathematics, including the contributions by Errett Bishop and Douglas Bridges.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
José Ortega y Gasset (1883—1955)
In the roughly 6,000 pages that Spanish philosopher José Ortega y Gasset wrote on the humanities, he covered a wide variety of topics. This captures the kind of thinker he was: one who cannot be strictly categorized into any one school of philosophy. José Ortega y Gasset did not want to constrain himself to any one area of study in his unending dialogue to better understand what was of central importance to him: what it means to be human. He wrote on philosophy, history, literary criticism, sociology, travel writing, the philosophy of life, phenomenology, society, politics, the press, and the novel, to name some of the varied topics he explored. He held various identities: he was a philosopher, educator, essayist, theorist, critic, editor, and politician. He did not strive to be a “professional philosopher”; rather, he aimed to be a ‘philosophical essayist.’ While there were many reasons for this, one of central importance was his hope that with shorter texts, he could reach more people. He wanted to have this dialogue not only with influential thinkers from the past, but with his readers as well. Ortega was not only one of the most important philosophers of the twentieth century from continental Europe, but he also had an important impact on Latin American philosophy, most especially in introducing existentialism and perspectivism.
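1. Biography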
José Ortega y Gasset was born in 1883 in Madrid and died there in 1955 after spending many years of his life in various other countries. Throughout his life, Ortega was involved in the newspaper industry. From an early age he was exposed to what it took to run and write for a newspaper, which arguably had an important impact on his writing style. His grandfather was the founder of what was for a time a renowned daily paper, El Imparcial, for which Ortega wrote his first article in 1904—the same year he received his Doctorate in Philosophy from the Central University of Madrid. His dissertation was titled The Terrors of the Year One Thousand, and in it we see an early interest in a topic that he would explore profoundly: the philosophy of history. While he was finishing his dissertation, he also met Rosa Spottorno, whom he would later marry and with whom he had three children.
Ortega spent time abroad in Germany, France, Argentina, and Portugal. Some of those years were spent in exile, as he was a staunch critic of Spanish politics across the spectrum. Though he wavered at times as to which political philosophy he was most vocal about, he was quite vociferous against both communism and fascism. Thus, it is not always clear what Ortega’s political views were, and his writings were also at times misappropriated by some important politicians of the time. For example, José Antonio Primo de Rivera, the son of the military dictator Miguel Primo de Rivera and founder of the Falange, the Spanish Fascist party, arguably greatly misappropriated Ortega’s political philosophy to best suit his own needs. For a time, Ortega supported Primo de Rivera, but he came to be vehemently opposed to any kind of one-man rule. He was also initially supportive of the Falange and of General Francisco Franco, but eventually became deeply disillusioned with them as well. However, during the Spanish Civil War of 1936-1939, he remained quite silent, probably as an ultimate expression of his dissatisfaction with the aims of both sides. Still, given the ambiguities in his writings, they were misappropriated by both ends of the political spectrum in Spain. This confusion can also be seen in comments such as his ‘socialist leanings for the love of aristocracy.’ Ortega retired from politics in 1933, as he was ultimately more interested in bringing about social and cultural change through education. After 1936, the great majority of his writing was of a philosophical nature.
Being too silent on certain issues, such as the Spanish Civil War and Hitler and the Nazis, also brought him some controversy. He did have some bouts of depression, which may have coincided at times with this lack of commentary on the Second World War. At times he longed to be in a place that offered some sort of neutrality, which is part of what drew him to Argentina. From 1936 until his death in 1955, he suffered from poor health and his productivity declined dramatically. Still, his lack of outspoken criticism of Nazism has not been fully explained. War was one of the few central topics of his day that he wrote little on, presumably because he held the position that words cannot compete with weapons in a time of war.
He also had periods in which he leaned toward socialism. But, essentially, none of the traditional or dominant political views of the time would suffice; ultimately, he promoted his own version of a meritocracy, given his dissatisfaction with democracy, capitalism, bolshevism, and fascism, and his revulsion at the type of mass-person that had developed during his lifetime. Despite much confusion regarding his political views, they can perhaps best be summarized as the promotion of a cultured minority from which economic and intellectual benefits trickle down to the rest of society. Possibly the best classification would be a politics of the select individual: a meritocracy-based version of liberalism.
In 1905 he began the first of three periods of study in Germany, an eight-month stay at the University of Leipzig. In 1907 he returned to Germany with a stipend and began his studies at the University of Berlin. Six months later he went to the University of Marburg, and this experience was particularly influential. He was initially quite drawn to the Neo-Kantianism prevalent there through his studies under Hermann Cohen and Paul Natorp. This influence of idealism is quite prevalent in his first book, Meditations on Quixote, published in 1914 (though he would later critique idealism strongly). It was also during this time that he discovered Husserl’s phenomenology and its distinct concept of consciousness, which would have an important impact on his philosophical perspective, both as an influence and as a target of critique.
In 1910 he returned to the University of Madrid as a professor of metaphysics, and in that same year he married Rosa Spottorno. This new position was interrupted by a third trip to Germany in 1911, which served both as a honeymoon and as an opportunity to continue his studies in Marburg. Ortega and Spottorno’s first son, whom they named Miguel Alemán, was born during this last extended period abroad in Germany. Miguel’s second name translates as “German,” which shows Ortega’s great interest in the nation, as it would come to serve as a model state for him in many ways. Ortega was firmly focused on his goal of modernizing Spain, which he saw as lagging greatly behind many other European nations.
From 1932 until the beginning of the Spanish Civil War in 1936, he made a shift away from idealism to emphasize an “I” that is inextricably immersed in circumstances. He developed a new focus for his philosophy: that of historical reason, a position greatly influenced by one of his most admired thinkers, Wilhelm Dilthey. His objective was to develop a philosophy that was neither idealist nor realist.
The rise of the Spanish dictator Francisco Franco at the end of the Civil War in 1939 was the main reason for his voluntary exile in Argentina and Portugal until 1945. His return to Spain thereafter was not a peaceful one. He had made political enemies across the spectrum, and, as a result, he struggled to write and teach freely. He decided to continue to travel and lecture elsewhere. During these later years he also received two honorary Doctorates: one from the University of Marburg, and another from the University of Glasgow. Ortega suffered from poor health, especially in the last couple of decades of his life, which prevented him from traveling more extensively and from accepting an invitation to teach at Harvard. He did make his first trip to the United States in 1949, when he spoke at a conference in Aspen on Goethe. In 1951 he participated in a conference with Heidegger in Darmstadt. He gave his last lecture in Venice in 1955 before succumbing to stomach and liver cancer.
Ortega was a prolific writer—in total, his works cover about 6,000 pages. Most of these works span from 1914, when he published his first book, Meditations on Quixote, to the 1940s, but there were also several important posthumous publications. Ortega cannot be readily classified because he wrote about such a broad range of subjects, including philosophy, history, literary criticism, sociology, and travel writing. He was a philosopher, educator, essayist, theorist, critic, editor, and politician. He wrote on the philosophy of life; human life is central in his thought. Some of the varied topics he explored were the human body, history and its categories, phenomenology, society, politics, the press, and the novel. Coming from a newspaper and magazine family, Ortega was primarily an essayist—for this reason, some label him a writer of “philosophical journalism.” One of his main goals was to have a dialogue with his readers.
While he did not claim to adhere to any one philosophical movement, given the important role that history plays in his philosophy, we should certainly not deny the influence of other thinkers. Influences on Ortega include Neo-Kantianism, which he studied in Marburg with Hermann Cohen and Paul Natorp, as well as the phenomenology of Edmund Husserl. Additional crucial influences include Wilhelm Dilthey especially, as well as Gottfried Leibniz, Friedrich Nietzsche, Johann Fichte, Georg Hegel, Franz Brentano, Georg Simmel, Benedetto Croce, R. G. Collingwood, and, in his native Spain, Miguel de Unamuno, Francisco Giner de los Ríos, Joaquín Costa, Azorín, Ramiro de Maeztu, and Pío Baroja, to note some key figures.
Ortega himself left a lasting impact on other important Spanish intellectuals, such as Ramón Pérez de Ayala, Eugenio d’Ors, Américo Castro, and Julián Marías. He had several disciples, including Eugenio Imaz, José Gaos, Ignacio Ellacuría, Joaquín Xirau, Eduardo Nicol, José Ferrater Mora, María Zambrano, and Antonio Rodríguez Huéscar, many of whom emigrated to the Latin American countries of Mexico, Venezuela, El Salvador, and Argentina, continuing to add to the philosophical landscape abroad.
In Latin America, there were several important thinkers and historical figures directly influenced by Ortega, such as Samuel Ramos and Leopoldo Zea in Mexico and Luis Recaséns Siches from Guatemala—and even the Puerto Rican politician Jaime Benítez (1908-2001) wanted his nation to be like an “Orteguian Weimar.”
2. Philosophy of Life
For Ortega, the activity of philosophy is intimately connected to human life, and metaphysics is a central source of study for how human beings address existential concerns. The first, and therefore radical, reality is the living self, and all else radiates from this. The task of the philosopher is to study this radical reality. In Ortega’s philosophy, metaphysics is a ‘construction of the world,’ and this construction is done within circumstances. The world is not given to us ready-made; it is an interpretation made within human life. No matter how unrefined or inaccurate our interpretations may be, we must make them. This interpretation is each person’s resolution of the problem of navigating his or her circumstances, a problem of absolute insecurity that must be solved. We may not be able to choose all our circumstances, but we are free in our actions, in the choice of possibilities that lie before us—this is what most strikingly makes us different, he says, from animals, plants, or stones. This emphasis on (limited) human freedom and choice is a principal reason why he is often classified as an “existentialist thinker.” Each person must make his or her world in order to be saved and installed within it, he argues. And nobody but the sole individual can do this.
Metaphysics consists of individuals working out the radical orientation of their situation, because life is disorientation; life is a problem that needs to be solved. The human individual comes first, as he argues that to understand “what are things,” we need to first understand “what am I.” The radical reality is the life of each individual, which is each individual’s “I” inextricably living within circumstances. This distinction also marks a break with the earlier influence of phenomenology. For Ortega, human reality is not solipsistic, as many critics charged was the case with Husserl’s method (though Husserl did try to respond to this as a misreading of his view). But neither did Ortega fully reject phenomenology, as it continued to resemble his view on how we are our beliefs—a central position elaborated on further ahead.
The individual human life is the fundamental reality in Ortega’s philosophy. There is no absolute reason (or at least none that we can know); there is only reason that stems directly from the individual human life. We can never escape our life, just as we can never escape our shadow. There is no absolute, objective truth (or at least none that we can know); there is just the perspective of the individual human life. This of course raises the critique that it is contradictory to claim that there is no absolute, as the very idea that there are no absolutes is itself an absolute. However, what Ortega seems to argue is not a denial of the possibility of the existence of objective truths or absolutes, but rather that, at least for the time being, we cannot know anything beyond each of our own perspectives. Moreover, he is not a staunch relativist, as he argues that there is a hierarchy of perspectives. Life, which is always “my life,” what I call my “I,” is a vital activity that also always finds itself in circumstances. We can choose some circumstances, and others we cannot, but we are always ‘an I within circumstances’—hence his central dictum: “I am my self and my circumstance” (Yo soy yo y mi circunstancia). In a very existentialist theme, a pre-determined life is not given to us; what is given is the need to do something, and so life is always a ‘having to do something’ within inescapable circumstances. Thus, life is not the self; life is both the self and the circumstance. This is his position of “razón vital,” which is perhaps best translated as “reason from life’s point of view.” Everything we find is found in our individual lives, and the meaning we attach to things depends on what it means in our lives (which here is arguably a more pragmatist rather than existentialist or phenomenological stance). By the mid-1930s, this would be developed further, adding the importance of narrative reason to better contemplate what it means to be human. This he titled “razón vital e histórica,” or “historical and vital reason.”
Humankind is an entity that makes itself, he argued. Humans are beings that are not-yet-being, as we are constantly engaged in having to navigate through circumstances. He describes this navigation as like being lost at sea, or being shipwrecked, making life a ‘problem that needs to be solved.’ Life is a constant dialogue with our surroundings. Despite this emphasis on individuality, given humankind’s constant place within circumstances, we are also individuals living with others, so to live is to live-with. More specifically, there are two principal ways that we are living-with: in a coetaneous way as a generational group, and in a contemporaneous way in terms of being of the same historical period. Hence Ortega’s dictum: “Humankind has no nature, only history.” “Nature” refers to things, and “history” refers to humankind. Each human being is a biography of time and space; each human being has a personal and generational history. We can understand an individual only through his or her narrative. Life is defined as ‘places and dates’ immersed in systems of beliefs dominant among generations.
Ortega’s metaphysics thus consists of each human being oriented toward the future in radical disorientation; life is a problem that needs to be solved because in every instance we are faced with the need to make choices within certain circumstances. The human radical reality is the need to constantly decide who we are going to be, always within circumstances. Take, for example, an individual “I” in a room; the room is not literally a part of one’s “I,” but “I” am an “I in a room.” The “I in a room” has possibilities of choices of what to do, but in that moment those are limited to that room. He writes: “Let us save ourselves in the world, save ourselves in things” (What is Philosophy?). In every moment we are each confronted with many possibilities of being, and among those various possible selves, we can always find one that is our authentic self, which is one’s vocation. One only truly lives when one’s vocation coincides with one’s true self. By vocation he is not referring strictly to one’s profession but also to our thoughts, opinions, and convictions. That is why a human life is future-oriented; it is a project, and he often symbolically refers to the human being as an archer trying to hit the bullseye of his or her authentic vocation—if, of course, one is being true to oneself. The human individual is not an “is”; rather, each is a becoming and an inheritor of an individual and collective past.
a. The Individual
Ortega argues that to live is to feel lost, yet one who accepts this is already closer to finding their self. An individual’s life is always “my life”; the vital activity of “I” is always within circumstances. We can choose some of those circumstances, but we cannot choose to be in none. In every instant, we must choose what we are going to do, what we are going to be in the next instant, and this decision is not transferable. Life comprises two inseparable dimensions, as he describes it. The first is the I and circumstance, and as such, life is a problem. In the second dimension, we realize we must figure out what those circumstances are and try to resolve the problem. The solutions are authentic when the problem is authentic.
Each individual has an important historical element that factors in here because different time periods may have dominant ideas about how to solve problems. History is the investigation, therefore, into human lives to try to reconstruct the drama that is the life of each one of us—he often also uses the metaphor of our swimming as castaways in the world. The vital matter historical study needs to inquire into is precisely what changes a human’s life; it is not about historical variations themselves but rather about what brings about that change. So, we need to ask: how, when, and why did life change?
Each individual exists in their own set of circumstances, though some overlap with those in the lives of others, and thus each individual life is an effort to realize its individual “I.” Being faced with the constant need to choose means that living brings about a radical insecurity for each individual. An individual is not defined by body and soul, because those are “things”; rather, one is defined by one’s life. For this reason, he proclaims his famous thesis that ‘humans have no nature, only history.’ Thus, again, this is why history should be the focus in the study of human lives; history is the extended human drama. Human life, as he so often says, is a drama, and thus the individual becomes the “histrion” of their self—“histrion” referring to a “theatrical performer,” a usage that dates back to the ancient Greeks. In its etymological roots, from the Greek historia, we have in part “narrative,” and from histōr, “learned, wise human”—thus, for Ortega, to study and be aware of one’s narrative is the means by which we become learned and wise. As one lacks or ignores this historical knowledge, there is a parallel decline in living authentically, and when this increasingly manifests itself in a group of people, there is a parallel rise of barbarity and primitiveness, he argues. This, as is elaborated further ahead, is precisely what is at work in the revolt of the masses of his time.
For Ortega, the primitive human is overly socialized and lacks individuality. In a very existentialist theme in his philosophy, we live authentically via our individuality. Existentialist philosophers generally share the critique that the focus on the self, on human existence as a lived situation, had been lost in the history of philosophy. On this point Ortega agrees; however, he does not entirely share the critique that, from the birth of modern philosophy, especially from Descartes onward, the increasingly rational and detached pursuit of objective knowledge was all that detrimental. This is because humanism was also in part a result: the new science and human reason permitted humankind to recover faith and confidence in itself. Ortega does not deny that there are certain scientific facts that we must live with, but science, he says, “will not save humankind” (Man and Crisis). In other words, scientific studies can lead to scientific facts, but these should not extend beyond science—it is an error of perspective to reach beyond. As is elaborated further ahead, the richest of the different types of perspectives is that which is focused on the individual human life, as this is the radical reality.
We each live with a future-orientation, yet the future is problematic, and this is the paradoxical condition that is human life: in living for the future, all we have is the past with which to realize that potential. Ortega argues that a prophet is an inverse historian: one who narrates to anticipate the future. An individual’s present is the result of all the human past—we must understand human life as such, just as one cannot understand the last chapter of a novel without having read the content that came before. This makes history, he argues in response to the dominance of physics in his time, the supreme or “superior science” for understanding the fundamental, radical reality that is human life. While Ortega believes that Einstein’s discoveries, for example, support his position of perspectivism and how reality can only be understood via perspectives, again his concern is when the sciences reach too far beyond science into the realm of what it means to be a human individual. Moreover, what had been largely forgotten in his time is how fundamentally historical beings we are. Thus, a lack of historical knowledge results in a dangerous disorientation and is an important symptom of the crisis of his time: the hyper-civilization of savagery and barbarity that he defines as the “revolt or rebellion of the masses.”
b. Society and The Revolt of the Masses
Society, for Ortega, is not fully natural, as its origins are individualist. Society arises out of the memory of a remote past, so it can only be comprehended historically. But it is also the case for Ortega that an individual’s vocation can be realized only within a society. In other words, part of our circumstance is to always be with others in a reciprocal and dynamic relationship—here his views tend more toward an existential phenomenology: the world we live in is an intersubjective one in which we are each both unique and social selves, living among a multitude of unique individuals. Ortega is quite critical of his time period, but he is detailed enough in his critiques to point toward a potential way of resolving them. In fact, Ortega is one of the first writers to detail something resembling the European Union as a possible solution. For example, he writes, “There is now coming for Europeans the time when Europe can convert itself into a national idea” (The Revolt of the Masses). He was quite concerned with the contemporary threat of politics going to either extreme, left or right, as a result of the crisis of the masses.
This concern served as part of his inspiration for studying in Germany, which he saw in many ways as a model state, right after he finished his Doctorate. Contemplating an ideal future for his country of Spain became of great importance to him at an early age. In 1898, Spain lost its last colonies after losing the Spanish-American War. A group of Spanish intellectuals arose, appropriately called the “Generation of ’98,” to address how to heal the future of their country. A division resulted between those labeled the “Hispanizantes” and the “Europeizantes” (to which Ortega belonged), who looked to “Hispanicize” or “Europeanize” Spain, respectively—that is, to look back to tradition or to look to Europe as a model.
The most famed of his critiques are captured in his best-selling and highly prophetic 1930 book The Revolt of the Masses. One clear way in which he describes the main problem of not only Europe, but really the world over, is to imagine what happens in an elementary school classroom when the teacher leaves, even if just momentarily; the mob of children “breaks loose, kicks up its heels, and goes wild” (The Revolt of the Masses). The mob feels itself in control of its own destiny, a destiny previously guided by the schoolmaster, but this does not mean, of course, that these children suddenly know exactly what to do. The world is acting like these children, and often even worse, as spoiled children who are ignorant of the history behind all that they believe they have a right to, resulting in great disrespect. Without the direction of a select minority, such as the teacher or schoolmaster, the world is demoralized. The world is being chaotically taken over by the lowest common denominator: the barbarous mass-human.
This mob he calls “the mass-man.” The mass-man is distinguished from the minority both quantitatively and qualitatively—most important is the latter. While the minority consists of specially qualified individuals, the mass-person is not specially qualified, and he or she is content with that (there is an apparent influence here from Nietzsche’s distinction between “master” and “slave” moralities, but it is not the same). The mass-man sees himself or herself as just like everybody else and does not want to change that. The members of the minority make great demands on themselves; they do not see themselves as superior, yet they strive to improve themselves, whereas the mass-man does not. Being a minority requires setting oneself apart from the majority, whereas no such effort is needed to define a majority, he argues. So, the distinction here is not about social classes; rather, it refers mostly to mentality. The problem, Ortega argues, is that this is essentially upside-down: the minorities feel mediocre yet not a part of the mass, and the masses are acting as the superior ones, which is enabling them to replace the minorities. He calls this state a hyper-democracy, and it is for Ortega the great crisis of the time because in the process the masses are crushing that which is different, qualified, and excellent. The result is the sovereignty of the unqualified individual—and this is stronger than ever before in history, though this kind of crisis has happened before.
The masses have a debilitating ignorance of history, which is central to sustaining and advancing a civilization. History is necessary to learn what we need to avoid in the future—“We have a need of history in its entirety, not to fall back into it, but to see if we can escape from it” (The Revolt of the Masses). Civilization is not self-supporting, as nature is, yet the masses, in their lack of historical consciousness, think this to be the case. It is a rebellion of the masses because they are not accepting their own destiny and are, therefore, rebelling against themselves. The result, Ortega fears, is great decay in many areas: there will be a militarization of society, a bureaucratization of life, and a suspension of spontaneous action by the state—a state that the mass-man believes himself to be. Everyone will be enslaved by the state. He saw this clearly in his nation of Spain, where regional particularism was dominant and demolished select individuality (he develops this theory of his country having become ‘spineless’ in another of his more successful books, Invertebrate Spain). As is further elaborated ahead, this can also be seen in the art trends of the time, as movements toward greater abstraction were having the effect of minimizing the number of people who could ‘understand’ them. Despite having critiqued the aesthetics of the time, Ortega also thought this could shift the balance away from the dominance of the masses and put them ‘back into their appropriate places.’ The arts, then, will help restore the select hierarchy.
c. The Mission of the University
Another part of his answer to this crisis of his times lies in the university: the mission of the university is, essentially, to teach culture. By “culture” he was referring to more than just scholarly knowledge; it was about being in society. Universities should aim to teach the “average” person to be a good professional in society. Science, very broadly understood here, has a different mission than the university, and not every student should be turned into a “scientist.” This does not mean that science and the university are not connected; it is just to emphasize that not all students are scientists. Ortega recognizes that science is necessary for the progression of society, but it should not be the focus of a university education. For this reason, university professors should be chosen for their teaching skills over their intellectual aptitudes, he argued. Again, students should be groomed to follow the vocation of their authentic self. The self is what one potentially is; life is a project. Therefore, what one needs to know is what helps one realize their personal project (again, this is why ‘vital reason’ is ‘reason from life’s point of view’). Not everybody is aware of what their project is, and it is essential to human life to strive to figure out what that is and, ideally, realize it as fully as possible—the university can aid in this endeavor. But Ortega was also cognizant of the challenge of this endeavor, as perhaps the ‘right frame of mind, dispositions, moods, or tastes’ cannot be taught.
3. Perspectivism
As humankind is always in the situation of having to respond to a problem, we must know how to deal with and live with that problem—and this is precisely the meaning of the Spanish verb “saber,” or “to know” (as in knowing concrete facts): to know what to do with the problems we face. Thus, Ortega’s epistemology can be summarized concisely as follows: the only knowledge possible is that which originates from an individual’s own perspective. Knowledge, he writes, is a “mutual state of assimilation” between a thing and the thinker’s process of thinking. When we confront an object in the world, we are only confronting a fragment of it, and this forces us to try to think about how to complete that object. Therefore, philosophizing is unavoidable in life (for some, at least, he argued), because it is part of this process of trying to complete what is a world of fragments. Further, this forms part of the foundation of his perspectivism: we can never see, and thus understand, the world from a complete perspective, only from our own limited one. Perspective, then, is both a component of reality and what we use to organize reality. For example, when we look up at a skyscraper from the street level, we can never see the whole building, only a fragment of it from our limited perspective.
In his book What is Knowledge?, which consists of lectures from 1929-1930, we find some of his initial leanings toward idealism (from which he would increasingly move away), but even here he is not rejecting realism but rather making it subordinate to idealism. This is because, he argues, the existence of the world can only first be derived from the existence of thought. For the world to exist, thought must exist; the existence of the world is secondary to the existence of thought. Idealism, he argues, is the realism of thought. Ultimately, however, Ortega rejects both idealism and realism; neither suffices to explain the radical reality that is the individual human life, the coexistence of selves with things. While science begins with a method, philosophy begins with a problem—and this is what makes philosophy one of the greatest intellectual activities: human life is a problem, and when we become aware of our ‘falling into this absolute abyss of uncertainty,’ we do philosophy. Science may be exact, but it is not sufficient for understanding what it means to be human. Philosophy may be inexact, but it brings us much closer to understanding what it means to be human because it is about contemplating the radical reality that is the life of each one of us. Because humankind constantly confronts problems, philosophy is a more natural intellectual task than any science. Thus, “each one knows as much as he [they] has doubted” (What is Knowledge?). Ortega measures “the narrowness or breadth of [their] wit by the scope of [their] capacity to doubt and feel at a loss” (What is Knowledge?).
Therefore, one does not need individual certainties, argues Ortega; what is needed is a system of certainties that creates security in the face of insecurity. He argues: “the first act of full and incontrovertible knowledge” is the acknowledgement of life as the primordial and radical reality (What is Knowledge?). No one perspective is true or false (though there may be a pragmatic hierarchy of perspectives); each is affected by time and place. For example, it is not just about a visual field; there are many other fields that can be present and vary by time and place. Time itself can be experienced very differently, such as how Christmas Eve may seem the longest night of the year for young children, whereas their birthday party passes much too quickly. A subject informs an object, and vice versa; it is an abstraction to speak of one without the other. “To know,” therefore, is to know what to live with, deal with, and abide by in response to the circumstances we find ourselves in—this is a clear example of the existentialist thinking that can be found in his thought, as he emphasizes that the most important knowledge is that of the individual self that understands his or her circumstances well enough to know how to live with them, deal with them, and in response form principles to abide by. Science will not suffice to help us in this endeavor. It is only in being clear with our selves that we can better navigate the drama that is the individual human life—again, a very existentialist theme. This does not require being highly educated—as this also runs its own risk of getting lost in scholarship, he urges—and the individual need not look far (though neither does this mean that individuals always make this effort). Part of what makes us human is our imagination; human life then is a work of imagination, as we are the novelists of our selves.
Ortega critiques the philosophy of the mid-nineteenth century through the early twentieth century as being “little more than a theory of knowledge” (What is Philosophy?) that still has not been able to answer the most fundamental question of what knowledge is, in its complete meaning. He is especially critical of positivism. He argues that we must first understand what meaning the verb to know carries within itself before we can seriously consider a theory of knowledge. And just as life is a task, knowing is a task that humans impose on themselves, and it is perhaps an impossible task, but we feel this need to know and impose this task on ourselves because we are aware of our ignorance. This awareness of our ignorance is the focus that an epistemological study should take, Ortega argued.
a. Ideas and Beliefs
An important connection between his metaphysics and his epistemology is his distinction between ideas and beliefs. A “belief,” he argues, is something we maintain without being conscious or aware of it. We do not think about our beliefs; we are inseparably united with them. Only when beliefs start to fail us do we begin to think about them, which leads to their no longer being beliefs; they become “ideas.” Here we can see an influence of Husserl’s phenomenology in the attempt to ‘suspend’ habitual beliefs—to which Ortega adds the aim of understanding how history is moved by them. We do not question beliefs, because once we do, they stop being beliefs. In moments of crisis, when we question our beliefs, we are thinking about them, and so again they become “ideas.” We look at the world through our ideas, and some of them may become beliefs. The “real world” is a human interpretation, an idea that has established itself as a belief.
When we are left without beliefs, we are left without a world, and this change then becomes a historical crisis—an important connection to his theory of history described further ahead. The human individual is always a coming-from-something and a going-to-something, but in moments of crisis, this duality becomes a conflict. The fifteenth century provides a clear example, as it marked a historical crisis of a coming-from the medieval lifestyle conflicting with a going-to a new ‘modern’ lifestyle. So, in a historical crisis such as this one, there is an antithesis of different modes of the same radical attitude; it is not like the coming-from and going-to of the seasons, like summer into fall. The individual of the fifteenth century was lost, without convictions (especially those stemming from Christianity), and as such was living without authenticity—just like the individual of Ortega’s own day, he argues. The fifteenth century is a clear example of “the variations in the structure of human life from the drama that is living” (Man and Crisis). There was a crisis then of reason supplanting faith (just as faith had supplanted reason in the shift into the medieval period). In times such as these, as in Ortega’s own, there is a crisis of belief as to ‘who knows best.’
Beliefs are thus connected to their historical context, and as such, historical reason is the best tool for understanding both the ebb and flow of beliefs being shaken up and moments of historical crisis. Epochs may be defined by crises of beliefs. Ortega argued that he was living in precisely one of those times, and that it took the form of a “rebellion of the masses.” Beliefs in the old Enlightenment ideals of confidence in science and progress were failing, and advances in technology were making this harder for people to see, precisely because technology puts science to work. Reality is a human reality, so science does not provide us with reality itself. Instead, what science provides are some of the problems of reality.
4. Theory of History
Ortega believed that “philosophy of history” was a misnomer and preferred the term “theory” for his lengthy discussions on history, a topic of central importance in his thought. He objected to many terms, which adds to the difficulty of classifying him (other objections include being called an ‘existentialist’ and even a ‘philosopher’). Much of Ortega’s theory on history is outlined in Man and People, Man and Crisis, History as a System, An Interpretation of Universal History, and Historical Reason. The use of the term “system” in his philosophical writings on history is at times misleading, because what he is referring to is a kind of pattern or trend that can be studied, not a teleological vision of history. History is defined by its systems of beliefs. As outlined in the section on ideas and beliefs, we hold ideas because we consciously think about them, and we are our beliefs because we do not consciously think about them. There are certain beliefs that are fundamental and other, secondary beliefs that are derived from those. To study human existence, whether of an individual, a society, or a historical age, we must outline what this system of beliefs is, because crises in beliefs, when they are brought to awareness and questioned, are what move history on any level (personal, generational, or societal). This system of beliefs has a hierarchized order that can help us understand our own lives and those of others, today and in the past—and the more of these comparisons we compile, the more accurate the result will be. Changes in history are due largely to changes in beliefs. Part of this stems from his view that in these moments, some of us also become aware of our inauthentic living brought about by accepting prevailing beliefs without question. The activity of philosophy is part of this questioning.
History is of fundamental importance to all his philosophy. Human beings are historical beings. Knowledge must be considered in its historical context; “what is true is what is true now” (History as a System). This again raises the critique of what we can do without any objective, absolute knowledge (and also places him arguably in the pragmatist camp here). But Ortega responds that while science may not provide insight on the human element, vital historical reason can. He argues that we can best understand the human individual through historical reason, and not through logic or science. One of his most well-known dictums is that “humankind has no nature, only history.” For Ortega, “nature” refers to something that is fixed; for example, a stone can never be anything other than a stone. This is not the case with humankind, as life is “not given to us ready-made”; we do find ourselves suddenly in it, but then “we must make it for ourselves,” as “life is a task,” unique to each individual (History as a System). This “thrownness” in the world is another very existentialist theme (for which some debate exists about the chronology of the development of this philosophy between Ortega and Heidegger, especially considering they personally knew and respected each other). A human being is not a “thing”; rather, a human life is a drama, a happening, because we make ourselves as infinitely plastic beings. “Things” are objects of existence, but they do not live as humans do, and each human does so according to their own personal choices in response to the problems we face in navigating our circumstances. “Before us lie the diverse possibilities of being, but behind us lies what we have been. And what we have been acts negatively on what we can be,” he writes, and again this applies to any level of humanity, whether regarding individuals or states (History as a System). Thus, while we cannot know what someone or some collective entity will be, we can know what someone or some collective entity will not be. Those possibilities of being are challenged by the circumstances we find ourselves in, so, “to comprehend anything human, be it personal or collective, one must tell its history” (History as a System). In a general sense, humans are distinct in our possession of the concept of time; the human awareness of the inevitability of death makes this so.
We cannot speak of “progress” in a positive sense in the variable becoming of a human being, because an a priori affirmation of progress toward the better is an error; progress is something that can only be confirmed a posteriori by historical reason. So, by “progress,” Ortega means simply an “accumulation of being, to store up reality” (History as a System). We have each inherited an accumulation of being, which is what further gives history its systematic quality, as he writes: “History is a system, the system of human experiences linked in a single, inexorable chain. Hence nothing can be truly clear in history until everything is clear” (History as a System). Since the ancient Greek period, history and reason had been largely opposed, and Ortega wants to reverse this—hence his use of the term “historical reason.” He is not referring to something extra-historical, but rather to something substantive: the reality of the self underlying it all, and all that has happened to that self. Nothing should be accepted as mere fact, he argues, since facts are fluid interpretations that are themselves embedded in a historical context, so we must study how they have come about. Even “nature” is still just humankind’s “transitory interpretation” of the things around us (History as a System). As Nietzsche similarly argued, humankind is differentiated from animals because we have a consciousness of our own history and of history in general. But again, the idea here is that the past is not really past; as Ortega argues, if we are to speak of some ‘thing’ it must have a presence; it must be present, so the past is active in the present. History tells us who we are through what we have done—only history can tell us this, not the physical sciences; hence again his call for the importance of “historical reason.” The physical sciences study phenomena that are independent of us, whereas humans have a consciousness of our historicity that is, therefore, not independent of our being.
Through history we try to comprehend the variations that persist in the human spirit, writes Ortega. These hierarchized variations are produced by a “vital sensitivity,” and those variations that are decisive become so through a generation. The theory of generations is fundamental to understanding Ortega’s philosophy on (not of) history, as he argues that previous philosophies on history had focused too much on either the individual or the collective, whereas historical life is a coexistence of the two. For Ortega, generations are divided into fifteen-year increments. Each generation captures a perspective of universal history and carries with it the perspectives that came prior. For each generation, life has two dimensions: first, what was already lived, and second, spontaneity. History can also be understood cinematographically: with each generation comes a new scene, but the film has not come to an end. We are all always living within a generation and between generations—this is part of the human condition.
The two generations between the ages of thirty and sixty are of particular influence in the movement of history, as they generally represent the most historical activity, he argues. From the ages of thirty to forty-five we tend to find a stage of gestation, creation, and polemic. In the generational group from ages forty-five to sixty, we tend to find a stage of predominance, power, and authority. The first of these two stages prepares one for the next. But Ortega also posits that all historical actuality is composed primarily of three “todays,” which we can also think of as the family writ large: child, parent, grandparent. Life is not an ‘is’—it is something we must make; it is a task, and each age is a particular task. Historical study, accordingly, is not concerned with individual lives alone, as every life is submerged in a collective life; it is one of our circumstances that we are immersed in a set of collective beliefs that form the “spirit of the time.” This is very peculiar, he argues, because unlike individual beliefs that are personally held, collective beliefs that take the form of the “spirit of the time” are essentially held by the anonymous entity that is “society,” and they have vigor regardless of individual acceptance. From the moment we are born we begin absorbing the beliefs of our time. The realization that we are unavoidably assigned to a certain age group, or spirit of the time, and lifestyle is a melancholic experience that all ‘sensitive’ (philosophically-minded) individuals eventually have, he posits.
Ortega makes an important distinction between being “coeval” or “coetaneous,” and being “contemporary.” The former refers to being of the same age, the latter to being of the same historical time period. The former is that of one’s generation, which is so critical that he argues those of the same generation but different nations are more similar than those of the same nation but different generations. His methodology for studying history is grounded in projecting the structure of generations onto the past, as it is a generation that produces a crisis in beliefs that then leads to change and new beliefs (discussed above). He also defines a generation as a dynamic compromise between the masses and the individual, on which history hinges. Every moment of historical reality is the coexistence of generations. If all contemporaries were coetaneous, history would petrify and innovation would be lost, in part because each generation lives their time differently. Each generation, he writes, represents an essential and untransferable piece of historical time. Moreover, each generation also contains all the previous generations, and as such is a perspective on universal history. We are the summary of the past. History intends to discover what human lives have been like, and by “human” he is not referring to body or soul, because individuals are not “things”; we are dramas. Because we are thrown into the world, this drama creates a radical insecurity that makes us feel shipwrecked, or headed for shipwreck, in life. We form interpretations of the circumstances we find ourselves thrown into and then must constantly make decisions based upon them. But we are not alone, of course; to live is to live together, to coexist. Yet it is precisely that reality of coexistence that makes us feel solitude; hence our attempt to avoid this loneliness through love. Ortega’s theory on history is therefore a combination of existential, phenomenological, and historicist elements.
5. Aesthetics
Ortega’s Phenomenology and Art provides a very phenomenological and existentialist philosophy on art. Art is not a gateway into an inner life, into inwardness. When an image is created of inwardness, it ceases to be inward, because inwardness cannot be an object. Thus, what art reveals through esthetic pleasure is only what seems to be inwardness. Art is a kind of language that tells us about the execution of this process, but it does not tell us about things themselves. A key example he gives to understand this is the metaphor, which he considers an elementary esthetic object. A metaphor produces a “felt object,” but that object is not, strictly speaking, the objects themselves that the metaphor draws upon. Art, he says, is de-creation because, as in the example of the metaphor, it creates a new felt object out of what is essentially the destruction of other objects. There is a connection to Brentano and Husserl here, in the experience that consciousness is always consciousness of an object (though, it has been noted, Ortega ultimately aims to redirect Husserl’s reduction away from pure consciousness and instead to promote consciousness from the point of view of life).
In the example of painting, which he considers “the most hermetic of all the arts” (Phenomenology and Art), he further elaborates on the importance of an artist’s view of the occupation itself, of being an “artist.” The choice of occupation is a deeply personal and important one; thus style is greatly shaped by how an artist would answer the question of what it means “to be a painter” (Phenomenology and Art). Art history is not just about changes in styles; it is also about the meaning of art itself. Most important is why a painter paints rather than how a painter paints, he argues (another very existentialist position).
Ortega’s philosophy on the art of his time is further developed in his essay The Dehumanization of Art. While the focus of his analysis in this text is the art of his time, his objective is to understand and work through some basic characteristics of art in general. As this was published in 1925, the art movements he often refers to are those tending toward abstraction, such as expressionism and cubism. He was quite critical of Picasso, for example, though this may have been primarily politically motivated. His ultimate judgment is that the art of his time has been “dehumanized” because it is an expression moving further away from lived experience as it becomes more “modern.” This new art is “objectifying” things; it is objectifying the subjective, as an expression of an observed reality more remote from the lived human reality. After all, the “abstraction” in this art means precisely this: starting with some object in the real world and abstracting from it (as opposed to art that is completely non-representational). Art becomes an unreality. In this we find his phenomenological leaning, a call to go back “to the things themselves” in art. This is arguably also part of the general existentialist call to avoid objectifying human individuals.
Still, this art can provide insight into his contemporary historical age, and there is value in that—hence his desire to better understand the art of his time, the art that divides the public into the elite few who understand and appreciate it and the majority who neither understand nor enjoy it. There is also value in how this may be used to ‘put the masses back into their place,’ because only an elite few understand ‘modern art.’ Perhaps this could serve as a test, Ortega argued: we can observe how one views a work of abstract art, and add a person’s ability to contemplate this art to our judgments about his or her place as part of the minority or the mass.
6. Philosophy
History, for Ortega, represented the “inconstant and changing,” whereas philosophy represented the “eternal and invariable”—and he called for the two to be united in his approach to philosophical study. History is human history; it is the reality of humankind. As a critic of his age, he also has much to say about the philosophical movements of his day and about the history of philosophy. To the question, “what is philosophy?” Ortega answers: “it is a thing which is inevitable” (What is Philosophy?). Philosophy cannot be avoided. It is an activity, and in his many writings on the topic, he wants to take this activity itself and submit it to analysis. Philosophy must be read vertically, not horizontally, he urges. Philosophy is philosophizing, and philosophizing is a way of living. Therefore, the basic problem of philosophy that he wants to submit to analysis is to define that way of living, of being, of “our life.”
Ortega’s call for a rebirth of philosophy and his concern over too much reliance on modern science, especially physics, is one of the many reasons why he is often classified among the existentialist philosophers. In fact, for Ortega, the philosopher stands in contradistinction to any kind of scientist in navigating the unknown, venturing into problems (like other existentialists, he is fond of the metaphor of life as navigating a ship headed for shipwreck). Philosophy, he says, is a vertical excursion downward. In his discussions on what philosophy is, he makes several contrasts to science. For example, philosophy begins with the admission that the world may be a problem that cannot be solved, whereas the business of science is precisely about trying to solve problems. Yet he did not solely critique physics; he also believed it lent support to his perspectivism, as seen in the relativity discovered by Albert Einstein. But neither is Ortega a strict relativist. While an individual reality is relative to a time and place, each of those moments is an absolute position. Moreover, not all perspectives are equal; errors are committed, and there are hierarchies of perspectives.
The exclusive subject of philosophy is the fundamental being, which he defines as that which is lacking. Philosophy, he says, is self-contained and can be defined as a “science without suppositions,” which is another inheritance from Husserl’s phenomenology (What is Philosophy?). In fact, he takes issue with the term “philosophy” itself; better, perhaps, is to consider it a theory or a theoretic knowledge, he insists. A theory, he argues, is a web of concepts, and concepts represent ‘the content of the mind that can be put into words.’
7. The History of Philosophy
In his unfinished work, The Origin of Philosophy, Ortega outlines a reading of philosophy similar to that of history; it must be studied in its entirety. Just as one cannot read only the last chapter of a novel to understand it, one must read all the chapters that came before. His main objective in this work, then, is to recreate the origin of philosophy. In the history of philosophy we find a great deal of inadequate philosophy, he argues, but it is part of our human condition to keep thinking nonetheless. It is part of our human condition to realize that we have not thought everything out adequately. Hence, perhaps The Origin of Philosophy was meant to be unfinished because it cannot be otherwise. On a first reading, therefore, the history of philosophy is a history of errors. We need only think of what came after the Presocratics, the first on record to try to formalize ways of philosophical thinking: the relativism of the Sophists and the skepticism of the Skeptics, for example, arose primarily as critiques of, or reactions against, what preceded them. By revealing the errors of earlier philosophy, Ortega argues, philosophers create another philosophy in that process. That Ortega took up this focus precisely when he did, working on this text in the mid-1940s, when logical positivism and contemporary analytic philosophers had come to dominate the Anglo-American philosophical landscape, is itself an example of this pattern in “analytic philosophy.” That term came about in part to separate those philosophers from “continental philosophy” (“continental” referring primarily to existentialist-like thinkers, such as Ortega—those on the continent of Europe, not the British Isles).
Error, he argues, seems to be more natural than truth. But he does not believe that philosophy is an absolute error; in errors there must be at least the possibility of some element of truth. It is also the case that sometimes when we read philosophy, the opposite happens: we are initially struck by how it seems to resound with “truth.” What we have next, then, is a judgment that ‘such and such philosophy’ has merit and another does not. But each philosophy, he argues, contains elements of the others as “necessary steps in a dialectical series” (The Origin of Philosophy). The philosophical past, therefore, is both an accumulation of errors and truths. He says: “our present philosophy is in great part the current resuscitation of all the yesterdays of philosophy” (The Origin of Philosophy). Philosophy is a vertical excursion down, because it is built upon philosophical predecessors, and as such, the past continues to function in and influence the present. When we think about the past, that brings it into the present; in other words, thinking about the past makes it more than just “in the past.” Again, he shares with Nietzsche this distinction between humankind and animals in how we possess the past and are more than just consequences of it; we are conscious of our past. We are also distinct in how we cannot possess the future, though we strive very hard to—modern science is very focused on improving our chances at prediction. The first philosopher, Thales, is given that title for being the first on record to think for himself and move away from mythological explanation, as famously demonstrated by his prediction of a solar eclipse using what we would call a kind of primitive science. In being able to predict more of the future, one can thus ‘eternalize oneself’ more. In this process one has also obtained a greater possession of the past. “The dawn of historical reason,” as he refers to it, will arrive when that possession of the past has reached an unparalleled level of passion, urgency, and comprehension. Just as history broadly moves through crises of beliefs, so too, very explicitly, does philosophy (which is also the best way to contemplate the human lived situation). This also relates to his perspectivism and to the notion of hierarchies that are very much pragmatically founded. For Ortega, examples of particularly moving moments in the history of philosophy come from these great shifts in philosophical beliefs, such as those from the period of ancient Greece and from Descartes especially. For Ortega, the three most crucial belief positions in philosophy to examine via its history are realism, idealism, and skepticism. Ortega’s hope was that this would all, ideally, come closer to full circle with the next belief position: that of his “razón vital e histórica,” or “historical and vital reason.”
Despite the challenges posed by the breadth of José Ortega y Gasset’s writings, perhaps it serves us best to read him in the context of his own methodology of historical and vital reason—as an individual, a man of his times, searching for nuggets of insight among a history of errors.
8. References and Further Reading
a. Primary Sources
Ortega’s Obras Completas are available digitally.
Ortega y Gasset, José. Obras Completas Vols. I-VI. Spain: Penguin Random House Grupo Editorial, 2017.
Ortega y Gasset, José. Obras Completas Vols. VII-X (posthumous works). Spain: Penguin Random House Grupo Editorial, 2017.
Ortega y Gasset, José. Meditations on Quixote. New York: W.W. Norton, 1961.
Ortega y Gasset, José. The Dehumanization of Art and Other Essays on Art, Culture, and Literature. Princeton: Princeton University Press, 2019.
Ortega y Gasset, José. Phenomenology and Art. New York: W.W. Norton, 1975.
Ortega y Gasset, José. Historical Reason. New York: W.W. Norton, 1984.
Ortega y Gasset, José. Toward a Philosophy of History. Chicago: University of Illinois Press, 2002.
Ortega y Gasset, José. History as a System and Other Essays Toward a Philosophy of History. New York: W.W. Norton, 1961.
Ortega y Gasset, José. An Interpretation of Universal History. New York: W.W. Norton, 1973.
Ortega y Gasset, José. The Revolt of the Masses. New York: W.W. Norton, 1932.
Ortega y Gasset, José. What is Philosophy? New York: W.W. Norton, 1960.
Ortega y Gasset, José. The Origin of Philosophy. New York: W.W. Norton, 1967.
Ortega y Gasset, José. Man and Crisis. New York: W.W. Norton, 1958.
Ortega y Gasset, José. Man and People. New York: W.W. Norton, 1957.
Ortega y Gasset, José. Meditations on Hunting. New York: Charles Scribner’s Sons, 1972.
Ortega y Gasset, José. Psychological Investigations. New York: W.W. Norton, 1987.
Ortega y Gasset, José. Mission of the University. New York: W.W. Norton, 1966.
Ortega y Gasset, José. The Modern Theme. New York: W.W. Norton, 1933.
Ortega y Gasset, José. On Love: Aspects of a Single Theme. Cleveland: The World Publishing Company, 1957.
Ortega y Gasset, José. Some Lessons in Metaphysics. New York: W.W. Norton, 1969.
Ortega y Gasset, José. What is Knowledge? Albany: SUNY Press, 2001.
Ortega y Gasset, José. Concord and Liberty. New York: W.W. Norton, 1946.
Ortega y Gasset, José. Invertebrate Spain. New York: Howard Fertig, 1921.
b. Secondary Sources
Blas González, Pedro. Human Existence as Radical Reality: Ortega y Gasset’s Philosophy of Subjectivity. St. Paul: Paragon House, 2011.
Díaz, Janet Winecoff. The Major Theme of Existentialism in the Work of Jose Ortega y Gasset. Chapel Hill, NC: University of North Carolina Press, 1970.
Dobson, Andrew. An Introduction to the Politics and Philosophy of José Ortega y Gasset. Cambridge: Cambridge University Press, 1989.
Ferrater Mora, José. José Ortega y Gasset: An Outline of His Philosophy. New Haven, CT: Yale University Press, 1957.
Ferrater Mora, José. Three Spanish Philosophers: Unamuno, Ortega, and Ferrater Mora. Albany: State University of New York Press, 2003.
Graham, John T. A Pragmatist Philosophy of Life in Ortega y Gasset. Columbia: University of Missouri Press, 1994.
Graham, John T. The Social Thought of Ortega y Gasset: A Systematic Synthesis in Postmodernism and Interdisciplinarity. Columbia: University of Missouri Press, 2001.
Graham, John T. Theory of History in Ortega y Gasset: The Dawn of Historical Reason. Columbia: University of Missouri Press, 1997.
Gray, Rockwell. The Imperative of Modernity: An Intellectual Biography of José Ortega y Gasset. Berkeley: University of California Press, 1989.
Holmes, Oliver W. José Ortega y Gasset: A Philosophy of Man, Society, and History. Chicago: University of Chicago Press, 1971.
Huéscar, Antonio Rodríguez, and Jorge García-Gómez. José Ortega y Gasset’s Metaphysical Innovation: A Critique and Overcoming of Idealism. Albany: State University of New York Press, 1995.
McClintock, Robert. Man and His Circumstances: Ortega as Educator. New York: Teachers College Press, 1971.
Mermall, Thomas. The Rhetoric of Humanism: Spanish Culture after Ortega y Gasset. New York: Bilingual Press, 1976.
Raley, Harold C. José Ortega y Gasset: Philosopher of European Unity. University, Alabama: University of Alabama Press, 1971.
Sánchez Villaseñor, José. José Ortega y Gasset, Existentialist: A Critical Study of his Thought and his Sources. Chicago: Henry Regnery, 1949.
Silver, Philip W. Ortega as Phenomenologist: The Genesis of Meditations on Quixote. New York: Columbia University Press, 1978.
Sobrino, Oswald. Freedom and Circumstance: Philosophy in Ortega y Gasset. Charleston: Logon, 2011.
Author Information
Marnie Binder
Email: marnie.binder@csus.edu
California State University, Sacramento
U. S. A.
Nietzsche’s Ethics
The ethical thought of German philosopher Friedrich Nietzsche (1844–1900) can be divided into two main components. The first is critical: Nietzsche offers a wide-ranging critique of morality as it currently exists. The second is Nietzsche’s positive ethical philosophy, which focuses primarily on what constitutes health, vitality, and flourishing for certain individuals, the so-called “higher types”.
In the critical project, Nietzsche attacks the morality of his day from several different angles. He argues that the metaphysical foundations of morality do not hold up to scrutiny: the concepts of free will, conscious choice, and responsibility that underpin our understanding of morality are all vociferously critiqued, both on theoretical and on practical grounds. Nietzsche also objects to the content of our contemporary moral commitments. He rejects the idea that suffering is inherently bad and should be eradicated, and he denies that selflessness and compassion should be at the core of our moral code. Key components of Nietzsche’s critical project include his investigation of the history of the development of our moral commitments—the method of “genealogy”—as well as an analysis of the underlying psychological forces at work in our moral experiences and feelings. Ultimately, perhaps Nietzsche’s most serious objection to morality as it currently exists is his claim that it cannot help us to avoid the looming threat of nihilism.
In the positive project, Nietzsche offers a vision of what counts as a good and flourishing form of existence for certain people. This positive ethical vision is not open to everyone, but only to the so-called “higher types”—people whose psycho-physical nature makes them capable of coming to possess the traits and abilities that characterize health, vitality, and flourishing on Nietzsche’s account. The flourishing individual, according to Nietzsche, will be one who is autonomous, authentic, able to “create themselves,” and to affirm life. It is through such people, Nietzsche believes, that the threat of nihilism can be averted.
In 1981, the British philosopher Bernard Williams wrote that “[i]t is certain, even if not everyone has yet come to see it, that Nietzsche was the greatest moral philosopher of the past century. This was, above all, because he saw how totally problematical morality, as understood over many centuries, has become, and how complex a reaction that fact, when fully understood, requires.” As Williams’s remark suggests, the core of Nietzsche’s ethical thought is critical: Nietzsche seeks, in various ways, to undermine, critique, and problematize morality as we currently understand it. As Nietzsche himself puts it, “we need a critique of moral values, the value of these values should itself, for once, be examined” (On the Genealogy of Morality, Preface, 6). In speaking of “the value of these values”, Nietzsche is making use of two different senses of the notion of value. One is the set of values that is the object of the critique, the thing to be assessed and evaluated. The other is the standard by which we are to assess these values. In attacking moral values, then, Nietzsche is not setting himself against all possible evaluative systems. And as we shall see, Nietzsche does indeed go on to make many substantive evaluative claims of his own, both critical and positive, including many claims that are broadly ethical in nature. Nietzsche thus proposes to undertake what he calls a “revaluation of all values”, with the final product of this project being a new system of evaluations (see part 2., “The positive project”).
a. The Object of Nietzsche’s Attacks
Since Nietzsche’s critical project is not targeted towards all values as such, we should ask what, exactly, Nietzsche is attacking when he attacks “morality”. In fact, Nietzsche’s various attacks have multiple targets, which together form a family of overlapping worldviews, commitments, and practices. The Judeo-Christian moral-religious outlook is one broad target, but Nietzsche is also keen to attack the post-religious secular legacy of this moral code that he sees as dominant in his contemporary culture in Europe. He is concerned with Kantian morality, as well as the utilitarianism that was gaining prominence around Nietzsche’s time, especially in Britain. Aspects of his attacks are levelled against broadly Platonist metaphysical accounts, as well as the Christian inheritance of these accounts, which understand value as grounded in some otherworldly realm that is more real and true than the world we live in. Other parts of the critical project are directed towards certain particular evaluative commitments, such as a commitment to the centrality of pity or compassion (Mitleid), as exemplified in Schopenhauer’s ethics in particular, but which Nietzsche also sees as a point of thematic commonality between many different moral and religious worldviews. Nietzsche even criticizes evaluative systems that he envisages coming to be widely accepted in the future, such as the commitment to ease and comfort at all costs that he imagines the “last human being” endorsing (see section 1. g., “The threat of nihilism”).
Given this diversity, determining exactly what is under attack in Nietzsche’s critical project is best achieved through attention to the detail of the various attacks. In general, this article uses “morality” as a catch-all term to cover the multiple different objects of Nietzsche’s attacks, allowing the precise target of each attack to be clarified through the nature of the attack itself. The reader should note, then, that not all of Nietzsche’s attacks on “morality” will necessarily apply to each of the individual views and commitments that are gathered under this broad heading.
b. Rejection of an Otherworldly Basis for Value
Nietzsche rejects certain metaphysical accounts of the nature of value. These parts of Nietzsche’s position are not directly about the substantive evaluative content of moral worldviews, but rather the metaphysical presuppositions about the grounds of value that certain moral, and especially moral-theological, worldviews involve. In section 230 of Beyond Good and Evil, Nietzsche states that his philosophical work aims to “translate humanity back into nature”, to reject “the lures of the old metaphysical bird catchers who have been piping at him for far too long: ‘You are more! You are higher! You have a different origin!’”. Human beings, according to Nietzsche, are fundamentally a part of nature. This means that he rejects all accounts of morality that are grounded in a conception of human activity as answerable to a supernatural or otherworldly source of value. The idea of morality as grounded in the commands of God is thus rejected, as is the Platonist picture of a realm of ideal forms, including the “Form of the Good,” as the basis for value.
For the most part, Nietzsche does not go out of his way to argue against these sorts of metaphysical pictures of the nature of value. Instead, he tends to assume that his reader is already committed to a broadly naturalistic understanding of the world and the place of the human being within it. Nietzsche’s rejection of theological or Platonist accounts of the basis of value, then, tends to stand as a background assumption of his discussions, rather than as something he attempts to persuade his reader of directly.
The recurring motif of the “death of God” in Nietzsche’s writing is usefully illustrative here. In The Gay Science, Nietzsche describes a “madman” who is laughed at for announcing the death of God in the marketplace (section 125). But the laughter is not because people think that God is not dead but instead alive and well; rather, these people do not believe in God at all. The intellectual elite of Europe in Nietzsche’s day were, for the most part, atheists. Nietzsche’s insistent emphasis on the idea that “God is dead” is thus not intended as a particularly dramatic way of asserting the non-existence of God, and he does not expect the idea that God does not exist to come as a surprise to his reader. Rather, the problem that Nietzsche seeks to draw attention to is that his fellow atheists have failed to understand the cultural and spiritual significance of the widespread loss of belief in God, and thus of the associated metaphysical picture of the human being as created for a higher divine purpose (see section 1. g., “The threat of nihilism”).
Indeed, Nietzsche is often interested in the way in which aspects of these earlier supernatural worldviews, now largely abandoned, have nonetheless left traces within our current belief and evaluative systems—even within the modern naturalistic conception of the world that Nietzsche takes himself to be working within. Nietzsche writes:
New Battles. – After Buddha was dead, they still showed his shadow in a cave for centuries – a tremendous, gruesome shadow. God is dead; but given the way people are, there may still for millennia be caves in which they show his shadow. – And we – we must still defeat his shadow as well! (The Gay Science, 108)
Although Nietzsche clearly sets himself against supernaturalist accounts of value and of the place of the human being in the cosmos, the precise nature of his own naturalism, and the consequences of this naturalism for his own ethical project, is a topic of debate among commentators. This is complicated by the fact that Nietzsche often directs his attacks towards other naturalist accounts, sometimes simply under the heading of “naturalism,” in a way that can seem to suggest that he himself rejects naturalism. (See Leiter (2015), Clark and Dudrick (2012), and Riccardi (2021) for useful discussion of the nature of Nietzsche’s naturalism.)
In general, Nietzsche expects his reader to share his own basic naturalist orientation and rejection of supernatural metaphysics. However, he thinks that most people have failed to properly understand the full consequences of such commitments. The atheists of his day, thinks Nietzsche, have typically failed to understand the cultural impact that a loss of religious faith will have—perhaps because these cultural effects have not yet shown themselves clearly. Nietzsche also thinks that his contemporaries have not always grasped the ways in which an accurate picture of the nature of the human being will force us to revise or abandon many concepts that are key to our current understanding of morality—perhaps most strikingly, concepts of moral agency and responsibility (see the following section). Many of Nietzsche’s fellow naturalists suppose that we can abandon the supernatural trappings that have previously accompanied morality, and otherwise continue on with our evaluative commitments more or less as before. This, Nietzsche thinks, is not so.
c. Attacks on the Metaphysical Basis of Moral Agency
One family of arguments presented by Nietzsche attacks the metaphysical basis of moral agency. Again, the point here is not directly about the substantive evaluative content of particular moral systems, but rather their metaphysical presuppositions, especially those that have been thought to ground the concept of moral responsibility—notions of the freedom of the will, and the role of consciousness in determining human action.
First, Nietzsche attacks the idea of free will. Nietzsche writes:
The causa sui [cause of itself] is the best self-contradiction that has ever been conceived, a type of logical rape and abomination. But humanity’s excessive pride has got itself profoundly and horribly entangled with precisely this piece of nonsense. The longing for “freedom of the will” in the superlative metaphysical sense (which, unfortunately, still rules in the heads of the half-educated), the longing to bear the entire and ultimate responsibility for your actions yourself and to relieve God, world, ancestors, chance, and society of the burden—all this means nothing less than being that very causa sui and, with a courage greater than Münchhausen’s, pulling yourself by the hair from the swamp of nothingness up into existence. (Beyond Good and Evil, 21)
This passage appears to reject the idea of free will primarily on metaphysical grounds: for the will to be free would be for a thing to be causa sui, the cause of itself, and this is impossible. And so, to the extent that a moral worldview depends on the idea that we do have free will in this sense, then the foundations of such a worldview are undermined.
Some scholars, noting Nietzsche’s references to “pride” and “longing”, have suggested that the primary mode of Nietzsche’s attack on the idea of free will is practical rather than metaphysical. The real problem with the idea of free will, they argue, is that a belief in this idea is motivated by psychological weakness, and is thus not conducive to good psychic health and flourishing (see Janaway (2006)).
Others have argued that Nietzsche’s relationship to the traditional metaphysical debate about free will is not so much to deny that we have free will, but rather to deny the very coherence of the concept at work in this debate (see Kirwin (2017)). For Nietzsche goes on to call the notion of free will an “unconcept” or “nonconcept” (Unbegriff), insisting that just as we must let go of this notion, so too must we let go of “the reversal of this unconcept of ‘free will’: I mean the ‘un-free will’”.
This scholarly disagreement about the nature of Nietzsche’s attacks on the concept of free will also impacts how we understand parts of Nietzsche’s positive ethical vision. In particular, the question of whether we should understand that positive ethical vision to include an ideal of ‘freedom’ in some sense is hotly contested (see section 2. b., “Autonomy”).
Alongside these attacks on the notion of free will, Nietzsche also denies that human action is primarily a matter of conscious decision and control on the part of the agents themselves. We experience ourselves as consciously making decisions and acting on the basis of them, but this experience is, thinks Nietzsche, misleading. To begin with, our conscious self-awareness is only one small part of what is going on within the mind: “For the longest time, conscious thought was considered thought itself; only now does the truth dawn on us that by far the greatest part of the mind’s activity proceeds unconscious and unfelt” (The Gay Science, 333). Furthermore, Nietzsche thinks, it is unclear that this conscious part of the mind really plays any sort of role in determining our action, since “[a]ll of life would be possible without, as it were, seeing itself in the mirror and […] the predominant part of our lives actually unfolds without this mirroring” (The Gay Science, 354). Consciousness, says Nietzsche, is “basically superfluous” (ibid). These parts of Nietzsche’s account of human psychology have often been understood as a precursor to Freudian theories of the unconscious, as well as to recent empirical work establishing that our self-understanding of our own minds and activities is often far from accurate (see Leiter (2019)). Some scholars, while acknowledging Nietzsche’s downgrading of consciousness, have nonetheless argued that Nietzsche retains a robust picture of human agency (see Katsafanas (2016), and section 2. b., “Autonomy”).
Nietzsche’s rejection of free will and his denial of the idea that the conscious mind is the real source of action both appear to undermine the possibility of a person’s being morally responsible for their actions, at least as that notion has traditionally been understood. If moral responsibility requires free will in the sense rejected by Nietzsche, then there can be no moral responsibility. Some philosophers have argued that responsibility does not require free will in this sense, but they have generally done so by arguing that it is sufficient for responsibility that a person’s action follow from their intentions in the right sort of way. But Nietzsche’s attacks on the causal role of consciousness in human action seem to cause problems for this sort of approach as well. In undermining these metaphysical ideas about the nature of human action, then, Nietzsche takes himself to have done away with the notion of moral responsibility, thus removing a key underpinning of the system of morality.
d. Attacks on the Content of Morality
Nietzsche also raises objections to the normative content of morality—to the things it presents as valuable and disvaluable, and the actions it prescribes and proscribes. One particular focus of his attacks here is the centrality of Mitleid (variously translated as “pity” or “compassion”) to the moral codes he sees in his contemporary society. Nietzsche sometimes refers to Christianity as “the religion of pity,” and asserts that “[i]n the middle of our unhealthy modernity, nothing is less healthy than Christian pity” (The Antichrist, 7). But Nietzsche’s critique of pity is not limited to Christianity; indeed, he suggests that the “morality of pity” is really an outgrowth of Christianity, rather than properly part of Christianity itself:
[…] ‘On n’est bon que par la pitié: il faut donc qu’il y ait quelque pitié dans tous nos sentiments’ [one is only good through pity: so there must be some pity in all of our sentiments]—thus says morality today! […] That men today feel the sympathetic, disinterested, generally useful social actions to be the moral actions – this is perhaps the most general effect and conversion which Christianity has produced in Europe: although it was not its intention nor contained in its teaching. (Daybreak, 132)
Nietzsche connects the morality of pity to utilitarian and socialist movements, to thinkers in France influenced by the French revolution, and to Schopenhauer’s moral philosophy. (Interestingly, Nietzsche notes that Plato and Kant, who are elsewhere the target of his attacks on morality, do not hold pity in high esteem—On the Genealogy of Morality, Preface, 5.)
The morality of pity, thinks Nietzsche, is problematic in various ways. It emphasizes the eradication of suffering as the main moral goal—and yet suffering, thinks Nietzsche, is not inherently bad, and can indeed be an impetus to growth and creativity. (Nietzsche himself suffered from ill health throughout his life, and often seems to connect his own intellectual and creative achievements to these experiences.) Pity, thinks Nietzsche, both arises from and exacerbates a “softness of feeling” (On the Genealogy of Morality, Preface, 6), as opposed to the sort of strong and hardy psychological constitution that he admires. The morality of pity also prioritizes the wellbeing of “the herd” over that of those individuals who have the potential to achieve greatness. Some of Nietzsche’s attacks on the morality of pity take the form of a distinctive sort of psychological critique: what presents itself as a concern for the other person in fact has a darker, hidden, and more self-serving motive (see section 1. f., “Psychological critique”). Finally, Nietzsche believes that making pity central to our evaluative worldview will lead humanity towards nihilism (see section 1. g., “The threat of nihilism”).
The German word that Nietzsche uses is Mitleid, which can be translated as “pity” or as “compassion”. Some scholars have sought to emphasize the difference between these two concepts, and to interpret Nietzsche’s attacks on Mitleid through the lens of this distinction (Von Tevenar (2007)). The proposal is that pity focuses its attention on the suffered condition rather than on the sufferer themselves, creating distance between the sufferer and pitier, and as a result can end up tinged with a sense of superiority and contempt on the part of the pitier. Compassion, by contrast, is understood to involve genuine other-regarding concern and thus to foster closeness between the two parties. When we read Nietzsche’s attacks on Mitleid in light of this distinction, some of his objections seem to apply primarily to pity, thus understood, while others seem to take compassion as their main target (see section 1. f., “Psychological critique” and section 1. g. “The threat of nihilism” for some further discussion).
Nietzsche’s various objections to Mitleid stand at the heart of his attack on the content of morality. But, as he explains, his concerns with this concept eventually lead him to a broader set of questions about morality. Nietzsche says:
This problem of the value of pity and of the morality of pity […] seems at first to be only an isolated phenomenon, a lone question mark; but whoever pauses over the question and learns to ask, will find what I found:—that a vast new panorama opens up for him, a possibility makes him giddy, mistrust, suspicion, and fear of every kind spring up, belief in morality, all morality, wavers. (On the Genealogy of Morality, Preface, 6)
More generally, then, Nietzsche holds that various traits, behaviors, and ideals that morality typically holds in high regard—humility, love of one’s neighbor, selflessness, equality, and so on—are all open for critique, and indeed all are on Nietzsche’s view found wanting. These values are, according to Nietzsche, “ascetic” or “life-denying”—they involve a devaluation of earthly existence, and indeed of those parts of human existence, such as struggle, suffering, hardship, and overcoming, that are capable of giving rise to greatness. It may be true that the more people possess the qualities that morality holds in high esteem, the easier and more pleasant life may be for the majority of people. But whether or not this is so does not really matter, for Nietzsche is not concerned with how things are for the majority of people. His interest is primarily in those individuals who have the potential for greatness—those “higher types” who are capable of great deeds and profound creative undertakings. And here, Nietzsche thinks, the characteristic values that morality holds in such esteem are not conducive to the health and flourishing of these individuals.
e. Genealogical Critique
One of the most important and influential components of Nietzsche’s critical project is his attempt to offer a ‘genealogy’ of morality, a certain sort of historical account of its various origins and development over time. This account is offered primarily in On the Genealogy of Morality, though other texts develop similar themes, especially Beyond Good and Evil. In the Genealogy, Nietzsche explicitly connects this historical investigation to his critical project:
[W]e need a critique of moral values, the value of these values should itself, for once, be examined—and so we need to know about the conditions and circumstances under which the values grew up, developed and changed. (On the Genealogy of Morality, Preface, 6)
Scholars have puzzled over this claim. Why do we need to know about the historical origins of morality in order to assess its value here and now? Indeed, it has seemed to many that Nietzsche is here committing the “genetic fallacy”, wrongly inferring an assessment of a thing’s current meaning or value on the basis of its source or origins. But Nietzsche himself appears to be aware of the fallacy in question (see for example The Gay Science 345), and so we have reason to take seriously the project of the Genealogy and to try to understand it as part of Nietzsche’s critical project.
In fact, there are ways in which a thing’s source or origin can rightly affect our current assessment of it. For example, if you learn that the person who gave you a piece of information is untrustworthy, this does not automatically imply that the information is false, but it does undermine your original justification for accepting it, and gives you reason to reconsider your belief in it. It may be that Nietzsche’s genealogical project works in a similar sort of way. In seeking to understand morality as a historical phenomenon, Nietzsche’s approach already unsettles certain aspects of our understanding of morality’s nature and its claim to authority over us. If we had supposed that morality has a timeless or eternal nature (perhaps because it is bestowed by God, or because it is grounded in something like Plato’s Form of the Good—see section 1. b., “Rejection of an otherworldly basis for value”), then coming to understand it as instead a contingent product of human history and development may give us reason to question our commitment to it. Even if morality is not thereby shown to be bad or false, it does seem to be revealed as something that is properly open to questioning and critique.
Furthermore, part of Nietzsche’s point in developing his genealogical account is that certain human phenomena—here, morality, and its associated concepts and psychological trappings—are essentially historical, in the sense that one will not understand the thing itself as it exists here and now, and thus will not be able to give a proper critique, without understanding how it came to be. (Think of what would be needed for a person to properly understand the phenomenon of racial inequality in the present-day United States, for instance.) To fully comprehend the nature of morality, and thus to get it into view as the object of our critique, thinks Nietzsche, we will need to investigate its origins.
In the First Essay of the Genealogy, “‘Good and Evil,’ ‘Good and Bad,’” Nietzsche charts the emergence of two distinct systems of evaluation. The first is the aristocratic “master morality,” which begins from an evaluation of the aristocratic individual himself as “good,” which here indicates something noble, powerful, and strong. Within this moral code, the contrasting evaluation—“bad”—is largely an afterthought, and points to that which is not noble, namely the lowly, plebeian, ill-born masses. The opposing evaluative system, “slave morality,” develops in reaction to the subjugation of this lower class under the power of the masters. Led by a vengeful priestly caste (which Nietzsche connects to Judaism), this lower class enacts the “slave revolt in morality,” turning the aristocratic moral code on its head. Within the slave moral code, the primary evaluative term is “evil,” and it is applied to the masters and their characteristic traits of strength and power. The term “good” is then given meaning relative to this primary term, so that “good” now comes to mean meek, mild, and servile—qualities which the slave class possess of necessity, but which they now cast as the products of their own free choice. This evaluative system comes along with the promise that justice will ultimately be meted out in the afterlife: those who suffer and are oppressed on earth will receive their reward in heaven, while the evil masters will face an eternity of punishment in hell. In the resulting struggle between the two evaluative systems, it was the slave morality that eventually won out, and it is this moral code that Nietzsche takes to be dominant in the Europe of his day.
In the Second Essay, “‘Guilt,’ ‘Bad Conscience,’ and Related Matters,” Nietzsche explores the origins of the institution of punishment and of the feelings of guilt and bad conscience. Punishment, Nietzsche thinks, originally emerged from the economic idea of a creditor-debtor relationship. The idea eventually arises that an unpaid debt, or more generally an injury of some kind, can be repaid through pain caused to the debtor. It is from this idea that the institution of punishment comes into being. But punishment is not what gives rise to feelings of bad conscience. Instead, the origins of bad conscience, of the feeling of guilt, arise as a result of violent drives that would normally be directed outwards becoming internalized. When individuals come to live together in communities, certain natural violent tendencies must be reined in, and as a result they are turned inwards towards the self. It is the basic drive to assert power over others, now internalized and directed towards the self, that gives rise to the phenomenon of bad conscience.
In the Third Essay, “What Do Ascetic Ideals Mean?,” Nietzsche explores the multiple significances that ascetic ideals have had, and the purposes they have served, for different groups of people, including artists, philosophers, and priests. The diversity of meanings that Nietzsche finds in ascetic ideals is an important component of the account: one of the characteristic features of genealogy as a method of investigation is the idea that the object under scrutiny (the phenomenon of morality, for instance) will not have a single unified essence, meaning, or origin, but will rather be made up of multiple overlapping ideas which themselves change and shift over time. Nonetheless, ascetic ideals share in common the characteristic of being fundamentally life-denying, and thus, on Nietzsche’s account, not conducive to flourishing health. And although the narrative of the Genealogy so far has connected these ideals to the Judeo-Christian worldview and moral code, in the final part of the book we are told that the most recent evolution of the ascetic ideal comes in the form of science, with its unquestioning commitment to the value of truth. Nietzsche’s critique of morality thus leads even further than we might have expected. It is not only the Judeo-Christian moral code, nor even its later secular iterations that are under attack here. Rather, Nietzsche seeks to call into question something that his investigation has revealed to be an outgrowth of this moral code, namely a commitment to the value of truth at all costs. Even practices like science, then, embody the life-denying ascetic ideal; even the value of truth is to be called into question, evaluated—and found wanting.
In general, Nietzsche expects that when we consider the origins of morality that he presents us with, we will find them rather unsavory. For instance, once we realize that morality’s high valuation of pity, selflessness, and so on came to be out of the weakness, spite, and vengefulness of the subjugated slave class, this new knowledge will, Nietzsche hopes, serve to lessen the grip that these values have on us. Even if morality’s dark origins do not in themselves undermine the value of these ideals, the disquiet or even disgust that we may feel in attending to them can do important work in helping us to overcome our affective attachment to morality. Overcoming this attachment will pave the way for a more clear-eyed evaluation of these ideals as they exist today.
Nonetheless, the question remains just how far this sort of historical account can take us in assessing the value of morality itself. Even if the ideal of loving one’s neighbor, for instance, originally emerged out of a rather less wholesome desire for revenge, this seems not to undermine the value of the ideal itself. So long as loving one’s neighbor now does not involve such a desire for revenge, what, really, has been shown to be wrong with it? Nietzsche sometimes seems to be suggesting, however, that the historical origins of morality are not merely something that happened in the past. Instead, the dark motives that originally gave rise to morality have left their traces within our current psychological make-up, so that even today the ideal of loving one’s neighbor retains these elements of cruelty and revenge. (See section 1. f., “Psychological critique.”)
The Genealogy leaves behind a complex legacy. Scholars still disagree about what, exactly, the method of genealogy really is and what it can achieve. Nonetheless, Nietzsche’s approach has proved remarkably influential, perhaps most notably in relation to Foucault, who sought to offer his own genealogical accounts of various phenomena. The Genealogy also stands in a complex relationship to anti-Semitism. Nietzsche’s writings, including the Genealogy, often include remarks highly critical of anti-Semitism and anti-Semitic movements of his time. Nonetheless, that the book itself deals freely in anti-Semitic tropes and imagery seems undeniable.
f. Psychological Critique
Another distinctive component of Nietzsche’s critical project is his psychological analysis of moral feelings and behavior. Nietzsche frequently attempts to reveal ways in which our self-understanding of supposedly “moral” experiences can be highly inaccurate. Lurking behind seemingly compassionate responses to others, Nietzsche claims, we find a dark underside of self-serving thoughts, and even wanton cruelty. He suggests that feelings of sympathy [Mitgefühl] and compassion [Mitleid, also translated as “pity”] are secretly pleasurable, for we enjoy finding ourselves in a position of power and relative good fortune in relation to others who are suffering. These supposedly selfless, kind, and other-regarding feelings are thus really nothing of the sort.
Nietzsche’s psychological analysis of moral feelings and behaviors echoes the historical analysis he provides in the Genealogy (see section 1. e., “Genealogical critique”). Nietzsche often uses metaphors of “going underground” to represent investigations into the murky historical origins of morality as well as investigations into subconscious parts of the individual or collective psyche. It is not fully clear exactly how the two sorts of investigation are connected for Nietzsche, but he does seem to think that a person’s present psychic constitution can bear the imprint not only of their own personal history but also of historical events, forces, and struggles that affected their ancestors. If this is so, it seems plausible for Nietzsche to suppose that the subconscious motives at work in a person’s psyche could reflect the historical origins that Nietzsche traces for morality more generally, and that an investigation into one could at the same time illuminate the other.
Leaving aside this connection between psychological investigation and genealogy, when it comes to the detail of Nietzsche’s claims about what is really going on in specific instances of seemingly moral feelings, many commentators have found Nietzsche’s psychological assessments to be cuttingly insightful. As Philippa Foot puts it, “Nietzsche, with his devilish eye for hidden malice and self-aggrandizement and for acts of kindness motivated by the wish to still self-doubt, arouses a wry sense of familiarity in most of us”. Nietzsche does seem to have a knack for uncovering hidden motives, and for getting the reader to recognize these less wholesome parts of their own psyche. For instance, describing our responses when someone we admire is suffering, Nietzsche says:
We try to divine what it is that will ease his pain, and we give it to him; if he wants words of consolation, comforting looks, attentions, acts of service, presents—we give them; but above all, if he wants us to suffer at his suffering we give ourselves out to be suffering; in all this, however, we have the enjoyment of active gratitude—which, in short, is benevolent revenge. If he wants and takes nothing whatever from us, we go away chilled and saddened, almost offended […]. From all this it follows that, even in the most favourable case, there is something degrading in suffering and something elevating and productive of superiority in pitying. (Daybreak, 138)
Here, if the reader follows along imaginatively with Nietzsche’s story, they may indeed find themself feeling “chilled and saddened, almost offended” when supposing that the suffering person does not want their help—perhaps they even experience the feeling a split second before they read Nietzsche’s naming of those very feelings. They have been caught in the act, as it were, and made conscious of the secretly self-regarding nature of their supposedly compassionate responses to the suffering of others.
But even supposing that Nietzsche’s observations are correct about a great many real-world instances of purportedly moral phenomena—or even all of them—what sort of objection to morality does this really give us? After all, the problem here does not seem to be with the moral values or ideals themselves. Nietzsche’s objection here does not appear to directly target compassion itself (say) as a moral ideal, but rather the hypocrisy of those who understand themselves and others to be compassionate, but who are in reality anything but. Indeed, in a certain sense, the critique seems to depend on the idea that cruelty and self-serving attitudes are bad, and this evaluation is itself a core component of the morality that Nietzsche is supposed to be attacking.
There are various ways of making sense of Nietzsche’s psychological critique as part of his broader critique of morality. It may be that the uncovering of these hidden motives is merely intended to elicit an initial air of disquiet and an attitude of suspicion towards the whole system of morality—to force us to let go of our comfortable sense that all is well with morality as it currently exists. It seems likely, in addition, that Nietzsche’s main concern is not so much with moral values in the abstract (with the concept of compassion, say), but rather with their concrete historical and psychological reality—and this reality, Nietzsche suggests, is importantly not as it seems. Or perhaps the point is that human nature is always going to be driven by these more malicious feelings, so that a morality that fails to recognize this fact must be grounded in fantasy.
In general, Nietzsche’s psychological analysis of moral behavior seems to take the form of an internal critique. Nietzsche expects his reader to be moved, on the basis of their current evaluative commitments, by his unmasking project: the hypocrisy of a cruel and self-serving tendency that masquerades as kindness and compassion is likely to strike us as distasteful, unappealing, perhaps disgusting. And thus shaken from our initially uncritical approval of what had presented itself as kindness and compassion, we may find ourselves psychologically more disposed to embark on the deeper ‘revaluation’ project that Nietzsche wants us to undertake. When we do so, Nietzsche hopes to persuade us of the disvalue not only of cruel egoism that presents itself as compassion, but indeed of compassion itself as an ideal. For this ideal, he argues, is fundamentally life-denying, and as a result will lead to nihilism (see the following section). (For more on the precise form of Nietzsche’s objections to Mitleid—pity or compassion—see Von Tevenar (2007).)
g. The Threat of Nihilism
Perhaps Nietzsche’s main objection to our current moral outlook is the likelihood that it will lead to nihilism. Nietzsche says:
Precisely here I saw the great danger to mankind, its most sublime temptation and seduction—temptation to what? to nothingness?—precisely here I saw the beginning of the end, standstill, mankind looking back wearily, turning its will against life, and the onset of the final sickness becoming gently, sadly manifest: I understood the morality of compassion, casting around ever wider to catch even philosophers and make them ill, as the most uncanny symptom of our European culture which has itself become uncanny, as its detour to a new Buddhism? to a new Euro-Buddhism? to—nihilism? (On the Genealogy of Morality, Preface, 5)
The Europe of Nietzsche’s day is entering a post-religious age. What his contemporaries do not realize, Nietzsche thinks, is that following the “death of God,” humanity faces an imminent catastrophic loss of any sense of meaning. Nietzsche’s contemporaries have supposed that one can go on endorsing the basic evaluative worldview of the Judeo-Christian moral code in a secular age, by simply excising the supernatural metaphysical underpinnings and then continuing as before. But this, thinks Nietzsche, is not so. Without these underpinnings, the system as a whole will collapse.
The problem does not seem to be exactly the metaethical worry that the absence of a properly robust metaphysical grounding for one’s values might undermine the project of evaluation as such. After all, Nietzsche himself seems happy to endorse various evaluative judgments, and he does not take these to be grounded in any divine or otherworldly metaphysics. (However, see Reginster (2006) for discussion of nihilism as arising from an assumption that value must be so grounded.) Instead, the problem seems to arise from the specific content of our current moral worldview. In particular, as we have seen, this worldview embodies ascetic and life-denying values—human beings’ earthly, bodily existence is given a negative evaluative valence. In the religious version of these ascetic ideals, however, the supernatural component provided a higher purpose: earthly suffering was given meaning through the promise that it would be repaid in the afterlife. Shorn of this higher purpose, morality is left with no positive sense of meaning, and all that remains is the negative evaluation of suffering and earthly existence. The old Judeo-Christian morality thus evolves into a secular “morality of pity,” aiming only at alleviating suffering and discomfort for “the herd.”
In pursuing this negative goal, the morality of pity seeks at the same time to make people more equal—and thus, thinks Nietzsche, more homogenous and mediocre. In Thus Spoke Zarathustra, Nietzsche gives a striking portrayal of the endpoint of this process:
Behold! I show you the last human being.
‘What is love? What is creation? What is longing? What is a star?’—thus asks the last human being, blinking.
Then the earth has become small, and on it hops the last human being, who makes everything small. His kind is ineradicable, like the flea beetle; the last human being lives longest.
‘We invented happiness’—say the last human beings, blinking.
They abandoned the regions where it was hard to live: for one needs warmth. One still loves one’s neighbor and rubs up against him: for one needs warmth.
[…]
One has one’s little pleasure for the day and one’s little pleasure for the night: but one honors health.
‘We invented happiness’ say the last human beings, and they blink. (Thus Spoke Zarathustra, Zarathustra’s Prologue, 5)
The “last human being” (often translated as “last man”) is taken by scholars to be Nietzsche’s clearest representation of the nihilism that threatens to follow from the death of God. Without any sense of higher meaning, and valuing only the eradication of suffering, humanity will eventually become like this, concerned only with comfort, small pleasures, and an easy life. Nietzsche’s dark portrait of the vacuously blinking “last human being” is supposed to fill the reader with horror—if this is where our current moral system is leading us, it seems that we have good reason to join Nietzsche in his project of an attempted “revaluation of all values”.
2. The Positive Project
As we have seen, Nietzsche’s critical project aims to undermine or unsettle our commitment to our current moral values. These values are fundamentally life-denying, and as such they threaten to bring nihilism in the wake of the death of God. In place of this system of values, then, Nietzsche develops an alternative evaluative worldview.
Drawing on a distinction suggested by Bernard Williams, we might usefully characterize Nietzsche’s positive project as broadly “ethical” rather than “moral,” in that it is concerned more generally with questions about how to live and what counts as a good, flourishing, or healthy form of life for an individual, rather than with more narrowly “moral” questions about right and wrong, how one ought to treat others, what one’s obligations are, or when an action deserves punishment or reward. As a result of this focus on health and flourishing, some scholars have characterized Nietzsche’s positive ethical project as a form of virtue ethics.
a. Higher Types
Nietzsche is not, however, interested in developing a general account of what counts as flourishing or health for the human being as such. Indeed, he rejects the idea that there could be such a general account. For human beings are not, according to Nietzsche, sufficiently similar to one another to warrant any sort of one-size-fits-all ethical code. The primary distinction is between two broad character “types”: the so-called “higher” and “lower” types. Nietzsche’s concern in the positive project is to spell out what counts as flourishing for the higher types, and under what conditions this might be achieved.
The distinction between higher and lower types appears to be a matter of one’s basic and unalterable psycho-physical make-up. While Nietzsche sometimes speaks as though all people can be straightforwardly sorted into one or the other category, at other points things seem more complicated: it may be, for example, that certain higher or lower character traits can end up mixed together in a particular individual. Nietzsche does not limit the concept of “higher types” to any particular ethnic or geographic group. He mentions instances of this type occurring in many different societies and in many different parts of the world. The distinction itself seems, in addition, to be largely ahistorical, such that there always have been and (perhaps) always will be higher types.
However, the detail of what the higher type looks like does vary based on the particular historical context. For example, the infamous “blond beasts” mentioned in the Genealogy are likely examples of higher types, but Nietzsche does not advocate a return (even if such were possible) to this cheerfully unreflective mode of existence. In the wake of the slave revolt in morality, human beings have become more complicated and more intellectual, and this development—though problematically shot through with ascetic ideals—has opened up new and more refined modes of existence to the higher types. As a result, the individuals that Nietzsche points to as his contemporary examples of higher types—Goethe, Emerson, and of course Nietzsche himself—tend to express their greatness through intellectual and artistic endeavors rather than through plundering and bloodlust. (Napoleon stands as an exception, although Nietzsche seems to think of him as a striking, and also somewhat startling, throw-back to an earlier mode of human existence.)
In general, the “higher type” designation seems to indicate a certain sort of potential that an individual possesses to achieve a certain state of being that Nietzsche takes to be valuable—a potential that may or may not end up being realized. The bulk of Nietzsche’s positive project, then, is concerned with spelling out what this state of being looks like, as well as what circumstances lead to its coming to fruition.
b. Autonomy
In recent years, commentators have focused on the notion of autonomy as a central component of Nietzsche’s ideal for the higher types. The autonomous individual, according to Nietzsche, is characterized primarily by self-mastery, which enables him (it appears, on Nietzsche’s account, to be invariably a “him”) to undertake great and difficult tasks—including, as we have seen, great intellectual and artistic endeavors.
This self-mastery, it seems, is primarily a matter of the arrangement of a person’s “drives”—the various and variously conflicting psychic forces that make up his being. What constitutes an ideal arrangement of drives for Nietzsche is not easy to pin down with precision, but some points seem clear. In the autonomous individual, the drives form a robust sort of unity, with one or more of the most powerful drives co-opting others into their service, so that the individual is not being pulled in multiple different directions by different competing forces but instead forms a coherent whole. Not all forms of unity, however, will do the job. In Twilight of the Idols, Nietzsche offers a psychological portrait of Socrates, describing the “chaos and anarchy of [Socrates’] instincts” along with the “hypertrophy” of one particular drive—that of reason. In Socrates, according to Nietzsche, reason subjugates and tyrannizes over the other wild and unruly appetites, which are seen as dangerous alien forces that must be suppressed at all costs. The tyranny of reason does impose a unity of sorts, but Nietzsche does not seem impressed by the resulting figure of Socrates, whom he labels “decadent”. The problem with Socrates’ drive formation may be formal—it may be that one drive merely tyrannizing over the others does not give us the right sort of unity; the controlling drive, we might suppose, ought instead to refine, sublimate, and transform the other drives to redirect them towards its purpose, rather than merely aiming to crush or extirpate them. Alternatively, the problem may be substantive: the issue might not be that one drive tyrannizes, but rather which drive is doing the tyrannizing in the case of Socrates. The tyranny of a less ascetic and life-denying drive might leave us with something that Nietzsche would be happy to think of as genuine self-mastery and hence autonomy. (For an interesting discussion of Nietzsche’s account of Socrates’ decadence, including the implicit references made to Plato’s city-soul analogy in the Republic, see Huddleston (2019). For Nietzsche’s drive-based psychology more generally, see Riccardi (2021), and for its relation to Nietzsche’s ideal, see Janaway (2012).)
A point of contention in the literature concerns whether or not the concept of “autonomy” (and related concepts of self-mastery and unity of drive formation) as Nietzsche uses it should be understood as connected to the concept of freedom. There are two related questions on the table here, which ought to be kept separate. The first is whether autonomy itself should be understood as a conception of freedom, so that to be autonomous is to be free in some sense. If so, then it seems that Nietzsche’s positive ethical vision includes freedom as an ideal that can be possessed by certain individuals who are capable of it. The second is whether or to what extent it is up to the individual to bring it about that he becomes autonomous—that is, whether or not the ideal of autonomy is an ideal that a higher type could pursue and achieve through their own agency. Let us consider the two questions in turn.
We have seen already that Nietzsche rejects a certain conception of freedom—the conception of “free will in the superlative metaphysical sense,” as he puts it (see section 1. c., “Attacks on the metaphysical basis of moral agency”). But several scholars have suggested that Nietzsche’s concept of autonomy is intended to offer an alternative picture of freedom, one that is not automatically granted to all as a metaphysical given, but which is rather the possession of the few. Ken Gemes (2009) thus marks a distinction between “deserts free will”—the sort of free will that could ground moral responsibility and thus a concept of desert, and which Nietzsche denies—and “agency free will” or autonomy, which Nietzsche grants certain individuals can come to possess. Several scholars have embraced Gemes’s distinction, and they and others have developed the idea that autonomy as freedom stands as a certain sort of ideal for Nietzsche (see Janaway (2006), May (2009), Richardson (2009), Kirwin (2017)). The thought is roughly that the autonomous individual is “free” because and insofar as he possesses certain sorts of agential abilities: having mastered himself, the autonomous agent is distinctively able to assert his will in the world, to make and honor certain sorts of commitment to himself or to others, to overcome resistance and obstacles to achieve his ends, and so on.
Against this school of thought, other scholars (most notably Brian Leiter) have argued that the picture of the autonomous individual that Nietzsche thinks so highly of does not give us in any meaningful sense a picture of freedom. On this reading, Nietzsche’s overall views on the question of freedom and free will are simple: none of us, not even those self-mastered higher types, can be said to be free. Commentators from this camp do not deny that Nietzsche approves of the individual whose drives form a particular robust and powerful unity and who is thus “master of himself” and able to assert his will in the world. Their point is simply that these qualities do not amount to the individual’s being free in any meaningful sense.
One passage in particular has proven to be a point of controversy in the literature. In the Genealogy, Nietzsche introduces a character, the “Sovereign Individual,” who is described as the endpoint of a long historical process. The Sovereign Individual, Nietzsche says, is:
Like only to himself, having freed himself from the morality of custom, an autonomous, supra-moral individual (because ‘autonomous’ and ‘moral’ are mutually exclusive), in short, we find a man with his own, independent, enduring will, whose prerogative it is to promise—and in him a proud consciousness quivering in every muscle of what he has finally achieved and incorporated, an actual awareness of power and freedom, a feeling that man in general has reached completion. (On the Genealogy of Morality, II:2)
How should we interpret this passage? There are, broadly speaking, three types of reading open to us. On the first, Nietzsche is sincere in his rather bombastic praise of this character, and his talk of freedom here should be taken seriously: that the Sovereign Individual is described as “autonomous” and as in various respects “free” gives us reason to think that Nietzsche really does hold freedom as a positive ideal for the higher types (see Ridley (2009) for one instance of this sort of reading). On the second type of reading, Nietzsche’s praise is sincere, but his talk of “freedom” is in a certain sense disingenuous: it is an instance of “persuasive definition” (the term comes from Charles Stevenson, writing in a different context), in which Nietzsche seeks to use the word ‘freedom’ in rather a different way to its ordinary usage, while at the same time capitalizing on the emotional attachment he can reasonably expect his readers will have to the term (see Leiter (2011)). On the third type of reading, Nietzsche’s praise of this character is given in a sarcastic tone: after all, the main achievement of this “Sovereign Individual” appears to be that he is able to keep his promises and pay his debts; perhaps what we have here is not a genuinely autonomous Nietzschean ideal (whatever that amounts to), but rather just a self-important member of the petty bourgeoisie (see Rukgaber (2012), Acampora (2006)). Scholars remain divided on the interpretation of this passage in particular, as well as on the general question of whether the ideal that Nietzsche offers of the self-mastered individual, constituted by a robust unity of drives, should be thought of as an ideal of freedom.
We can in addition consider a second question. Granting that Nietzsche does think highly of such an individual, and that autonomy in this sense represents an ethical ideal for Nietzsche, we can ask whether or not it is an ideal that the higher types can consciously aspire to and work towards. Nietzsche sometimes talks of this ideal state as a sort of “achievement,” and some commentators have as a result presented autonomy as something that one can choose to pursue, and thus can through one’s own efforts bring about (can “achieve” in this sense). But this strongly agential reading of the process of coming to be autonomous faces a problem. For this account seems to suggest that one can freely, in some sense, bring it about that one becomes autonomous. But if Nietzsche has a positive picture of what it is to be free (and thus to act freely) at all, that picture seems to be the picture of autonomy, the state that one is here trying to achieve. It would be a mistake, then, to suppose that one can freely pursue and achieve autonomy, since this would be to import an additional illicit concept of freedom into the picture—the freedom one exercises in freely choosing to become autonomous.
A more plausible account, and one that accords more closely with Nietzsche’s texts, would have the process of coming-to-autonomy be something that happens in some sense of its own accord, as a result of the interplay of external circumstance (including multi-generational historical processes) and facts about the individual’s inherent nature. Nietzsche often speaks of the growth of such an individual as occurring like the growth of a seed into a plant: the seed does not choose to grow into a mature plant or pursue it as a conscious goal; rather, if conditions are right, and the seed itself is healthy and well-formed, it will indeed grow and flourish. This, then, is how we should understand the process that results in a higher type’s “achieving” the ideal of autonomy. Whether or not that ideal, once achieved, should properly be thought of as a conception of freedom is a separate question. It does not follow from the fact that a condition is not freely pursued and reached that it cannot, once reached, count as a form of freedom.
c. Authenticity and Self-Creation
As the talk of seeds and plants suggests, a key component of Nietzsche’s positive ideal for the higher types involves a process of development into one’s “proper” or “true” or “natural” form. An acorn, given the right conditions, will grow into a particular type of thing—an oak tree—and as such it will have certain distinctive features: it will grow to a certain height, have leaves of a certain shape, and so on. Even when it was a small acorn, this is the form that is proper to it, to which it is in some sense “destined” to grow. “Destined” here does not mean “guaranteed,” for things may go wrong along the way, and the tree may end up stunted, withered, or barren. Nonetheless, if all goes well, the seed will develop into its proper form. Something like this seems to be what Nietzsche has in mind when he speaks of the importance of “becoming what one is.”
One very interesting feature of Nietzsche’s emphasis on this concept is the connection he draws to another concept that seems to be important to his positive ethical vision, namely the idea that one should “create oneself.” Contrasting himself and other higher types with “the many” who are concerned with “moral chatter,” Nietzsche says:
We, however, want to become who we are—human beings who are new, unique, incomparable, who give themselves laws, who create themselves! (The Gay Science, 335)
These two ideas—becoming who one is, and creating oneself—seem on the face of it to stand in some tension with one another. For the notion of becoming who one is implies that one has a particular determinate essential nature, a nature that one will ideally come to fulfil, just as the acorn in the right conditions can grow to reveal its proper and fullest form, that of the oak tree. But the concept of creating oneself, by contrast, seems to conflict with this sort of essence-based destiny. The notions of creation and creativity that Nietzsche invokes here seem to imply that the endpoint of the process is not fixed ahead of time; instead, there seems to be scope for free choice, for different possible outcomes, perhaps even for arbitrariness.
We can bring the two notions into closer alignment by attending to Nietzsche’s own account of artistic creation. Nietzsche rejects the idea that the artist’s approach is one of “laissez-aller”, letting go; instead, he says:
Every artist knows how far removed this feeling of letting go is from his ‘most natural’ state, the free ordering, placing, disposing and shaping in the moment of ‘inspiration’ – he knows how strictly and subtly he obeys thousands of laws at this very moment, laws that defy conceptual formulation precisely because of their hardness and determinateness. (Beyond Good and Evil, 188)
Artistic creation, then, is precisely not about arbitrary choice, but is rather a sort of activity in accordance with necessity. (We can imagine an artist, having been asked why he chose to compose a painting in a particular way, replying: “I didn’t choose it—it had to be that way, otherwise the painting wouldn’t have worked!”) And indeed, immediately following the remark about human beings “creating themselves” in The Gay Science, Nietzsche continues:
To that end we must become the best students and discoverers of everything lawful and necessary in the world: we must become physicists in order to become creators in this sense – while hitherto all valuations and ideals have been built on ignorance of physics or in contradiction to it. (The Gay Science, 335)
Nietzsche wants us to understand the process of creation, then, as intimately connected to notions of necessity and law-governed activity. Just as the great artist is not making arbitrary choices but rather responding to their understanding of the unstated (and unstatable) aesthetic laws that govern how things must be done in this particular instance, so too the process of creation through which one creates oneself is not a matter of arbitrary choice but rather of necessity. What marks out an individual’s development as a process of self-creation will thus depend on whether the necessity derives from his own inner nature or from external sources. If the value system that an individual embraces (for instance) is merely a result of his being molded by his surrounding society, the worldview of which he accepts unquestioningly, then he will not count as having created himself, for his character has been shaped by forces outside of him and not by his own internal nature. If, on the other hand, an individual’s character emerges as a result of his own inner necessities, then he will count as having created himself. As we have already seen in the previous section, the idea will not be that a person makes a conscious choice to “create himself” and then goes on to do so, for whether or not this process will take place is not a matter of conscious choice on the part of the individual. Nonetheless, the individual who creates himself has the principle of his own development, and his own character, within himself—within his inner nature. In this way, Nietzsche’s key concepts of authenticity (being who one is) and self-creation do indeed turn out to be intimately connected.
d. Affirmation
Perhaps the most fundamental part of Nietzsche’s positive ethical vision is his notion of “affirmation”. The flourishing individual, according to Nietzsche, will “say yes” to life—he will embrace and celebrate earthly existence, with all its suffering and hardships. Connected to this notion of affirmation are two other key Nietzschean concepts—amor fati, or love of (one’s) fate, and the notion of “eternal recurrence”:
My formula for human greatness is amor fati: that you do not want anything to be different, not forwards, not backwards, not for all eternity. (Ecce Homo, “Why I Am So Clever,” 10)
The notion of affirmation should be understood by way of contrast with the worldview of the morality that we have seen under attack in the critical part of Nietzsche’s project. Morality, as we have seen, involves a commitment to “life-denying” values: the earthly reality of human existence, and the suffering and pain it involves, is given a fundamentally negative evaluation, so that the only things that have a positive value are the promise of an afterlife in another world (in the religious iteration of the worldview), and the absence of suffering (in the secular version). The life-denying nature of these values is what threatens a descent into nihilism. Nietzsche’s positive ethical vision, by contrast, calls for an embracing of earthly life, including all of its suffering and pain.
The difficulty of Nietzsche’s ethical demand here should not be underestimated. To truly “say yes” to life, to “love one’s fate,” it is not enough simply to tolerate the difficulties and suffering for the sake of the greatness that comes along with them. Instead, one must actively love all aspects and moments of one’s life—to the extent of willing that one’s whole life, even the lowest lows, be repeated through all eternity. This is the notion of “eternal recurrence” or “eternal return”.
Some of Nietzsche’s unpublished remarks present the notion of eternal recurrence as a cosmological thesis to the effect that time is cyclical, so that everything that has happened will continue to repeat eternally. However, the emphasis within the published works is rather on eternal recurrence as a sort of test of affirmation: the point is to consider how one would react if one learnt that one’s life would repeat eternally—and this is the use of the concept that scholars have for the most part focused on. It is generally agreed that Nietzsche was not claiming that everything will in fact recur eternally.
This notion of eternal recurrence shows up in numerous places in the published works. In The Gay Science, Nietzsche says:
What if some day or night a demon were to steal into your loneliest loneliness and say to you: ‘This life as you now live it and have lived it you will have to live once again and innumerable times again; and there will be nothing new in it, but every pain and every joy and every thought and sigh and everything unspeakably small or great in your life must return to you, all in the same succession and sequence—even this spider and this moonlight between the trees, and even this moment and I myself. […]’ Would you not throw yourself down and gnash your teeth and curse the demon who spoke thus? Or have you once experienced a tremendous moment when you would have answered him: ‘You are a god, and never have I heard anything more divine.’ (The Gay Science, 341)
Eternal recurrence is also the central teaching of the prophet-like figure of Zarathustra in Thus Spoke Zarathustra (compare Nietzsche’s own discussion of Zarathustra in Ecce Homo). However, even Zarathustra himself finds it incredibly difficult to achieve the state of sincerely willing the eternal recurrence. Nietzsche seemed to think that this test of affirmation would be very difficult (perhaps impossible) for people, even truly great individuals, to pass. Nonetheless, this is the state of being that would be genuinely and fully opposed to the life-denying values of morality, and to the nihilism that follows in their wake.
3. References and Further Reading
This article draws primarily on Nietzsche’s published work from the 1880s. References to primary texts within the body of the article are to section numbers rather than page numbers.
a. Primary Texts
Daybreak.
The Gay Science.
Thus Spoke Zarathustra.
Beyond Good and Evil.
On the Genealogy of Morality.
Twilight of the Idols.
The Antichrist.
Ecce Homo.
b. Secondary Texts
Acampora, Christa Davis. “On Sovereignty and Overhumanity: Why It Matters How We Read Nietzsche’s Genealogy II:2.” In Christa Davis Acampora (ed.) Nietzsche’s On the Genealogy of Morals: Critical Essays. Lanham, MD: Rowman & Littlefield, pp. 147–162, 2006.
Clark, Maudemarie and David Dudrick. The Soul of Nietzsche’s Beyond Good and Evil. Cambridge: Cambridge University Press, 2012.
Foot, Philippa. “Nietzsche’s Immoralism.” In Richard Schacht (ed.) Nietzsche, Genealogy, Morality: Essays on Nietzsche’s On the Genealogy of Morals. Berkeley: University of California Press, 1994.
Foot, Philippa. Natural Goodness, Oxford: Oxford University Press, 2001.
Gemes, Ken. “Nietzsche on Free Will, Autonomy and the Sovereign Individual”. In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 33–50, 2009.
Huddleston, Andrew. Nietzsche on the Decadence and Flourishing of Culture. Oxford: Oxford University Press, 2019.
Hurka, Thomas. “Nietzsche: Perfectionist.” In Brian Leiter and Neil Sinhababu (eds.), Nietzsche and Morality, Oxford: Oxford University Press, pp. 9–31, 2007.
Janaway, Christopher. “Nietzsche on Free Will, Autonomy and the Sovereign Individual.” Aristotelian Society Supplementary Volume 80, pp. 339–357, 2006.
Janaway, Christopher. “Nietzsche on Morality, Drives, and Human Greatness.” In Christopher Janaway and Simon Robertson (eds.) Nietzsche, Naturalism, and Normativity. Oxford: Oxford University Press, pp. 183–201, 2012.
Katsafanas, Paul. The Nietzschean Self: Moral Psychology, Agency, and the Unconscious. Oxford: Oxford University Press, 2016.
Kirwin, Claire. “Pulling Oneself Up by the Hair: Understanding Nietzsche on Freedom.” Inquiry, vol. 61, pp. 82–99, 2017.
Leiter, Brian. Nietzsche on Morality, Second Edition, Oxford: Routledge, 2015 (First Edition published as Routledge Philosophy Guidebook to Nietzsche on Morality, Routledge, 2002).
Leiter, Brian. “Who Is the ‘Sovereign Individual’? Nietzsche on Freedom.” In Simon May (ed.), Nietzsche’s On the Genealogy of Morality: A Critical Guide. Cambridge: Cambridge University Press, pp. 101–119, 2011.
Leiter, Brian. Moral Psychology with Nietzsche, Oxford: Oxford University Press, 2019.
May, Simon. “Nihilism and the Free Self.” In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 89–106, 2009.
May, Simon. (ed.) Nietzsche’s On the Genealogy of Morality: A Critical Guide. Cambridge: Cambridge University Press, 2011.
Reginster, Bernard. The Affirmation of Life: Nietzsche on Overcoming Nihilism, Cambridge, MA: Harvard University Press, 2006.
Riccardi, Mattia. Nietzsche’s Philosophical Psychology, Oxford: Oxford University Press, 2021.
Richardson, John. “Nietzsche’s Freedoms.” In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 127–150, 2009.
Ridley, Aaron. “What the Sovereign Individual Promises.” In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 181–196, 2009.
Rukgaber, Matthew. “The ‘Sovereign Individual’ and the ‘Ascetic Ideal’: On a Perennial Misreading of the Second Essay of Nietzsche’s On the Genealogy of Morality.” Journal of Nietzsche Studies, Vol. 43 (2), pp. 213–239, 2012.
Von Tevenar, Gudrun. “Nietzsche’s Objections to Pity and Compassion.” In Gudrun von Tevenar (ed.) Nietzsche and Ethics. Bern: Peter Lang, pp. 263–282, 2007.
Williams, Bernard. Review of Nietzsche on Tragedy, by M. S. Silk and J. P. Stern; Nietzsche: A Critical Life, by Ronald Hayman; and Nietzsche, vol. 1: The Will to Power as Art, by Martin Heidegger, translated by David Farrell Krell. London Review of Books, 1981. Reprinted in his Essays and Reviews 1959–2002, Princeton: Princeton University Press, 2014.
A contrary-to-duty obligation is an obligation telling us what ought to be the case if something that is wrong is true. For example: ‘If you have done something bad, you should make amends’. Doing something bad is wrong, but if it is true that you did do something bad, it ought to be the case that you make amends. Here are some other examples: ‘If he is guilty, he should confess’, ‘If you have hurt your friend, you should apologise to her’, ‘If she will not keep her promise to him, she ought to call him’, ‘If the books are not returned by the due date, you must pay a fine’. Alternatively, we might say that a contrary-to-duty obligation is a conditional obligation where the condition (in the obligation) is forbidden, or where the condition is fulfilled only if a primary obligation is violated. In the first example, he should not be guilty; but if he is, he should confess. You should not have hurt your friend; but if you have, you should apologise. She should keep her promise to him; but if she will not, she ought to call him. The books ought to be returned by the due date; but if they are not, you must pay a fine.
Contrary-to-duty obligations are important in our moral and legal thinking. They turn up in discussions concerning guilt, blame, confession, restoration, reparation, punishment, repentance, retributive justice, compensation, apologies, damage control, and so forth. The rationale of a contrary-to-duty obligation is the fact that most of us do neglect our primary duties from time to time and yet it is reasonable to believe that we should make the best of a bad situation, or at least that it matters what we do when this is the case.
We want to find an adequate symbolisation of such obligations in some logical system. However, it has turned out to be difficult to do that. This is shown by the so-called contrary-to-duty (obligation) paradox, sometimes called the contrary-to-duty imperative paradox. The contrary-to-duty paradox arises when we try to formalise certain intuitively consistent sets of ordinary language sentences, sets that include at least one contrary-to-duty obligation sentence, by means of ordinary counterparts available in various monadic deontic logics, such as the so-called Standard Deontic Logic and similar systems. In many of these systems the resulting sets are inconsistent in the sense that it is possible to deduce contradictions from them, or else they violate some other intuitively plausible condition, for example that the members of the sets should be independent of each other. This article discusses this paradox and some solutions that have been suggested in the literature.
1. The Contrary-to-Duty Paradox
Roderick Chisholm was one of the first philosophers to address the contrary-to-duty (obligation or imperative) paradox (Chisholm (1963)). Since then, many different versions of this puzzle have been mentioned in the literature (see, for instance, Powers (1967), Åqvist (1967, 2002), Forrester (1984), Prakken and Sergot (1996), Carmo and Jones (2002), and Rönnedal (2012, pp. 61–66) for some examples). Here we discuss a particular version of a contrary-to-duty (obligation) paradox that involves promises; we call this example ‘the promise (contrary-to-duty) paradox’. Most of the things we say about this particular example can be applied to other versions. But we should keep in mind that different contrary-to-duty paradoxes might require different solutions.
Scenario I: The promise (contrary-to-duty) paradox (After Prakken and Sergot (1996))
Consider the following scenario. It is Monday and you promise a friend to meet her on Friday to help her with some task. Suppose, further, that you always meet your friend on Saturdays. In this example the following sentences all seem to be true:
N-CTD
N1. (On Monday it is true that) You ought to keep your promise (and see your friend on Friday).
N2. (On Monday it is true that) It ought to be that if you keep your promise, you do not apologise (when you meet your friend on Saturday).
N3. (On Monday it is true that) If you do not keep your promise (that is, if you do not see your friend on Friday and help her out), you ought to apologise (when you meet her on Saturday).
N4. (On Monday it is true that) You do not keep your promise (on Friday).
Let N-CTD = {N1, N2, N3, N4}. N3 is a contrary-to-duty obligation (or expresses a contrary-to-duty obligation). If the condition is true, the primary obligation that you should keep your promise (expressed by N1) is violated. N-CTD seems to be consistent as it does not seem possible to derive any contradiction from this set. Nevertheless, if we try to formalise N-CTD in so-called Standard Deontic Logic, for instance, we immediately encounter some problems. Standard Deontic Logic is a well-known logical system described in most introductions to deontic logic (for example, Gabbay, Horty, Parent, van der Meyden and van der Torre (eds.) (2013, pp. 36–39)). It is basically a normal modal system of the kind KD (Chellas (1980)). In Åqvist (2002) this system is called OK+. For introductions to deontic logic, see Hilpinen (1971, 1981), Wieringa and Meyer (1993), McNamara (2010), and Gabbay et al. (2013). Consider the following symbolisation:
SDL-CTD
SDL1 Ok
SDL2 O(k → ¬a)
SDL3 ¬k → Oa
SDL4 ¬k
O is a sentential operator that takes a sentence as argument and gives a sentence as value. ‘Op’ is read ‘It ought to be (or it should be) the case that (or it is obligatory that) p’. ¬ is standard negation and → standard material implication, well known from ordinary propositional logic. In SDL-CTD, k is a symbolisation of ‘You keep your promise (meet your friend on Friday and help her with her task)’ and a abbreviates ‘You apologise (to your friend for not keeping your promise)’. In this symbolisation SDL1 is supposed to express a primary obligation and SDL3 a contrary-to-duty obligation telling us what ought to be the case if the primary obligation is violated. However, the set SDL-CTD = {SDL1, SDL2, SDL3, SDL4} is not consistent in Standard Deontic Logic. O¬a is entailed by SDL1 and SDL2, and from SDL3 and SDL4 we can derive Oa. Hence, we can deduce the following formula from SDL-CTD: Oa ∧ O¬a (‘It is obligatory that you apologise and it is obligatory that you do not apologise’), which directly contradicts the so-called axiom D, the schema ¬(OA ∧ O¬A). (∧ is the ordinary symbol for conjunction.) ¬(OA ∧ O¬A) is included in Standard Deontic Logic (usually as an axiom). Clearly, this sentence rules out explicit moral dilemmas. Since N-CTD seems to be consistent, while SDL-CTD is inconsistent, something must be wrong with our formalisation, with Standard Deontic Logic or with our intuitions. In a nutshell, this puzzle is the contrary-to-duty (obligation) paradox.
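To make the inconsistency fully explicit, here is a sketch of the derivation. It uses only resources already mentioned above: the premises SDL1–SDL4, propositional logic, the axiom D, and the so-called K-schema, O(A → B) → (OA → OB), which holds in every normal deontic system.
1. Ok (SDL1)
2. O(k → ¬a) (SDL2)
3. O¬a (from 1 and 2 by the K-schema and modus ponens)
4. ¬k (SDL4)
5. ¬k → Oa (SDL3)
6. Oa (from 4 and 5 by modus ponens)
7. Oa ∧ O¬a (from 3 and 6 by conjunction)
8. ¬(Oa ∧ O¬a) (axiom D)
9. Contradiction (from 7 and 8)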
2. Solutions to the Paradox
Many different solutions to the contrary-to-duty paradox have been suggested in the literature. We can try to find some alternative formalisation of N-CTD, we can try to develop some other kind of deontic logic, or we can try to show why at least some of our intuitions about N-CTD are wrong. The various solutions can be divided into five categories: quick solutions, operator solutions, connective solutions, action or agent solutions, and temporal solutions. These categories can in turn be divided into several subcategories. Various answers to the puzzle are often presented as general solutions to all the different kinds of contrary-to-duty paradox; if some proposal takes care of them all, this is a strong reason to accept that solution. Having said that, it might be the case that the same approach cannot be used to solve every kind of contrary-to-duty paradox.
a. Quick Solutions
In this section, we consider some quick responses to the contrary-to-duty paradox. There are at least three types of replies of this kind: (1) We can reject some axiom schemata or rules of inference in Standard Deontic Logic that are necessary to derive our contradiction. (2) We can try to find some alternative formalisation of N-CTD in monadic deontic logic. (3) We can bite the bullet and reject some of the original intuitions that seem to generate the paradox in the first place.
Few people endorse any of these solutions. Still, it is interesting to say a few words about them since they reveal some of the problems with finding an adequate symbolisation of contrary-to-duty obligations. If possible, we want to be able to solve these problems.
One way of avoiding the contrary-to-duty paradox in monomodal deontic systems is to give up the axiom D, ¬(OA ∧ O¬A) (‘It is not the case that it is obligatory that A and obligatory that not-A’). Without this axiom (or something equivalent), it is no longer possible to derive a contradiction from SDL1−SDL4. In the so-called smallest normal deontic system K (Standard Deontic Logic without the axiom D), for instance, SDL-CTD is consistent. Some might think that there are independent reasons for rejecting D since they think there are, or could be, genuine moral dilemmas. Yet, even if this were true (which is debatable), rejecting D does not seem to be a good solution to the contrary-to-duty paradox for several reasons.
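Before turning to those reasons, it may help to see why SDL-CTD is consistent in the system K. Here is a minimal countermodel sketch, assuming the standard Kripke semantics for normal systems (on which OA is true at a world just in case A is true at every world deontically accessible from it). Take a model containing a single world w at which k is false, and let no world be accessible from w. Then every formula of the form OA (including Ok, O(k → ¬a), and Oa) is vacuously true at w, so SDL1, SDL2, and SDL3 all hold there; and SDL4 holds since ¬k is true at w. The axiom D fails at w precisely because w has no accessible world, which is why this model is admissible in K but not in Standard Deontic Logic.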
Firstly, even if we reject axiom D, it is problematic to assume that a dilemma follows from N-CTD. We can still derive the sentence Oa ∧ O¬a, which says that it is obligatory that you apologise and obligatory that you do not apologise, from SDL-CTD in every normal deontic system. And this proposition does not seem to follow from N-CTD. Ideally, we want our solution to the paradox to be dilemma-free in the sense that it is not possible to derive any dilemma of the form OA ∧ O¬A from our symbolisation of N-CTD.
Secondly, in every so-called normal deontic logic (even without the axiom D), we can derive the conclusion that everything is both obligatory and forbidden if there is at least one moral dilemma. This follows from the fact that FA (‘It is forbidden that A’) is equivalent to O¬A (‘It is obligatory that not-A’) and the fact that Oa ∧ O¬a entails Or for any r in every normal deontic system. This is clearly absurd. N-CTD does not seem to entail that everything is both obligatory and forbidden. Everything else equal, we want our solution to the contrary-to-duty paradox to avoid this consequence.
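A sketch of this derivation may be helpful. It assumes two principles that hold in every normal deontic system: the aggregation schema (OA ∧ OB) → O(A ∧ B), and the monotonicity rule licensing the inference from a theorem A → B to OA → OB.
1. Oa ∧ O¬a (derivable from SDL-CTD, as noted above)
2. O(a ∧ ¬a) (from 1 by aggregation)
3. (a ∧ ¬a) → r (propositional tautology, for an arbitrary sentence r)
4. O(a ∧ ¬a) → Or (from 3 by monotonicity)
5. Or (from 2 and 4 by modus ponens)
Since r is arbitrary, everything is obligatory; and since Fr is equivalent to O¬r, it follows in the same way that everything is forbidden.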
Thirdly, such a solution still has problems with the so-called pragmatic oddity (see below, this section).
In monomodal deontic logic, for instance Standard Deontic Logic, we can solve the contrary-to-duty paradox by finding some other formalisation of the sentences in N-CTD. Instead of SDL2 we can use k → O¬a and instead of SDL3 we can use O(¬k → a). Then we obtain three consistent alternative symbolisations of N-CTD. Nonetheless, these alternatives are not non-redundant (a set of sentences is non-redundant only if no member in the set follows from the rest). O(¬k → a) follows from Ok in every so-called normal deontic logic, including Standard Deontic Logic, and k → O¬a follows from ¬k by propositional logic. But, intuitively, N3 does not appear to follow from N1, and N2 does not appear to follow from N4. N-CTD seems to be non-redundant in that it seems to be the case that no member of this set is derivable from the others. Therefore, we want our symbolisation of N-CTD to be non-redundant.
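Sketches of the two derivations behind this redundancy may be useful. The first assumes only propositional logic, the rule of necessitation (if A is a theorem, so is OA), and the K-schema, all available in every normal deontic system; the second assumes nothing beyond propositional logic.
From Ok to O(¬k → a):
1. k → (¬k → a) (propositional tautology)
2. O(k → (¬k → a)) (from 1 by necessitation)
3. Ok → O(¬k → a) (from 2 by the K-schema)
4. Ok (SDL1)
5. O(¬k → a) (from 3 and 4 by modus ponens)
From ¬k to k → O¬a:
1. ¬k (SDL4)
2. k → O¬a (from 1 by propositional logic, since a material conditional with a false antecedent is true)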
The so-called pragmatic oddity is a problem for many possible solutions to the contrary-to-duty paradox, including our original symbolisation in Standard Deontic Logic, that is, SDL-CTD, the same symbolisation in the smallest normal deontic system K, and the one that uses k → O¬a instead of O(k → ¬a). In every normal deontic logic (with or without the axiom D), it is possible to derive the following sentence from SDL-CTD: O(k ∧ a), which says that it is obligatory that you keep your promise and apologise (for not keeping your promise). Several solutions that use bimodal alethic-deontic logic or counterfactual deontic logic (see Section 2c) as well as Castañeda’s solution (see Section 2d), for instance, also have this problem. The sentence O(k ∧ a) is not inconsistent, but it is certainly very odd, and it does not appear to follow from N-CTD that you should keep your promise and apologise. Hence, we do not want our formalisation of N-CTD to entail this counterintuitive conclusion or anything similar to it.
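Here is a sketch of how the pragmatic oddity arises from SDL-CTD, assuming only modus ponens and the aggregation schema (OA ∧ OB) → O(A ∧ B), which holds in every normal deontic system with or without the axiom D:
1. Ok (SDL1)
2. Oa (from SDL3 and SDL4 by modus ponens)
3. Ok ∧ Oa (from 1 and 2 by conjunction)
4. O(k ∧ a) (from 3 by aggregation)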
One final quick solution is to reject some intuition. The set of sentences N-CTD in natural language certainly seems to be consistent and non-redundant, it seems to be dilemma-free, and it does not seem to entail the pragmatic oddity or the proposition that everything is both obligatory and forbidden. One possible solution to the contrary-to-duty paradox, then, obviously, is to reject some of these intuitions about this set. If it is not consistent and non-redundant, for instance, there is nothing puzzling about the fact that our set of formalised sentences (for example SDL-CTD) lacks one or both of these properties. In fact, if this is the case, the symbolisation should be inconsistent and/or redundant.
The problem with this solution is, of course, that our intuitions seem reliable. N-CTD clearly seems to be consistent, non-redundant, and so forth. And we do not appear to have any independent reasons for rejecting these intuitions. It might be the case that sometimes when we use contrary-to-duty talk, we really are inconsistent or redundant, for instance. Still, that does not mean that we are always inconsistent or redundant. If N-CTD or some other set of this kind is consistent, non-redundant, and so on, we cannot use this kind of solution to solve all contrary-to-duty paradoxes. Furthermore, it seems that we should not reject our intuitions if there is some better way to solve the contrary-to-duty paradox. So, let us turn to the other solutions. (For more information on quick solutions to the contrary-to-duty paradox, see Rönnedal (2012, pp. 67–98).)
b. Operator Solutions
We shall begin by considering the operator solution. The basic idea behind this kind of solution is that the contrary-to-duty paradox, in some sense, involves different kinds of obligations or different kinds of ‘ought-statements’. Solutions of this type have, for example, been discussed by Åqvist (1967), Jones and Pörn (1985), and Carmo and Jones (2002).
In Standard Deontic Logic a formula of the form OA ∧ O¬A is derivable from SDL-CTD; but OA ∧ O¬A is not consistent with the axiom D. If, however, there are different kinds of obligations, symbolised by distinct obligation operators, it may be possible to formalise our contrary-to-duty scenarios so as to avoid a contradiction. Suppose, for example, that there are two obligation operators O1 and O2 that represent ideal and actual obligations, respectively. Then, instead of Oa ∧ O¬a, we may derive the formula O1¬a ∧ O2a from the symbolisation of our scenarios. But O1¬a ∧ O2a is not inconsistent with the axiom D; O1¬a ∧ O2a says that it is ‘ideally-obligatory’ that you do not apologise and it is ‘actually-obligatory’ that you apologise. If we cannot derive any other formula of the form OA ∧ O¬A, it is no longer possible to derive a contradiction from our formalisation. Furthermore, such a solution seems to be dilemma-free, and it does not seem to be possible to derive the conclusion that everything is both obligatory and forbidden from a set of sentences that introduces different kinds of obligations.
An example: Carmo and Jones’s operator solution
Perhaps the most sophisticated version of this kind of solution is presented by Carmo and Jones (2002). Let us now discuss their answer to the contrary-to-duty paradox to illustrate this basic approach. To understand their view, we must first explain some formal symbols. Carmo and Jones use a dyadic, conditional obligation operator O(…/…) to represent conditional obligations. Intuitively, ‘O(B/A)’ says that in any context in which A is a fixed or unalterable fact, it is obligatory that B, if this is possible. They use two pairs of monadic modal operators: □ₐ and ◇ₐ, and □ₚ and ◇ₚ. Intuitively, □ₐ is intended to capture that which—in a particular situation—is actually fixed, or unalterable, given (among other factors) what the agents concerned have decided to do and not to do. So, □ₐA says that it is actually fixed or unalterable that A. ◇ₐ is the dual (possibility operator) of □ₐ. Intuitively, □ₚ is intended to capture that which—in a particular situation—is not only actually fixed, but would still be fixed even if different decisions had been made, by the agents concerned, regarding how they were going to behave. So, □ₚA says that it is necessary, fixed or unalterable that A, no matter what the agents concerned intend to do or not to do. ◇ₚ is the dual (possibility operator) of □ₚ. They also introduce two kinds of derived obligation sentences, OaB and OiB, pertaining to actual obligations and ideal obligations, respectively. OaB is read ‘It is actually obligatory that B’ or ‘It actually ought to be the case that B’, and OiB is read ‘It is ideally obligatory that B’ or ‘It ideally ought to be the case that B’. T is (the constant) Verum; it is equivalent to some logically true sentence (such as ‘it is not the case that p and not-p’). In short, we use the following symbols:
O(B/A) In any context in which A is fixed, it is obligatory that B, if this is possible.
OaB It is actually obligatory that B.
OiB It is ideally obligatory that B.
◇ₐA It is actually possible that A.
◇ₚA It is potentially possible that A.
□ₐA It is not actually possible that not-A.
□ₚA It is not potentially possible that not-A.
T Verum
Before we consider Carmo and Jones’s actual solution to the contrary-to-duty paradoxes, let us say a few words about the formal properties of various sentences in their language. For more on the syntax and semantics of Carmo and Jones’s system, see Carmo and Jones (2002). □ₚ (and ◇ₚ) is a normal modal operator of the kind KT, and □ₐ (and ◇ₐ) is a normal modal operator of the kind KD (Chellas (1980)). □ₚA is stronger than □ₐA, and ◇ₐA is stronger than ◇ₚA. There is, according to Carmo and Jones, an intimate conceptual connection between the two notions of derived obligation, on the one hand, and the two notions of necessity/possibility, on the other. The system includes □ₐ(A ↔ B) → (OaA ↔ OaB) and □ₚ(A ↔ B) → (OiA ↔ OiB), for example. The system also contains the following restricted forms of so-called factual detachment: (O(B/A) ∧ □ₐA ∧ ◇ₐB ∧ ◇ₐ¬B) → OaB, and (O(B/A) ∧ □ₚA ∧ ◇ₚB ∧ ◇ₚ¬B) → OiB. We can now symbolise N-CTD in the following way:
O-CTD
O1 O(k/T)
O2 O(¬a/k)
O3 O(a/¬k)
O4 ¬k
We use the same propositional letters as in Section 1. Furthermore, we assume that the following ‘facts’ hold: □ₐ¬k, ◇ₚ(k ∧ ¬a), ◇ₚ(k ∧ a), and ¬a ∧ ◇ₐa ∧ ◇ₐ¬a. In other words, we assume that you decide not to keep your promise, but that it is potentially possible for you to keep your promise and not apologise and potentially possible for you to keep your promise and apologise, and that you have not in fact apologised, although it is still actually possible that you apologise and actually possible that you do not apologise. From this, we can derive the following sentences in Carmo and Jones’s system: Oi(k ∧ ¬a) and Oaa; that is, ideally it ought to be that you keep your promise (and help your friend) and do not apologise, but it is actually obligatory that you apologise. Furthermore, the obligation to keep your promise is violated and the ideal obligation to keep your promise and not apologise is also violated. Still, we cannot derive any contradiction. From Oi(k ∧ ¬a) we cannot derive any actual obligation not to apologise. Consequently, we can avoid the contrary-to-duty paradox.
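To illustrate how the system works, here is a sketch of the detachment of the actual obligation to apologise (the derivation of Oi(k ∧ ¬a) requires further resources of Carmo and Jones’s system and is omitted here):
1. O(a/¬k) (O3)
2. □ₐ¬k (assumed fact: you have decided not to keep your promise)
3. ◇ₐa ∧ ◇ₐ¬a (assumed facts: it is still actually possible that you apologise and that you do not)
4. (O(a/¬k) ∧ □ₐ¬k ∧ ◇ₐa ∧ ◇ₐ¬a) → Oaa (instance of the restricted factual detachment schema for actual obligations)
5. Oaa (from 1–4 by modus ponens)
Note that no actual obligation not to apologise can be detached from O2 in the same way: that would require □ₐk, which is inconsistent with the assumed fact □ₐ¬k, given that □ₐ validates the D-schema.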
Arguments for Carmo and Jones’s operator solution
According to Carmo and Jones, any adequate solution to the contrary-to-duty paradox should satisfy certain requirements. The representation of N-CTD (and similar sets of sentences) should be: (i) consistent, and (ii) non-redundant, in the sense that the formalisations of the members of N-CTD should be logically independent. The solution should be (iii) applicable to (at least apparently) action- and timeless contrary-to-duty examples (see Section 2d and Section 2e for some examples). (iv) The logical structures of the two conditional obligations in N-CTD (and similar sets of sentences) should be analogous. Furthermore, we should have (v) the capacity to derive actual and (vi) ideal obligations (from (the representation of) N-CTD), (vii) the capacity to represent the fact that a violation of an obligation has occurred, and (viii) the capacity to avoid the pragmatic oddity (see Section 2a above for a description of this problem). Finally, (ix) the assignment of logical form to a sentence in a contrary-to-duty scenario should be independent of the assignment of logical form to the other sentences. Carmo and Jones’s solution satisfies all of these requirements. This is a good reason to accept their approach. Nevertheless, there are also some serious problems with the suggested solution. We now consider two puzzles.
Arguments against Carmo and Jones’s operator solution
Even though Carmo and Jones’s operator solution is quite interesting, it has not generated much discussion. In this section, we consider two arguments against their solution that have not been mentioned in the literature.
Argument 1. Carmo and Jones postulate several different unconditional operators. But ‘ought’ (and ‘obligatory’) does not seem to be ambiguous in the sense the solution suggests. The derived ‘ideal’ obligation to keep the promise and not to apologise does not seem to be of a different kind from the derived ‘actual’ obligation to apologise. The ‘ideal’ obligation is an ordinary unconditional obligation to keep your promise and not apologise that holds as long as it is still possible for you to keep your promise and not apologise. And the ‘actual’ obligation is an ordinary unconditional obligation that becomes ‘actual’ as soon as it is settled that you will not keep your promise. Both obligations are unconditional and both obligations are action guiding. The ‘ought’ in the sentence ‘You ought to keep your promise and not apologise’ does not have another meaning than the ‘ought’ in the sentence ‘You ought to apologise’. The only difference between the obligations is that they are in force at different times. Or, at least, so it seems. Furthermore, if the conditional obligation sentences N2 and N3 should be symbolised in the same way, so that they have the same logical form, as Carmo and Jones seem to think, then it also seems reasonable to assume that the derived unconditional obligation sentences should be symbolised by the same kind of operator.
Argument 2. Carmo and Jones speak about two kinds of obligations: actual obligations and ideal obligations. But it is unclear which of these, if either, they think is action guiding. We have the following alternatives:
(i) Both actual and ideal obligations are action guiding.
(ii) Neither actual nor ideal obligations are action guiding.
(iii) Ideal but not actual obligations are action guiding.
(iv) Actual but not ideal obligations are action guiding.
Yet, all of these alternatives are problematic. It seems that (i) cannot be true. For in Carmo and Jones’s system, we can derive Oi(k ∧ ¬a) and Oaa from the symbolisation of N-CTD. Still, there is no possible world in which it is true both that you keep your promise and not apologise and that you apologise. How, then, can both actual and ideal obligations be action guiding? If we assume that neither actual nor ideal obligations are action guiding, we can avoid this problem, but then the value of Carmo and Jones’s solution is seriously limited. We want, in every situation, to know what we (actually) ‘ought to do’ in a sense of ‘ought to do’ that is action guiding. Nevertheless, according to (ii), neither ideal nor actual obligations are action guiding. On this reading of the text, Carmo and Jones’s system cannot give us any guidance; it does not tell us what we ‘ought to do’ in what seems to be the most interesting sense of this expression. True, the solution does say something about ideal and actual obligations, but why should we care about that? So, (ii) does not appear to be defensible. If it is the ideal and not the actual obligations that are supposed to be action guiding, it is unclear what the purpose of speaking about ‘actual’ obligations is. If actual obligations are supposed to have no influence on our behaviour, they seem to be redundant and serve no function. Moreover, if this is true, why should we call obligations of this kind ‘actual’? Hence, (iii) does not appear to be true either. The only reasonable alternative, therefore, seems to be to assume that it is the actual and not the ideal obligations that are action guiding. Yet, this assumption is also problematic, since it has some counterintuitive consequences. If you form the intention not to keep your promise, that is, if you decide not to help your friend, then your actual obligation, according to Carmo and Jones, is to apologise. You have an ideal obligation to keep your promise and not apologise, but this obligation is not action guiding. So, it is not the case that you ought to keep your promise and not apologise in a sense that is supposed to have any influence on your behaviour. However, intuitively, it seems to be true that you ought to keep your promise and not apologise as long as you still can keep your promise; as long as this is still (potentially) possible, this seems to be your ‘actual’ obligation, the obligation that is action guiding. As long as you can help your friend (and not apologise), you do not seem to have an actual (action-guiding) obligation to apologise. The fact that you have decided not to keep your promise does not take away your (actual, action-guiding) obligation to keep your promise (and not apologise); you can still change your mind. We cannot avoid our obligations just by forming the intention not to fulfil them. This would make it too easy to get rid of one’s obligations. Consequently, it seems that (iv) is not true either. And if this is the case, Carmo and Jones’s solution is in deep trouble, despite its many real virtues.
c. Connective Solutions
We turn now to our second category of solutions to the contrary-to-duty paradox. In Section 1, we interpreted the English construction ‘if, then’ as material implication. But there are many other possible readings of this expression. According to the connective solutions to the contrary-to-duty paradox, ‘if, then’ should be interpreted in some other way, not as a material implication. The category includes at least four subcategories: (1) the modal (or strict implication) solution according to which ‘if, then’ should be interpreted as strict or necessary implication; (2) the counterfactual (or subjunctive) solution according to which ‘if, then’ should be interpreted as some kind of subjunctive or counterfactual conditional; (3) the non-monotonic solution according to which we should use some kind of non-monotonic logic to symbolise the expression ‘if, then’; and (4) the (primitive) dyadic deontic solution according to which we should develop a new kind of dyadic deontic logic with a primitive, two-place sentential operator that can be used to symbolise conditional norms.
According to the first solution, which we call the modal solution, ‘if, then’ should be interpreted as strict, that is, necessary implication, not as material implication. N2 should, for example, be symbolised in the following way: k => O¬a (or perhaps as O(k => ¬a)), and N3 in the following way: ¬k => Oa (or perhaps as O(¬k => a)), where => stands for strict implication and the propositional letters are interpreted as in Section 1. A => B is logically equivalent to □(A → B) in most modal systems. □ is a sentential operator that takes one sentence as argument and gives one sentence as value. ‘□A’ says that it is necessary that A. The set {Ok, k => O¬a, ¬k => Oa, ¬k} is consistent in some alethic deontic systems (systems that combine deontic and modal logic). So, if we use this symbolisation, it might be possible to avoid the contrary-to-duty paradox. A solution of this kind is discussed by Mott (1973), even though Mott seems to prefer the counterfactual solution. For more on this kind of approach and for some problems with it, see Rönnedal (2012, pp. 99–102).
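The consistency claim can be checked against a concrete model. The following Python sketch, a toy Kripke model of our own (not taken from Mott or Rönnedal), evaluates the four sentences at the world w0 with a universal alethic accessibility relation and a hand-picked deontic accessibility relation; all four come out true.

# Worlds as (k, a) truth-value pairs; W is the (universal) alethic domain.
w0, w1, w2, w3 = (False, False), (False, True), (True, False), (True, True)
W = [w0, w1, w2, w3]
deont = {w0: [w3], w1: [w3], w2: [w2], w3: [w2]}   # hand-picked deontic accessibility

k = lambda w: w[0]
a = lambda w: w[1]

def O(p, w): return all(p(v) for v in deont[w])              # deontic box
def strict(p, q): return all((not p(v)) or q(v) for v in W)  # p => q, i.e. box(p -> q)

print(O(k, w0))                                        # Ok at w0: True
print(strict(k, lambda v: O(lambda u: not a(u), v)))   # k => O~a: True
print(strict(lambda v: not k(v), lambda v: O(a, v)))   # ~k => Oa: True
print(not k(w0))                                       # ~k at w0: True
print(O(lambda v: k(v) and a(v), w0))                  # O(k & a) at w0: True

Note that the model also verifies O(k ∧ a) at w0, the pragmatic oddity; consistency is thus bought at a price.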
According to the second solution, the counterfactual solution, the expression ‘if, then’ should be interpreted as some kind of counterfactual or subjunctive implication. Mott (1973) and Niles (1997), for example, seem to defend a solution of this kind, while Tomberlin (1981) and Decew (1981), for instance, criticise it. We say more about the counterfactual solution below (in this section).
According to the third solution, the non-monotonic solution, we should use some kind of non-monotonic logic to symbolise the expression ‘if, then’. A solution of this kind has been discussed by Bonevac (1998). Bonevac introduces a new, non-monotonic, defeasible or generic conditional, >, a sentential operator that takes two sentences as arguments and gives one sentence as value. A > B is true in a possible world, w, if and only if B holds in all A-normal worlds relative to w. This conditional does not support ordinary modus ponens; that is, B does not follow (monotonically) from A and A > B. It satisfies only defeasible modus ponens: B follows non-monotonically from A and A > B in the absence of contrary information. If we symbolise N2 as O(k > ¬a) (or perhaps as k > O¬a), and N3 as ¬k > Oa (and N1 and N4 as in SDL-CTD), we can no longer derive a contradiction from this set in Bonevac’s system. Taken in isolation, O¬a follows non-monotonically from Ok and O(k > ¬a), and Oa follows non-monotonically from ¬k and ¬k > Oa. But from the whole set {Ok, O(k > ¬a), ¬k > Oa, ¬k} we can only derive Oa non-monotonically, since, according to Bonevac, so-called factual detachment takes precedence over so-called deontic detachment. Hence, we can avoid the contrary-to-duty paradox.
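The following crude Python toy, our own illustration rather than Bonevac’s formal system, mimics the precedence policy just described: a defeasible conditional is detached only if its trigger is established and no established fact is contrary to it, and established facts always defeat merely obligatory triggers.

# Each conditional is (trigger, consequent, kind of detachment it supports).
conditionals = [('k', 'O~a', 'deontic'),     # from Ok and O(k > ~a)
                ('~k', 'Oa', 'factual')]     # from ~k and ~k > Oa
facts = {'~k'}                               # N4: you do not keep your promise
obligs = {'k'}                               # N1 (Ok): the content k is obligatory

def neg(s): return s[1:] if s.startswith('~') else '~' + s

derived = set()
for trig, cons, kind in conditionals:
    if kind == 'factual' and trig in facts:
        derived.add(cons)                    # factual detachment always fires
    elif kind == 'deontic' and trig in obligs and neg(trig) not in facts:
        derived.add(cons)                    # deontic detachment fires only if undefeated

print(derived)   # {'Oa'}: O~a is not detached, since the fact ~k defeats its trigger k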
A potential problem with this kind of solution is that it is not obvious that it can explain the difference between violation and defeat. If you will not see your friend and help her out, the obligation to keep your promise will be violated. It is not the case that this obligation is defeated, overridden or cancelled. The same seems to be true of the derived obligation that you should not apologise. If you do apologise, the derived (unconditional) obligation that you should not apologise is violated. It is not the case that one of the conditional norms in N-CTD defeats or overrides the other. Nor is it the case that they cancel each other out. Or, at least, so it seems. Ideally, we want our solution to reflect the idea that the primary obligation in a contrary-to-duty paradox has been violated and not defeated. Likewise, we want to be able to express the idea that the derived unconditional obligation not to apologise has been violated if you apologise. However, according to Bonevac, we cannot derive O¬a from {Ok, O(k > ¬a), ¬k > Oa, ¬k}, not even non-monotonically. This approach to the contrary-to-duty paradoxes does not appear to have generated much discussion. But the non-monotonic paradigm is interesting and Bonevac’s paper provides a fresh view on the paradox.
According to the fourth solution, the (pure) dyadic deontic solution, we should develop a new kind of dyadic deontic logic with a primitive, two-place sentential operator that can be used to symbolise conditional norms. Sometimes O(B/A) is used to symbolise such norms, sometimes O[A]B, and sometimes AOB. Here we use the following construction: O[A]B. ‘O[A]B’ is read ‘It is obligatory (or it ought to be the case) that B given A’. This has been one of the most popular solutions to the contrary-to-duty paradox and it has many attractive features. Nevertheless, we do not say anything more about it in this article, since we discuss a temporal version of the dyadic deontic solution in Section 2e. For more on this kind of approach and for some problems with it, see Åqvist (1984, 1987, 2002) and Rönnedal (2012, pp. 112–118). For more on dyadic deontic logic, see Rescher (1958), von Wright (1964), Danielsson (1968), Hansson (1969), van Fraassen (1972, 1973), Lewis (1974), von Kutschera (1974), Greenspan (1975), Cox (Al-Hibri) (1978), and van der Torre and Tan (1999). Semantic tableau systems for dyadic deontic logic are developed by Rönnedal (2009).
An example: The counterfactual solution
We now consider the counterfactual solution to the contrary-to-duty paradox and some arguments for and against this approach. Mott (1973) and Niles (1997), for example, are sympathetic to this kind of view, while Tomberlin (1981) and Decew (1981), for instance, criticise it. Some of the arguments in this section have previously been discussed in Rönnedal (2012, pp. 102–106). For more on combining counterfactual logic and deontic logic, see the Appendix, Section 7, in Rönnedal (2012), Rönnedal (2016) and Rönnedal (2019); the tableau systems that are used in this section are described in those works.
In a counterfactual deontic system, a system that combines counterfactual logic and deontic logic, we can symbolise the concept of a conditional obligation in at least four interesting ways: (A □→ OB), O(A □→ B), (A □⇒ OB) and O(A □⇒ B). □→ (and □⇒) is a two-place, sentential operator that takes two sentences as arguments and gives one sentence as value. ‘A □→ B’ (and ‘A □⇒ B’) is often read ‘If A were the case, then B would be the case’. (The differences between □→ and □⇒ are unimportant in this context, and so we focus on □→.) So, maybe we can use some of these formulas to symbolise contrary-to-duty obligation sentences and avoid the contrary-to-duty paradox. Let us now consider one possible formalisation of N-CTD that seems to be among the most plausible in counterfactual deontic logic. In the discussion of Argument 2 in this section (see below), we consider two more attempts.
CF-CTD
CF1 Ok
CF2 k □→ O¬a
CF3 ¬k □→ Oa
CF4 ¬k
Let CF-CTD = {CF1, CF2, CF3, CF4}. From CF3 and CF4 we can deduce Oa, but it is not possible to derive O¬a from CF1 and CF2, at least not in most reasonable counterfactual deontic systems. Hence, we cannot derive a contradiction in this way.
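That CF-CTD has a model can be checked directly. Here is a small Python sketch of a selection-function semantics, a toy of our own in which closeness is measured by Hamming distance on the two atoms and the closest A-world to an A-world is that world itself (centering); all of CF1 through CF4 come out true at the actual world w0.

# Worlds as (k, a) pairs; w0 is the actual world: promise broken, no apology yet.
w0, w1, w2, w3 = (False, False), (False, True), (True, False), (True, True)
deont = {w0: [w3], w1: [w3], w2: [w2], w3: [w2]}   # deontically accessible worlds

# Selection function: the closest p-worlds to w, by Hamming distance on (k, a).
def closest(p, w):
    cands = [v for v in (w0, w1, w2, w3) if p(v)]
    d = lambda v: (v[0] != w[0]) + (v[1] != w[1])
    m = min(d(v) for v in cands)
    return [v for v in cands if d(v) == m]

def O(p, w): return all(p(v) for v in deont[w])
def cf(p, q, w): return all(q(v) for v in closest(p, w))   # p box-arrow q at w

k = lambda v: v[0]
a = lambda v: v[1]

print(O(k, w0))                                        # CF1  Ok: True
print(cf(k, lambda v: O(lambda u: not a(u), v), w0))   # CF2  k []-> O~a: True
print(cf(lambda v: not k(v), lambda v: O(a, v), w0))   # CF3  ~k []-> Oa: True
print(not k(w0))                                       # CF4  ~k: True

Notice that in any such model with centering, CF3 and CF4 force Oa at w0 while CF1 forces Ok, so every deontically accessible world from w0 satisfies k ∧ a; this anticipates the pragmatic oddity pressed in Argument 3 below.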
Arguments for the counterfactual solution
This solution to the contrary-to-duty paradox is attractive for many reasons. (1) CF-CTD is consistent, as we have already seen. (2) The set is non-redundant. CF3 does not seem to be derivable from CF1, and CF2 does not seem to be derivable from CF4 in any interesting counterfactual deontic logic. (3) The set is dilemma-free. We cannot derive Oa ∧ O¬a from CF-CTD, nor anything else of the form OA ∧ O¬A. (4) We cannot derive the proposition that everything is both obligatory and forbidden from CF-CTD. (5) We can easily express the idea that the primary obligation to keep the promise has been violated in counterfactual deontic logic. This is just the conjunction of CF1 and CF4. (6) All conditional obligations can be symbolised in the same way. CF2 has the same logical form as CF3. (7) We do not have to postulate several different kinds of unconditional obligations. The unconditional obligation to keep the promise is the same kind of obligation as the derived unconditional obligation to apologise. This is a problem for Carmo and Jones’s operator solution (see Section 2b above). (8) The counterfactual solution can take care of apparently actionless contrary-to-duty paradoxes. Such paradoxes are a problem for the action or agent solutions (see Section 2d). (9) The counterfactual solution can perhaps take care of apparently timeless contrary-to-duty paradoxes. Such paradoxes are a problem for the temporal solution (see Section 2e). (Whether or not this argument is successful is debatable.) (10) From CF3 and CF4 we can derive the formula Oa, which says that you should apologise, and, intuitively, it seems that this proposition follows from N3 and N4 (at least in some contexts). (11) In counterfactual deontic logic a conditional obligation can be expressed by a combination of a counterfactual conditional and an ordinary (unconditional) obligation. We do not have to introduce any new primitive dyadic deontic operators. According to the dyadic and temporal dyadic deontic solutions (see above in this section and Section 2e below), we need some new primitive dyadic deontic operator to express conditional obligations.
Hence, the counterfactual solution to the contrary-to-duty paradox seems to be among the most plausible so far suggested in the literature. Nonetheless, it also has some serious problems. We now consider four arguments against this solution. For more on some problems, see Decew (1981) and Tomberlin (1981), and for some responses, see Niles (1997).
Arguments against the counterfactual solution
Argument 1. The symbol □→ has often been taken to represent conditional sentences in the subjunctive, not in the indicative form. That is, A □→ B is read ‘If it were the case that A, then it would be the case that B’, not ‘If A is the case, then B is the case’ (or ‘If A, then B’). So, the correct reading of k □→ O¬a seems to be ‘If you were to keep your promise, then it would be obligatory that you do not apologise’, and the correct reading of ¬k □→ Oa seems to be ‘If you were not to keep your promise, then it would be obligatory that you apologise’. If this is true, the formal sentences CF2 and CF3 do not correctly reflect the meaning of the English sentences N2 and N3, because the English sentences are not in the subjunctive form.
Here is a possible response to this argument. A □→ B might perhaps be used to symbolise indicative conditionals and not only subjunctive conditionals, and if this is the case, we can avoid this problem. Furthermore, maybe the formulation in natural language is not satisfactory. Maybe the English sentences in N-CTD are more naturally formulated in the subjunctive form. So, ‘It ought to be that if you keep your promise, you do not apologise’ is taken to mean the same thing as ‘If you were to keep your promise, then it would be obligatory that you do not apologise’; and ‘If you do not keep your promise, you ought to apologise’ is taken to say the same thing as ‘If you were not to keep your promise, then it would be obligatory that you apologise’. And if this is the case, the symbolisations might very well be reasonable. To decide whether this is the case or not, it seems that we have to do much more than just look at the surface structure of the relevant sentences. So, this argument, while interesting, does not seem to be conclusive.
Argument 2. In counterfactual deontic logic, N2 can be interpreted in (at least) two ways: k □→ O¬a (CF2) or O(k □→ ¬a) (CF2(b)). Faced with the choice between two plausible formalisations of a certain statement, we ought to choose the stronger one. CF2(b) is stronger than CF2. So, N2 should be symbolised by CF2(b) and not by CF2. Furthermore, CF2(b) corresponds better with the surface structure of N2 than CF2; in N2 the expression ‘It ought to be that’ has a wide and not a narrow scope. This means that N-CTD should be symbolised in the following way:
C2F-CTD
CF1 Ok
CF2(b) O(k □→ ¬a)
CF3 ¬k □→ Oa
CF4 ¬k
Let C2F-CTD = {CF1, CF2(b), CF3, CF4}. Yet, on this reading, the paradox is reinstated, for C2F-CTD is inconsistent in most plausible counterfactual deontic systems. (An argument of this kind against a similar contrary-to-duty paradox can be found in Tomberlin (1981).) Let us now prove this. (In the proofs below, we use some semantic tableau systems that are described in the Appendix, Section 7, in Rönnedal (2012); temporal versions of these systems can be found in Rönnedal (2016). All rules that are used in our deductions are explained in these works.) First, we establish a derived rule, rule DR8, which is used in our proofs. This rule is admissible in any counterfactual (deontic) system that contains the tableau rule Tc5.
Derivation of DR8.
(1) A □→ B, i
↙↘
(2) ¬(A → B), i [CUT] (3) A → B, i [CUT]
(4) A, i [2, ¬→]
(5) ¬B, i [2, ¬→]
(6) irAi [4, Tc5]
(7) B, i [1, 6, □→]
(8) * [5, 7]
Now we are in a position to prove that C2F-CTD is inconsistent. To prove that a set of sentences A1, A2, …, An is inconsistent in a tableau system S, we construct an S-tableau which begins with every sentence in this set suffixed in an appropriate way, such as A1, 0, A2, 0, …, An, 0. If this tableau is closed, that is, if every branch in it is closed, the set is inconsistent in S. (‘MP’ stands for the derived tableau rule Modus Ponens.)
(1) Ok, 0
(2) O(k □→ ¬a), 0
(3) ¬k □→ Oa, 0
(4) ¬k, 0
(5) ¬k → Oa, 0 [3, DR8]
(6) Oa, 0 [4, 5, MP]
(7) 0s1 [T − dD]
(8) k, 1 [1, 7, O]
(9) k □→ ¬a, 1 [2, 7, O]
(10) a, 1 [6, 7, O]
(11) k → ¬a, 1 [9, DR8]
(12) ¬a, 1 [8, 11, MP]
(13) * [10, 12]
So, the counterfactual solution is perhaps not so plausible after all. Nevertheless, this argument against this solution is problematic for at least two different reasons.
(i) It is not clear in what sense CF2(b) is ‘stronger’ than CF2. Tomberlin does not explicitly discuss what he means by this expression in this context. Usually one says that a formula A is (logically) stronger than a formula B in a system S if and only if A entails B but B does not entail A in S. In this sense, CF2(b) does not seem to be stronger than CF2 in any interesting counterfactual deontic logic. But perhaps one can understand ‘stronger’ in some other sense in this argument. CF2(b) is perhaps not logically stronger than CF2, but it is a more natural interpretation of N2 than CF2. Recall that N2 says that it ought to be that if you keep your promise, then you do not apologise. This suggests that the correct symbolisation of N2 is O(k □→ ¬a), not k □→ O¬a; in other words, the O-operator should have a wide and not a narrow scope.
(ii) Let us grant that O(k □→ ¬a) is stronger than k □→ O¬a in the sense that the former is more natural than the latter. Furthermore, it is plausible to assume that if two interpretations of a sentence are reasonable one should choose the stronger or more natural one (as a pragmatic rule and ceteris paribus). Hence, N2 should be symbolised as O(k □→ ¬a) and not as k □→ O¬a. Here is a possible counterargument. Both O(k □→ ¬a) and k □→ O¬a are reasonable interpretations of N2. So, ceteris paribus we ought to choose O(k □→ ¬a). But if we choose O(k □→ ¬a) the resulting set C2F-CTD is inconsistent. Thus, in this case, we cannot (or should not) choose O(k □→ ¬a) as a symbolisation of N2. We should instead choose the narrow scope interpretation k □→ O¬a. Furthermore, it is not obvious that N2 says something other than the following sentence: ‘If you keep your promise, it ought to be the case that you do not apologise’ (N2b). And here k □→ O¬a seems to be a more natural symbolisation. Even if N2 and N2b are not equivalent, N2b might perhaps express our original idea better than N2. Consequently, this argument does not seem to be conclusive. However, it does seem to show that C2F-CTD is not a plausible solution to the contrary-to-duty paradox.
What happens if we try some other formalisation of N3? Can we avoid this problem then? Let us consider one more attempt to symbolise N-CTD in counterfactual deontic logic.
C3F-CTD
CF1 Ok
CF2(b) O(k □→ ¬a)
CF3(b) O(¬k □→ a)
CF4 ¬k
Let C3F-CTD = {CF1, CF2(b), CF3(b), CF4}. In this set, N3 too is represented by a sentence in which the O-operator has wide scope. From this set we can derive O¬a (from CF1 and CF2(b)), but we cannot derive Oa (from CF3(b) and CF4). The set is consistent.
Yet, this solution is problematic for another reason. All of the following sentences seem to be true: O(k □→ ¬a), k □→ O¬a, ¬k □→ Oa, but O(¬k □→ a) seems false. According to the standard truth-conditions for counterfactuals, A □→ B is true in a possible world w if and only if B is true in every possible world that is as close as (as similar as) possible to w in which A is true; and OA is true in a possible world w if and only if A is true in every possible world that is deontically accessible from w. If we think of the truth-conditions in this way, O(¬k □→ a) is true in w (our world) if and only if ¬k □→ a is true in all ideal worlds (in all possible worlds that are deontically accessible from w), that is, if and only if: in every ideal world w’ deontically accessible from w, a is true in all the worlds that are as close to w’ as possible in which ¬k is true. But in all ideal worlds you keep your promise, and in all ideal worlds, if you keep your promise, you do not apologise. From this it follows that in all ideal worlds you do not apologise. Accordingly, in all ideal worlds you keep your promise and do not apologise. Take an ideal world, say w’. In the closest ¬k world(s) to w’, ¬a seems to be true (since ¬a is true in w’). If this is correct, ¬k and ¬a are true in one of the closest ¬k worlds to w’. So, ¬k □→ a is not true in w’. Hence, O(¬k □→ a) is not true in w (in our world). In conclusion, if this argument is sound, we cannot avoid the contrary-to-duty paradox by using the symbolisation C3F-CTD.
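This argument can be verified mechanically. The sketch below, again a toy of our own assuming two-atom worlds and Hamming-distance similarity, evaluates O(¬k □→ a) at a world all of whose deontically accessible worlds satisfy k ∧ ¬a; the sentence comes out false, exactly as the argument predicts.

# Ideal worlds all satisfy k & ~a, as in the argument above.
w0, w1, w2, w3 = (False, False), (False, True), (True, False), (True, True)
W = [w0, w1, w2, w3]
deont = {w: [w2] for w in W}    # every world's ideal worlds: {(k, ~a)}

def closest(p, w):
    cands = [v for v in W if p(v)]
    d = lambda v: (v[0] != w[0]) + (v[1] != w[1])
    m = min(d(v) for v in cands)
    return [v for v in cands if d(v) == m]

def cf(p, q, w): return all(q(v) for v in closest(p, w))
def O(p, w): return all(p(v) for v in deont[w])

not_k = lambda v: not v[0]
a = lambda v: v[1]

# The closest ~k-world to the ideal world (k, ~a) is (~k, ~a), where a is false,
# so ~k []-> a fails there, and hence O(~k []-> a) fails at w0.
print(cf(not_k, a, w2))                       # False
print(O(lambda v: cf(not_k, a, v), w0))       # O(~k []-> a) at w0: False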
Argument 3. We turn now to the pragmatic oddity. We have mentioned that this is a problem for some quick solutions and for the modal solution. It is also a problem for the counterfactual solution. In every counterfactual deontic system that includes the tableau rule Tc5 (see Rönnedal (2012, p. 160)), and hence the schema (A □→ B) → (A → B), the sentence O(k ∧ a) is derivable from CF-CTD. This is odd, since the proposition that it ought to be that you keep your promise and apologise (for not keeping your promise) does not seem to follow from N-CTD, and since it seems that (A □→ B) → (A → B) should hold in every reasonable counterfactual logic. The following semantic tableau shows that O(k ∧ a) is derivable from CF-CTD (in most counterfactual deontic systems).
(1) Ok, 0
(2) k □→ O¬a, 0
(3) ¬k □→ Oa, 0
(4) ¬k, 0
(5) ¬O(k ∧ a), 0
(6) P¬(k ∧ a), 0 [5, ¬O]
(7) 0s1 [6, P]
(8) ¬(k ∧ a), 1 [6, P]
(9) k, 1 [1, 7, O]
(10) ¬k → Oa, 0 [3, DR8]
↙↘
(11) ¬¬k, 0 [10, →] (12) Oa, 0 [10, →]
(13) * [4, 11] (14) a, 1 [12, 7, O]
↙↘
(15) ¬k, 1 [8, ¬∧] (16) ¬a, 1 [8, ¬∧]
(17) * [9, 15] (18) * [14, 16]
Argument 4. According to the counterfactual solution, so-called factual detachment holds unrestrictedly, that is, OB always follows from A and A □→ OB. This view is criticised by Decew (1981). From the proposition that I will not keep my promise and the proposition that if I will not keep my promise I ought to apologise, it does not follow that I ought to apologise. For as long as I still can keep my promise I ought to keep it, and if I keep it, then I should not apologise. According to Decew, it is not enough that a condition is true, it must be ‘unalterable’ or ‘settled’ before we are justified in detaching the unconditional obligation. See also Greenspan (1975). If this is correct, the counterfactual solution cannot, in itself, solve all contrary-to-duty paradoxes.
d. Action or Agent Solutions
Now, let us turn to the action or agent solutions. A common idea behind most of these solutions is that we should make a distinction between what is obligatory (actions, or so-called practitions) and the circumstances of obligations. We should combine deontic logic with some kind of action logic or dynamic logic. And when we do this, we can avoid the contrary-to-duty paradox. Three subcategories deserve to be mentioned: (1) Castañeda’s solution, (2) the Stit solution, and (3) the dynamic deontic solution.
Castañeda has developed a unique approach to deontic logic. According to him, any useful deontic calculus must contain two types of sentences even at the purely sentential level. One type is used to symbolise the indicative clauses in a conditional obligation, which speak about the conditions and not the actions that are considered obligatory; the other type is used to symbolise the infinitive clauses, which speak about the actions that are considered obligatory and not the conditions. Castañeda thinks that the indicative components, but not the infinitive ones, allow a form of (internal) modus ponens. From N3 and N4 we can derive the conclusion that you ought to apologise, but from N1 and N2 we cannot derive the conclusion that you ought not to apologise. Hence, we avoid the contradiction. For more on this approach, see, for instance, Castañeda (1981). For a summary of some arguments against Castañeda’s solution, see Carmo and Jones (2002); see also Powers (1967).
According to the Stit solution, deontic logic should be combined with some kind of Stit (Seeing to it) logic. However, Stit logic is often combined with temporal logic. So, this approach can also be classified as a temporal solution. We say a few more words about this kind of view in Section 2e.
To illustrate this type of solution to the contrary-to-duty paradox, let us now discuss the dynamic deontic solution and some problems with this particular way of solving the puzzle.
An example: The dynamic deontic solution
According to the dynamic deontic proposal, we can solve the contrary-to-duty paradox if we combine deontic logic with dynamic logic. A view of this kind is suggested by Meyer (1988), where a dynamic deontic system is developed. We will now consider this solution and some arguments for and against it. Dynamic deontic logic is concerned with what we ought to do rather than with what ought to be, and the sentences in N-CTD should be interpreted as telling us what you ought to do. The solution is criticised by Anglberger (2008).
Dynamic deontic logic introduces some new notions: α stands for some action, and the formula [α]A says that performance of the action α (necessarily) leads to a state (or states) where A holds, where A is any sentence and [α] is similar to an ordinary necessity-like modal operator (the so-called box). The truth-conditions of [α]A are as follows: [α]A is true in a possible world w if and only if all possible worlds w’ with Rα(w, w’) satisfy A. Rα is an accessibility relation Rα ⊆ W⨯W associated with α, where W is the set of possible worlds or states. Rα(w, w’) says that from w one (can) get into state w’ by performing α. Fα, to be read ‘the action α is forbidden’, can be defined as Fα ↔ [α]V (call this equivalence Def F; ↔ is ordinary material equivalence), where V is a special atomic formula denoting violation; in other words, some action is forbidden if and only if doing the action leads to a state of violation. Oα, to be read ‘the action α is obligatory’ or ‘it is obligatory to perform the action α’, can now be defined as Oα ↔ F(‐α) (call this equivalence Def O), where ‐α stands for the non-performance of α. Two further formulas should be explained: α ; β stands for ‘the performance of α followed by β’, and α & β stands for ‘the performance of α and β (simultaneously)’.
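A minimal Python rendering of these definitions may be helpful. The state names, the action name ‘keep’ and the transition relation below are invented for illustration; only the definitions of [α], F and O follow the text above.

# States: s0 (initial), s1 (promise kept), s2 (promise broken, violation).
s0, s1, s2 = 's0', 's1', 's2'
V = {s2}                               # violation states

# R[act]: performing act in a state leads to these states.
R = {
    'keep':  {s0: [s1]},               # alpha: keeping your promise
    '-keep': {s0: [s2]},               # -alpha: not keeping it
}

def box(act, p, s):                    # [act]p at s
    return all(p(t) for t in R[act].get(s, []))

def F(act, s):                         # F act := [act]V
    return box(act, lambda t: t in V, s)

def O(act, s):                         # O act := F(-act)
    return F('-' + act if not act.startswith('-') else act[1:], s)

print(F('-keep', s0))   # not keeping the promise is forbidden at s0: True
print(O('keep', s0))    # hence keeping the promise is obligatory at s0: True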
The first three sentences in N-CTD can now be formalised in the following way in dynamic deontic logic:
DDLF-CTD
DDLF1 Oα
DDLF2 [α]O‐β
DDLF3 [‐α]Oβ
Let DDLF-CTD = {Oα, [α]O‐β, [‐α]Oβ}, where α stands for the act of keeping your promise (and helping your friend) and β for the act of apologising. In dynamic deontic logic, it is not possible to represent (the dynamic version of) N4, which states that the act of keeping your promise is not performed. This should perhaps make one wonder whether the formalisation is adequate (see Argument 1 below in this section). Yet, if we accept this fact, we can see that the representation solves the contrary-to-duty paradox: from DDLF-CTD it is not possible to derive a contradiction.
Arguments for the dynamic solution
Meyer’s system is interesting and there seem to be independent reasons to want to combine deontic logic with some kind of action logic or dynamic logic. The symbolisations of the sentences in N-CTD seem intuitively plausible. DDLF-CTD is consistent; the set is dilemma-free and we cannot derive the proposition that everything is both obligatory and forbidden from it. We can assign formal sentences with analogous structures to all conditional obligations in N-CTD. We do not have to postulate several different types of unconditional obligations. Furthermore, from DDLF-CTD it is possible to derive O(α ; ‐β) ∧ [‐α](V ∧ Oβ), which says that it is obligatory to perform the sequence α (keeping your promise) followed by ‐β (not-apologising), and if α has not been done (that is, if you do not keep your promise), one is in a state of violation and it is obligatory to do β; that is, it is obligatory to apologise. This conclusion is intuitively plausible. Nevertheless, there are also some potential and quite serious problems with this kind of solution.
Arguments against the dynamic solution
We consider four arguments against the dynamic solution to the contrary-to-duty paradox in this section. Versions of the second and the third can be found in Anglberger (2008). However, as far as we know, Argument 1 and Argument 4 have not been discussed in the literature before. According to the first argument, we cannot symbolise all premises in dynamic deontic logic, which is unsatisfactory. If we try to avoid this problem, we run into the pragmatic oddity once again. According to the second argument, the dynamic formalisations of the contrary-to-duty sets are not non-redundant. According to the third, it is provable in Meyer’s system PDeL + ¬O(α & ‐α) that no possible action is forbidden, which is clearly implausible. ‘¬O(α & ‐α)’ says that it is not obligatory to perform α and non-α. According to the fourth argument, there seem to be action- and/or agentless contrary-to-duty paradoxes, which seem impossible to solve in dynamic deontic logic.
Argument 1. We cannot symbolise all sentences in N-CTD in dynamic deontic logic; there is no plausible formalisation of N4. This is quite problematic. If the sentence N4 cannot be represented in dynamic deontic logic, how can we then claim that we have solved the paradox? Meyer suggests adding a predicate DONE that attaches to action names (Meyer (1988)). Then, DONE(α) says that action α has been performed. If we add this predicate, we can symbolise all sentences in N-CTD. Sentence N4 is rendered DONE(-α). Meyer appears to think that (DONE(α)→A) is derivable from [α]A. This seems plausible. Still, if we assume this, we can deduce a dynamic counterpart of the pragmatic oddity from our contrary-to-duty sets. To prove this, we use a lemma, Lemma 1, that is a theorem in dynamic deontic logic. α and β are interpreted as above.
But the conclusion 10 in this argument says that it is obligatory that you perform the act of keeping your promise and the act of apologising (for not keeping your promise), and this is counterintuitive.
Argument 2. Recall that the first three sentences in N-CTD are symbolised in the following way: DDLF1 Oα, DDLF2 [α]O‐β, and DDLF3 [-α]Oβ. We will show that we can derive DDLF3 from DDLF1. It follows that the formalisation of N-CTD in dynamic deontic logic is not non-redundant. This is our second argument. The rules that are used in the proofs below are mentioned by Meyer (1988).
Oα is equivalent to F‐α and [‐α]Oβ to [‐α]F‐β. F‐α → [‐α]F‐β is an instance of Lemma 5. So, DDLF3 in DDLF-CTD is derivable from DDLF1. Consequently, DDLF-CTD is not non-redundant.
Argument 3. Here is our third argument. This argument shows that if we add Axiom DD (¬O(α & ‐α)) to Meyer’s dynamic deontic logic PDeL, we can derive a sentence that, in effect, says that no possible action is forbidden. Axiom DD seems to be intuitively plausible, as it is a dynamic counterpart of the axiom D in Standard Deontic Logic that rules out moral dilemmas. Hence, this problem is quite serious. In the proof below, T is Verum and ⊥ is Falsum. T is equivalent to an arbitrary logical truth (for example, p or not-p) and ⊥ is equivalent to an arbitrary contradiction (for example, p and not-p). Obviously, T is equivalent to ¬⊥ and ⊥ is equivalent to ¬T. (Let us call these equivalences Def T and Def ⊥.) Furthermore, <α>β is equivalent to ¬[α]¬β (let us call this equivalence Def <>). So, <α> is similar to an ordinary possibility-like modal operator (the so-called diamond). []-nec (or N) is a fundamental rule in Meyer’s system. It says that if B is a theorem (in the system), then [α]B is also a theorem (in the system).
Axiom DD ¬O(α & ‐α) [DD is called NCO in Meyer (1988)]
Lemma 6 [α](A ∧ B) ↔ ([α]A ∧ [α]B) [Theorem 3 in Meyer (1988)]
In effect, 19 claims that no possible action is forbidden. As Anglberger points out, Fα → [α]⊥ (line 15) seems implausible, but it can be true. If α is an impossible action, the consequent, and hence the whole sentence, is true. Nonetheless, if α is possible, α cannot be forbidden. <α>T says that α is possible, in the sense that there is a way to execute α that leads to a state in which T holds. Clearly, 19 is implausible: we want to be able to say that at least some possible actions are forbidden. So, adding the intuitively plausible axiom DD to Meyer’s dynamic deontic logic PDeL is highly problematic.
Argument 4. The last argument against the dynamic solution to the contrary-to-duty paradox that we discuss seems to be a problem for most action or agent solutions. At least it is a problem for both the dynamic solution and the solution that uses some kind of Stit logic. Several examples of such (apparently) action- and/or agentless contrary-to-duty paradoxes have been mentioned in the literature, such as in Prakken and Sergot (1996). Here we consider one introduced by Rönnedal (2018).
Consider the following scenario. At t1, you are about to get into your car and drive somewhere. Then at t1 it ought to be the case that the doors are closed at t2, when you are in your car. If the doors are not closed, then a warning light ought to appear on the car instrument panel (at t3, a point in time as soon as possible after t2). It ought to be that if the doors are closed (at t2), then it is not the case that a warning light appears on the car instrument panel (at t3). Furthermore, the doors are not closed (at t2 when you are in the car). In this example, all of the following sentences seem to be true:
N2-CTD
AN1 (At t1) The doors ought to be closed (at t2).
AN2 (At t1) It ought to be that if the doors are closed (at t2), then it is not the case that a warning light appears on the car instrument panel (at t3).
AN3 (At t1) If the doors are not closed (at t2) then a warning light ought to appear on the car instrument panel (at t3).
AN4 (At t1 it is the case that at t2) The doors are not closed.
N2-CTD is similar to N-CTD. In this set, AN1 expresses a primary obligation (or ought), and AN3 expresses a contrary-to-duty obligation. The condition in AN3 is satisfied only if the primary obligation expressed by AN1 is violated. But AN3 does not seem to tell us anything about what you or someone else ought to do, and it does not seem to involve any particular agent. AN3 appears to be an action- and agentless contrary-to-duty obligation. It tells us something about what ought to be the case if the world is not as it ought to be according to AN1. It does not seem to be possible to find any plausible symbolisations of N2-CTD and similar paradoxes in dynamic deontic logic or any Stit logic.
Can someone who defends this kind of solution avoid this problem? Two strategies come to mind. One could argue that every kind of apparently action- and agentless contrary-to-duty paradox really involves some kind of action and agent when it is analysed properly. One could, for instance, claim that N2-CTD really includes an implicit agent. It is just that the agent is not a human being; the agent is the car or the warning system in the car. When analysed in detail, AN3 should be understood in the following way:
AN3(b) (At t1) If the doors are not closed (at t2) then the car or the warning system in the car ought to see to it that a warning light appears on the car instrument panel (at t3).
According to this response, one can always find some implicit agent and action in every apparently action- and/or agentless contrary-to-duty paradox. If this is the case, the problem might not be decisive for this kind of solution.
According to the second strategy, we simply deny that genuinely action- and/or agentless obligations are meaningful. If, for example, the sentences in N2-CTD are genuinely actionless and agentless, then they are meaningless and we cannot derive a contradiction from them. Hence, the paradox is solved. If, however, we can show that they involve some kind of actions and some kind of agent or agents, we can use the first strategy to solve them.
Whether any of these strategies is successful is, of course, debatable. There certainly seem to be genuinely action- and agentless obligations that are meaningful, and it seems prima facie unlikely that every apparently action- and agentless obligation can be reduced to an obligation that involves an action and an agent. Is it, for example, really plausible to think of the car or the warning system in the car as an acting agent that can have obligations? Does AN3 [(At t1) If the doors are not closed (at t2) then a warning light ought to appear on the car instrument panel (at t3)] say the same thing as AN3(b) [(At t1) If the doors are not closed (at t2) then the car or the warning system in the car ought to see to it that a warning light appears on the car instrument panel (at t3)]?
e. Temporal Solutions
In this section, we consider some temporal solutions to the contrary-to-duty paradox. The temporal approaches can be divided into three subcategories: (1) the pure temporal solution(s), (2) the temporal-action solution(s), and (3) the temporal dyadic deontic solution(s). All of these combine some kind of temporal logic with some kind of deontic logic. According to the temporal-action solutions, we should also add some kind of action logic to the other parts. Some of the first to construct systems that include both deontic and temporal elements were Montague (1968) and Chellas (1969).
According to the pure temporal solutions, we should use systems that combine ordinary so-called monadic deontic logic with some kind of temporal logic (perhaps together with a modal part) when we symbolise our contrary-to-duty obligations. See Rönnedal (2012, pp. 106–112) for more on some pure temporal solutions and on some problems with such approaches.
The idea of combining temporal logic, deontic logic and some kind of action logic has gained traction. A particularly interesting development is the so-called Stit (Seeing to it) paradigm. According to this paradigm, it is important to make a distinction between agentive and non-agentive sentences. A (deontic) Stit system is a system that includes one or several Stit (Seeing to it) operators that can be used to formalise various agentive sentences. The formula ‘[α: stit A]’ (‘[α: dstit A]’), for instance, says ‘agent α sees to it that A’ (‘agent α deliberately sees to it that A’). [α: (d)stit A] can be abbreviated as [α: A]. Some have argued that systems of this kind can be used to solve the contrary-to-duty paradox; see, for instance, Bartha (1993). According to the Stit approach, deontic constructions must take agentive sentences as complements; in a sentence of the form OA, A must be (or be equivalent to) a Stit sentence. A justification for this claim is, according to Bartha, that practical obligations, ‘ought to do’s’, should be connected to a specific action by a specific agent. The construction ‘agent α is obligated to see to it that A’ can now be defined in the following way: O[α: A] ⟺ L(¬[α: A] → S), where ‘L’ is read ‘it is settled that’ and ‘S’ says that there is wrongdoing or violation of the rules, or something to that effect. Hence, α is obligated to see to it that A if and only if it is settled that if she does not see to it that A, then there is wrongdoing. In a logic of this kind, N-CTD can be symbolised in the following way: {O[α: k], O[α: [α: k] → [α:¬a]], O[α:¬[α: k] → [α: [α: a]]], ¬[α: k]}. And this set is consistent in Bartha’s system. For more on Stit logic and many relevant references, see Horty (2001), and Belnap, Perloff and Xu (2001).
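The following Python sketch, a one-moment toy of our own rather than Bartha’s full branching-time semantics, illustrates the definition O[α: A] ⟺ L(¬[α: A] → S): histories through the moment are partitioned into the agent’s choice cells, [α: A] holds on a history if A holds throughout the cell chosen there (the deliberative stit’s negative condition is omitted for simplicity), L quantifies over all histories, and S marks the histories where there is wrongdoing.

# Four histories through a single moment; the agent's choice partitions them.
h1, h2, h3, h4 = 'h1', 'h2', 'h3', 'h4'
H = [h1, h2, h3, h4]
cell = {h1: (h1, h2), h2: (h1, h2), h3: (h3, h4), h4: (h3, h4)}  # choice cells

A = {h1, h2}          # A holds on exactly the histories in the first cell
S = {h3, h4}          # wrongdoing occurs on the histories of the other cell

def stit(prop, h):                 # [agent: prop] at h: prop throughout h's cell
    return all(g in prop for g in cell[h])

def L(p):                          # settled: true on all histories through the moment
    return all(p(h) for h in H)

# O[agent: A] := L(~[agent: A] -> S)
print(L(lambda h: stit(A, h) or h in S))   # True: the agent ought to see to it that A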
An example: The temporal dyadic deontic solution
Here we consider, as an example of a temporal solution, the temporal dyadic deontic solution. We should perhaps not talk about ‘the’ temporal dyadic deontic solution, since there really are several different versions of this kind of view. However, let us focus on an example presented in Rönnedal (2018). What is common to all approaches of this kind is that they use some logical system that combines dyadic deontic logic with temporal logic to solve the contrary-to-duty paradox. Usually, the various systems also include a modal part with one or several necessity- and possibility-operators. Solutions of this kind are discussed by, for example, Åqvist (2003), van Eck (1982), Loewer and Belzer (1983), and Feldman (1986, 1990) (see also Åqvist and Hoepelman (1981) and Thomason (1981, 1981b)). Castañeda (1977) and Prakken and Sergot (1996) express some doubts about this kind of approach.
We first describe how the contrary-to-duty paradox can be solved in temporal alethic dyadic deontic logic of the kind introduced by Rönnedal (2018). Then, we consider some reasons why this solution is attractive. We end by mentioning a potential problem with this solution. In temporal alethic dyadic deontic logic, N-CTD can be symbolised in the following way:
F-CTD
F1. Rt1O[T]Rt2k
F2. Rt1O[Rt2k]Rt3¬a
F3. Rt1O[Rt2¬k]Rt3a
F4. Rt1Rt2¬k [⇔Rt2¬k]
where k and a are interpreted as in SDL-CTD. R is a temporal operator; ‘Rt1A’ says that it is realised at time t1 (it is true at t1) that A, and so forth. t1 refers to the moment on Monday when you make your promise, t2 refers to the moment on Friday when you should keep your promise and t3 refers to the moment on Saturday when you should apologise if you do not keep your promise on Friday. O is a dyadic deontic sentential operator of the kind mentioned in Section 2c. ‘O[B]A’ says that it is obligatory that (it ought to be the case that) A given B. In dyadic deontic logic, an unconditional, monadic O-operator can be defined in terms of the dyadic deontic O-operator in the following way: OA =df O[T]A. According to this definition, it is unconditionally obligatory that A if and only if it is obligatory that A given Verum. All other symbols are interpreted as above. Accordingly, F1 is read as ‘It is true on Monday that you ought to keep your promise on Friday’. F2 is read as ‘It is true on Monday that it ought to be the case that you do not apologise on Saturday given that you keep your promise on Friday’. F3 is read as ‘It is true on Monday that it ought to be the case that you apologise on Saturday given that you do not keep your promise on Friday’. F4 is read as ‘It is true on Monday that it is true on Friday that you do not keep your promise’; in other words, ‘It is true on Friday that you do not keep your promise’. This rendering of N-CTD seems to be plausible.
In temporal (alethic) dyadic deontic logic, truth is relativized to world-moment pairs. This means that a sentence can be true in one possible world w at a particular time t even though it is false in some other possible world, say w’, at this time (that is, at t) or false in this world (that is, in w) at another time, say t’. Some (but not all) sentences are temporally settled. A temporally settled sentence satisfies the following condition: if it is true (in a possible world), it is true at every moment of time (in this possible world); and if it is false (in a possible world), it is false at every moment of time (in this possible world). All the sentences F1−F4 are temporally settled; O[T]Rt2k, O[Rt2k]Rt3¬a and O[Rt2¬k]Rt3a are examples of sentences that are not, as their truth values may vary from one moment of time to another (in one and the same possible world).
Rt1Rt2¬k is equivalent to Rt2¬k. For it is true on Monday that it is true on Friday that you do not keep your promise if and only if it is true on Friday that you do not keep your promise. Hence, from now on we use Rt2¬k as a symbolisation of N4. Note that it might be true on Monday that you will not keep your promise on Friday (in some possible world) even though this is not a settled fact—in other words, even though it is not historically necessary. In some possible worlds, you will keep your promise on Friday and in some possible worlds you will not. F4 is true at t1 (on Monday) in the possible worlds where you do not keep your promise at t2 (on Friday).
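The world-moment-relative semantics of R is easy to model directly. In the following Python sketch, with an invented valuation over two worlds and three moments, Rt1Rt2¬k and Rt2¬k agree at every world-moment pair: both simply evaluate ¬k at t2, whatever the moment of evaluation.

# val[(w, t)] is the set of atoms true at world w and moment t (invented data).
worlds, times = ['w0', 'w1'], ['t1', 't2', 't3']
val = {('w0', 't1'): set(), ('w0', 't2'): set(),  ('w0', 't3'): {'a'},
       ('w1', 't1'): set(), ('w1', 't2'): {'k'},  ('w1', 't3'): set()}

def atom(p):                      # p is true at (w, t)
    return lambda w, t: p in val[(w, t)]

def neg(f): return lambda w, t: not f(w, t)
def R(t_eval, f):                 # Rt f: f is true at moment t_eval (same world)
    return lambda w, t: f(w, t_eval)

not_k = neg(atom('k'))
s1 = R('t1', R('t2', not_k))      # Rt1 Rt2 ~k
s2 = R('t2', not_k)               # Rt2 ~k

# The two sentences agree at every world-moment pair: both are temporally settled.
print(all(s1(w, t) == s2(w, t) for w in worlds for t in times))   # True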
Let F-CTD = {F1, F2, F3, F4}. F-CTD is consistent in most interesting temporal alethic dyadic deontic systems (see Rönnedal (2018) for a rigorous proof of this claim). Hence, we can solve the contrary-to-duty paradox in temporal alethic dyadic deontic logic.
Arguments for the temporal alethic dyadic deontic solution
We now consider some reasons why the temporal alethic dyadic deontic solution to the contrary-to-duty paradox is attractive. We first briefly mention some features; then, we discuss some reasons in more detail. (1) F-CTD is consistent. (2) F-CTD is non-redundant. (3) F-CTD is dilemma-free. (4) It is not possible to derive the proposition that everything is both obligatory and forbidden from F-CTD. (5) F-CTD avoids the so-called pragmatic oddity. (6) The solution in temporal alethic dyadic deontic logic is applicable to (at least apparently) action- and agentless contrary-to-duty examples. (7) We can assign formal sentences with analogous structures to all conditional obligations in N-CTD in temporal alethic dyadic deontic logic. (8) We can express the idea that an obligation has been violated, and (9) we can symbolise higher order contrary-to-duty obligations in temporal alethic dyadic deontic logic. (10) In temporal alethic dyadic deontic logic we can derive ‘ideal’ obligations, and (11) we can derive ‘actual’ obligations (in certain circumstances). (12) We can avoid the so-called dilemma of commitment and detachment in temporal alethic dyadic deontic logic. All of these reasons are discussed in Rönnedal (2018). Now let us say a few more words about some of them.
Reason (I): F-CTD is dilemma-free. The solution in temporal alethic dyadic deontic logic is dilemma-free. The sentence Rt1O[T]Rt3¬a is derivable from F1 and F2 (in some systems) (see Reason V below) and from F3b and F4 we can deduce the formula Rt2O[T]Rt3a (in some systems under some circumstances) (see Reason VI below). Accordingly, we can derive the following sentence: Rt1O[T]Rt3¬a ∧ Rt2O[T]Rt3a (in certain systems). Rt1O[T]Rt3¬a says ‘On Monday [when you have not yet broken your promise] it ought to be the case that you do not apologise on Saturday’, and Rt2O[T]Rt3a says ‘On Friday [when you have broken your promise] it ought to be the case that you apologise on Saturday’. Despite this, O[T]Rt3a and O[T]Rt3¬a are not true at the same time. Neither Rt1O[T]Rt3¬a ∧ Rt1O[T]Rt3a nor Rt2O[T]Rt3¬a ∧ Rt2O[T]Rt3a is derivable from F-CTD in any interesting temporal alethic dyadic deontic system. Consequently, this is not a moral dilemma. Since N-CTD seems to be dilemma-free, we want our formalisation of N-CTD to be dilemma-free too; and F-CTD is, as we have seen, dilemma-free. This is one good reason to be attracted to the temporal alethic dyadic deontic solution.
Reason (II): F-CTD avoids the so-called pragmatic oddity. Neither O[T](Rt2k ∧ Rt3a), Rt1O[T](Rt2k ∧ Rt3a) nor Rt2O[T](Rt2k ∧ Rt3a) is derivable from F-CTD in any interesting temporal alethic dyadic deontic system. Hence, we can avoid the pragmatic oddity (see Section 2a above).
Reason (III): The solution in temporal alethic dyadic deontic logic is applicable to (at least apparently) actionless and agentless contrary-to-duty examples. In Section 2d, we considered an example of an (apparently) action- and agentless contrary-to-duty paradox. In temporal alethic dyadic deontic logic, it is easy to find plausible symbolisations of (apparently) action- and agentless contrary-to-duty obligations; the sentences in N2-CTD have the same logical form as the sentences in N-CTD. It follows that contrary-to-duty paradoxes of this kind can be solved in exactly the same way as we solved our original paradox.
Reason (IV): We can assign formal sentences with analogous structures to all conditional obligations in N-CTD in temporal alethic dyadic deontic logic. According to some deontic logicians, a formalisation of N-CTD is adequate only if the formal sentences assigned to N2 and N3 have the same (or analogous) logical form (see Carmo and Jones (2002)). The temporal alethic dyadic deontic solution satisfies this requirement. Not all solutions do that. N2 and N3 are both formalised with the dyadic obligation operator, and F2 and F3 have the ‘same’ logical form.
Reason (V): We can derive ‘ideal’ obligations in temporal alethic dyadic deontic logic. N1 and N2 seem to entail that you ought not to apologise. Ideally you ought to keep your promise, and ideally it ought to be that if you keep your promise, then you do not apologise (for not keeping your promise). Accordingly, ideally you ought not to apologise. We want our formalisation of N-CTD to reflect this intuition. Rt1O[T]Rt3¬a is deducible from F1 (Rt1O[T]Rt2k) and F2 (Rt1O[Rt2k]Rt3¬a) in many temporal dyadic deontic systems. The tableau below proves this.
We use two derived rules in our deduction. These are also used in our next semantic tableau (see Reason VI below). According to the first derived rule, DR1, we may add ¬A, wit to any open branch in a tree that includes ¬RtA, witj. This rule is deducible in every system. According to the second derived rule, DR2, we may add O[T](A → B), witj to any open branch in a tree that contains O[A]B, witj. DR2 can be derived in every system that includes the rules T − Dα0 and T − Dα2. (All other special rules that we use in our deductions are described by Rönnedal (2018).)
(1) Rt1O[T]Rt2k, w0t0
(2) Rt1O[Rt2k]Rt3¬a, w0t0
(3) ¬Rt1O[T]Rt3¬a, w0t0
(4) ¬O[T]Rt3¬a, w0t1 [3, DR1]
(5) P[T]¬Rt3¬a, w0t1 [4, ¬O]
(6) sTw0w1t1 [5, P]
(7) ¬Rt3¬a, w1t1 [5, P]
(8) ¬¬a, w1t3 [7, DR1]
(9) O[T]Rt2k, w0t1 [1, Rt]
(10) Rt2k, w1t1 [9, 6, O]
(11) k, w1t2 [10, Rt]
(12) O[Rt2k]Rt3¬a, w0t1 [2, Rt]
(13) O[T](Rt2k → Rt3¬a), w0t1 [12, DR2]
(14) Rt2k → Rt3¬a, w1t1 [13, 6, O]
↙↘
(15) ¬Rt2k, w1t1 [14, →] (16) Rt3¬a, w1t1 [14, →]
(17) ¬k, w1t2 [15, DR1] (18) ¬a, w1t3 [16, Rt]
(19) * [11, 17] (20) * [8, 18]
Informally, Rt1O[T]Rt3¬a says that it is true at t1, that is, on Monday, that it ought to be the case that you will not apologise on Saturday when you meet your friend. For, ideally, you keep your promise on Friday. Yet, Rt2O[T]Rt3¬a does not follow from F1 and F2 (see Reason I above). On Friday, when you have broken your promise, and when it is no longer historically possible for you to keep your promise, then it is not obligatory that you do not apologise on Saturday. On Friday, it is obligatory that you apologise when you meet your friend on Saturday (see Reason VI). Nevertheless, it is plausible to claim that it is true on Monday that it ought to be the case that you do not apologise on Saturday. For on Monday it is not a settled fact that you will not keep your promise; on Monday, it is still possible for you to keep your promise, which you ought to do. These conclusions correspond well with our intuitions about Scenario I.
According to the counterfactual solution to the contrary-to-duty paradoxes (see Section 2c), we cannot derive any ‘ideal’ obligations of this kind. This is a potential problem for the counterfactual solution.
Reason (VI): We can derive ‘actual’ obligations in temporal alethic dyadic deontic logic (in certain circumstances). N3 and N4 appear to entail that you ought to apologise. Ideally you ought to keep your promise, but if you do not keep your promise, you ought to apologise. As a matter of fact, you do not keep your promise. It follows that you should apologise. We want our symbolisation of N-CTD to reflect this intuition. Therefore, let us assume that the conditional (contrary-to-duty) obligation expressed by N3 is still in force at time t2; in other words, we assume that the following sentence is true:
F3b Rt2O[Rt2¬k]Rt3a.
Informally, F3b says that it is true at t2 (on Friday) that if you do not keep your promise on Friday, you ought to apologise on Saturday. Rt2O[T]Rt3a is derivable from F4 (Rt2¬k) and F3b in every tableau system that includes T−Dα0, T−Dα2, T−DMO (the dyadic must-ought principle) and T−BT (backward transfer) (see Rönnedal (2018)). According to Rt2O[T]Rt3a, it is true at t2 (on Friday), when you have broken your promise to your friend, that it ought to be the case that you apologise to your friend on Saturday when you meet her.
Note that Rt1O[T]Rt3a is not deducible from F3 (or F3b, or F3 and F3b) and F4 (see Reason I). According to Rt1O[T]Rt3a, it is true at t1, on Monday, that you should apologise to your friend on Saturday when you meet her. However, on Monday it is not yet a settled fact that you will not keep your promise to your friend; on Monday it is still open to you to keep your promise. Accordingly, it is not true on Monday that you should apologise on Saturday. Since it is true on Monday that you ought to keep your promise, and it ought to be that if you keep your promise then you do not apologise, it follows that it is true on Monday that it ought to be the case that you do not apologise on Saturday (see Reason V). These facts correspond well with our intuitions about Scenario I.
The following tableau proves that Rt2O[T]Rt3a is derivable from F3b and F4:
(1) Rt2¬k, w0t0
(2) Rt2O[Rt2¬k]Rt3a, w0t0
(3) ¬Rt2O[T]Rt3a, w0t0
(4) ¬O[T]Rt3a, w0t2 [3, DR1]
(5) P[T]¬Rt3a, w0t2 [4, ¬O]
(6) sTw0w1t2 [5, P]
(7) ¬Rt3a, w1t2 [5, P]
(8) ¬a, w1t3 [7, DR1]
(9) rw0w1t2 [6, T − DMO]
(10) ¬k, w0t2 [1, Rt]
(11) O[Rt2¬k]Rt3a, w0t2 [2, Rt]
(12) O[T](Rt2¬k → Rt3a), w0t2 [11, DR2]
(13) Rt2¬k → Rt3a, w1t2 [6, 12, O]
↙↘
(14) ¬Rt2¬k, w1t2 [13, →] (15) Rt3a, w1t2 [13, →]
(16) ¬¬k, w1t2 [14, DR1] (17) a, w1t3 [15, Rt]
(18) k, w1t2 [16, ¬¬] (19) * [8, 17]
(20) k, w0t2 [9, 18, T − BT]
(21) * [10, 20]
F3 and F3b are independent of each other (in most interesting temporal alethic dyadic deontic systems). Hence, one could argue that N3 should be symbolised by a conjunction of F3 and F3b. For we have assumed that the contrary-to-duty obligation to apologise, given that you do not keep your promise, is still in force at t2. It might be interesting to note that this does not affect the main results in this section. {F1, F2, F3, F3b, F4} is, for example, consistent, non-redundant, and so on. So, we can use such an alternative formalisation of N3 instead of F3. Moreover, note that the symbolisation of N2 can be modified in a similar way.
Reason (VII): In temporal alethic dyadic deontic logic we can avoid the so-called dilemma of commitment and detachment. (Factual) detachment is an inference pattern that allows us to infer, or detach, an unconditional obligation from a conditional obligation together with that conditional obligation’s condition. Thus, if detachment holds for the conditional (contrary-to-duty) obligation that you should apologise if you do not keep your promise, and if you in fact do not keep your promise, then we can derive the unconditional obligation that you should apologise.
van Eck (1982, p. 263) describes the so-called dilemma of commitment and detachment in the following way: (1) detachment should be possible, for we cannot take seriously a conditional obligation if it cannot, by way of detachment, lead to an unconditional obligation; and (2) detachment should not be possible, for if detachment is possible, the following kind of situation would be inconsistent—A, it ought to be the case that B given that A; and C, it ought to be the case that not-B given C. Yet, such a situation is not necessarily inconsistent.
In pure dyadic deontic logic, we cannot deduce the unconditional obligation that it is obligatory that A (OA) from the dyadic obligation that it is obligatory that A given B (O[B]A) and B. Still, if this is true, how can we take such conditional obligations seriously? Hence, the dilemma of commitment and detachment is a problem for solutions to the contrary-to-duty paradox in pure dyadic deontic logic. In temporal alethic dyadic deontic logic, we can avoid this dilemma. We cannot always detach an unconditional obligation from a conditional obligation and its condition, but we can detach the unconditional obligation OB from O[A]B and A if A is non-future or historically necessary (in some interesting temporal alethic dyadic deontic systems). This seems to give us exactly the correct answer to the current problem. Detachment holds, but the rule does not hold unrestrictedly. We have seen above that Rt2O[T]Rt3a, but not Rt1O[T]Rt3a, is derivable from Rt2¬k and Rt2O[Rt2¬k]Rt3a in certain systems, that is, that we can detach the former sentence, but not the latter. Nevertheless, we cannot conclude that a set of the following kind must be inconsistent: {A, O[A]B, C, O[C]¬B}; this seems to get us exactly what we want.
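The restricted detachment pattern just described can be displayed schematically. The following is a sketch in the article’s notation, reconstructed for illustration rather than quoted verbatim from Rönnedal (2018):

```latex
% A schematic sketch of restricted factual detachment: the
% unconditional obligation O[T]B may be detached from O[A]B
% and A only when the condition A is non-future or
% historically necessary (settled).
\[
  \frac{O[A]B \qquad A}{O[T]B}
  \qquad \text{provided that } A \text{ is non-future or historically necessary}
\]
% Because the proviso can fail for one condition while holding
% for another, a set of the following form need not be
% inconsistent: when neither A nor C is settled, neither
% conditional obligation detaches.
\[
  \{\, A,\ O[A]B,\ C,\ O[C]\neg B \,\}
\]
```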
All of these reasons show that the temporal dyadic deontic solution is very attractive. It avoids many of the problems with other solutions that have been suggested in the literature. However, even though the solution is quite attractive, it is not unproblematic. We will now consider one potential serious problem.
An argument against the temporal solutions
The following argument against the temporal dyadic deontic solution appears to be a problem for every other kind of temporal solution too. There seem to be timeless (or parallel) contrary-to-duty paradoxes. In a timeless (or parallel) contrary-to-duty paradox, all obligations seem, in some sense, to be in force simultaneously, and both the antecedent and consequent in the contrary-to-duty obligation appear to ‘refer’ to the same time (if indeed they refer to any time at all). Such paradoxes cannot be solved in temporal dyadic deontic logic or any other system of this kind. For a critique of temporal solutions to the contrary-to-duty paradoxes, see Castañeda (1977). Several (apparently) timeless (or parallel) contrary-to-duty paradoxes are mentioned by Prakken and Sergot (1996).
Here is one example.
Scenario III: The Dog Warning Sign Scenario (After Prakken and Sergot (1996))
Consider the following set of cottage regulations. It ought to be that there is no dog. It ought to be that if there is no dog, there is no warning sign. If there is a dog, it ought to be that there is a warning sign. Suppose further that there is a dog. Then all of the following sentences seem to be true:
TN-CTD
(TN1) It ought to be that there is no dog.
(TN2) It ought to be that if there is no dog, there is no warning sign.
(TN3) If there is a dog, it ought to be that there is a warning sign.
(TN4) There is a dog.
(TN1) expresses a primary obligation and (TN3) a contrary-to-duty obligation. The condition in (TN3) is fulfilled only if the primary obligation expressed by (TN1) is violated. Let TN-CTD = {TN1, TN2, TN3, TN4}. It seems possible that all of the sentences in TN-CTD could be true; the set does not seem to be inconsistent. Yet, if this is the case, TN-CTD poses a problem for all temporal solutions.
In this example, all obligations appear to be timeless or parallel; they appear to be in force simultaneously, and the antecedent and consequent in the contrary-to-duty obligation (TN3) seem to refer to one and the same time (or perhaps to no particular time at all). So, a natural symbolisation is the following:
FTN-CTD
(FTN1) O[T]¬d
(FTN2) O[¬d]¬w
(FTN3) O[d]w
(FTN4) d
where d stands for ‘There is a dog’ and w for ‘There is a warning sign’ and all other symbols are interpreted as above. Nevertheless, this set is inconsistent in many temporal alethic dyadic deontic systems. We prove this below. But first let us consider some derived rules that we use in our tableau derivation.
Derived rules
DR3 O[A]B => O[T](A→B)
DR4 O[A]B, O[A](B→C) => O[A]C
DR5 O[T](A→B), A => O[T]B, given that A is non-future.
According to DR3, if we have O[A]B, witj on an open branch in a tree, we may add O[T](A→B), witj to this branch in this tree. The other derived rules are interpreted in a similar way. A is non-future as long as A does not include any operator that refers to the future.
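As an informal illustration of DR5’s proviso (a reconstruction, not part of the official rule set), consider Scenario I again: evaluated at t2, the condition Rt2¬k contains no operator referring to the future, so DR5 applies; evaluated at t1, it refers to Friday, which is still in the future, so DR5 is blocked.

```latex
% DR5 applied at t2, where the condition Rt2¬k is non-future:
\[
  O[T](Rt_{2}\neg k \rightarrow Rt_{3}a),\quad Rt_{2}\neg k
  \;\Longrightarrow\; O[T]Rt_{3}a
\]
% At t1 the same condition refers to a future time (Friday has
% not yet arrived), so the proviso fails and O[T]Rt3a cannot be
% detached, in line with Reason (I).
```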
We are now in a position to prove that the set of sentences FTN-CTD = {FTN1, FTN2, FTN3, FTN4} is inconsistent in every temporal dyadic deontic tableau system that includes the rules T–DMO, T–Dα0 – T–Dα4, T–FT, and T–BT (Rönnedal (2018)). Here is the tableau derivation:
(1) O[T]¬d, w0t0
(2) O[¬d]¬w, w0t0
(3) O[d]w, w0t0
(4) d, w0t0
(5) O[T](¬d → ¬w), w0t0 [2, DR3]
(6) O[T](d → w), w0t0 [3, DR3]
(7) O[T]¬w, w0t0 [1, 5, DR4]
(8) O[T]w, w0t0 [4, 6, DR5]
(9) T, w0t0 [Global Assumption]
(10) sTw0w1t0 [9, T–Dα3]
(11) ¬w, w1t0 [7, 10, O]
(12) w, w1t0 [8, 10, O]
(13) * [11, 12]
This is counterintuitive, since TN-CTD seems to be consistent. This is an example of a timeless (parallel) contrary-to-duty paradox.
Can we avoid this problem by introducing some temporal operators in our symbolisation of TN-CTD? One natural interpretation of the sentences in this set is as follows: (TN1) (At t1) It ought to be that there is no dog; (TN2) (At t1) It ought to be that if there is no dog (at t1), there is no warning sign (at t1); (TN3) (At t1) If there is a dog, then (at t1) it ought to be that there is a warning sign (at t1); and (TN4) (At t1) There is a dog.
Hence, an alternative symbolisation of the sentences in TN-CTD is the following:
F2TN-CTD
(F2TN1) Rt1O[T]Rt1¬d
(F2TN2) Rt1O[Rt1¬d]Rt1¬w
(F2TN3) Rt1O[Rt1d]Rt1w
(F2TN4) Rt1d
Yet, the set F2TN-CTD = {F2TN1, F2TN2, F2TN3, F2TN4} is also inconsistent. The proof is similar to the one above. So, this move does not help. And it does not seem to be the case that we can find any other plausible symbolisation of TN-CTD in temporal alethic dyadic deontic logic that is consistent. (TN2) cannot, for instance, plausibly be interpreted in the following way: (At t1) It ought to be that if there is no dog (at t2), there is no warning sign (at t3), where t1 is before t2 and t2 before t3. And (TN3) cannot plausibly be interpreted in the following way: (At t1) If there is a dog, then (at t2) it ought to be that there is a warning sign (at t3), where t1 is before t2 and t2 before t3.
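Though the full tableau is omitted, the following compressed sketch, reconstructed on the pattern of the derivation above (and so only an approximation of the official proof), indicates why F2TN-CTD closes:

```latex
% A compressed, reconstructed sketch of the inconsistency of
% F2TN-CTD; not the full official tableau. Work at w0, t1 after
% applying the Rt rule to F2TN1-F2TN4:
\begin{align*}
  &O[T]Rt_{1}\neg d && \text{from F2TN1}\\
  &O[T](Rt_{1}\neg d \rightarrow Rt_{1}\neg w) && \text{from F2TN2 by DR3}\\
  &O[T](Rt_{1}d \rightarrow Rt_{1}w) && \text{from F2TN3 by DR3}\\
  &O[T]Rt_{1}\neg w && \text{by DR4}\\
  &O[T]Rt_{1}w && \text{by DR5, since } Rt_{1}d \text{ is non-future at } t_{1}
\end{align*}
% Any deontic alternative w1 at t1 then verifies both Rt1¬w and
% Rt1w, hence both ¬w and w at w1, t1, and every branch closes.
```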
Hence, (apparently) timeless contrary-to-duty paradoxes pose a real problem for the temporal dyadic deontic solution and other similar temporal solutions.
3. References and Further Reading
Anglberger, A. J. J. (2008). Dynamic Deontic Logic and Its Paradoxes. Studia Logica, Vol. 89, No. 3, pp. 427–435.
Åqvist, L. (1967). Good Samaritans, Contrary-to-duty Imperatives, and Epistemic Obligations. Noûs 1, pp. 361–379.
Åqvist, L. (1984). Deontic Logic. In D. Gabbay and F. Guenthner (eds.) Handbook of Philosophical Logic, Vol. II, D. Reidel, pp. 605–714.
Åqvist, L. (1987). Introduction to Deontic Logic and the Theory of Normative Systems. Naples, Bibliopolis.
Åqvist, L. (2002). Deontic Logic. In Gabbay and Guenthner (eds.) Handbook of Philosophical Logic, 2nd Edition, Vol. 8, Dordrecht/Boston/London: Kluwer Academic Publishers, pp. 147–264.
Åqvist, L. (2003). Conditionality and Branching Time in Deontic Logic: Further Remarks on the Alchourrón and Bulygin (1983) Example. In Segerberg and Sliwinski (eds.) (2003) Logic, law, morality: thirteen essays in practical philosophy in honour of Lennart Åqvist, Uppsala philosophical studies 51, Uppsala: Uppsala University, pp. 13–37.
Åqvist, L. and Hoepelman, J. (1981). Some theorems about a ‘tree’ system of deontic tense logic. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 187–221.
Bartha, P. (1993). Conditional obligation, deontic paradoxes, and the logic of agency. Annals of Mathematics and Artificial Intelligence 9, (1993), pp. 1–23.
Belnap, N., Perloff, M. and Xu, M. (2001). Facing the Future: Agents and Choices in Our Indeterminist World. Oxford: Oxford University Press.
Bonevac, D. (1998). Against Conditional Obligation. Noûs, Vol. 32 (March), pp. 37–53.
Carmo, J. and Jones, A. J. I. (2002). Deontic Logic and Contrary-to-duties. In Gabbay and Guenthner (eds.) (2002) Handbook of Philosophical Logic, vol 8, pp. 265–343.
Castañeda, H. -N. (1977). Ought, Time, and the Deontic Paradoxes. The Journal of Philosophy, Vol. 74, No. 12, pp. 775–791.
Castañeda, H. -N. (1981). The paradoxes of deontic logic: the simplest solution to all of them in one fell swoop. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 37–85.
Chellas, B. F. (1969). The Logical Form of Imperatives. Stanford: Perry Lane Press.
Chellas, B. F. (1980). Modal Logic: An Introduction. Cambridge: Cambridge University Press.
Chisholm, R. M. (1963). Contrary-to-duty Imperatives and Deontic Logic. Analysis 24, pp. 33–36.
Cox, Azizah Al-Hibri. (1978). Deontic Logic: A Comprehensive Appraisal and a New Proposal. University Press of America.
Danielsson, S. (1968). Preference and Obligation: Studies in the Logic of Ethics. Filosofiska föreningen, Uppsala.
Decew, J. W. (1981). Conditional Obligations and Counterfactuals. The Journal of Philosophical Logic 10, pp. 55–72.
Feldman, F. (1986). Doing The Best We Can: An Essay in Informal Deontic Logic. Dordrecht: D. Reidel Publishing Company.
Feldman, F. (1990). A Simpler Solution to the Paradoxes of Deontic Logic. Philosophical Perspectives, vol. 4, pp. 309–341.
Fisher, M. (1964). A contradiction in deontic logic?, Analysis, XXV, pp. 12–13.
Forrester, J. W. (1984). Gentle Murder, or the Adverbial Samaritan. Journal of Philosophy, Vol. LXXXI, No. 4, pp. 193–197.
Gabbay, D., Horty, J., Parent, X., van der Meyden, R. & van der Torre, L. (eds.). (2013). Handbook of Deontic Logic and Normative Systems. College Publications.
Greenspan, P. S. (1975). Conditional Oughts and Hypothetical Imperatives. The Journal of Philosophy, Vol. 72, No. 10 (May 22), pp. 259–276.
Hansson, B. (1969). An Analysis of Some Deontic Logics. Noûs 3, pp. 373–398. Reprinted in Hilpinen, R. (ed.) (1971). Deontic Logic: Introductory and Systematic Readings. Dordrecht: D. Reidel Publishing Company, pp. 121–147.
Hilpinen, R. (ed.). (1971). Deontic Logic: Introductory and Systematic Readings. Dordrecht: D. Reidel Publishing Company.
Hilpinen, R. (ed.). (1981). New Studies in Deontic Logic: Norms, Actions, and the Foundations of Ethics. Dordrecht: D. Reidel Publishing Company.
Horty, J. F. (2001). Agency and Deontic Logic. Oxford: Oxford University Press.
Jones, A. and Pörn, I. (1985). Ideality, sub-ideality and deontic logic. Synthese 65, pp. 275–290.
Lewis, D. (1974). Semantic analysis for dyadic deontic logic. In S. Stenlund, editor, Logical Theory and Semantical Analysis, pp. 1–14. D. Reidel Publishing Company, Dordrecht, Holland.
Loewer, B. and Belzer, M. (1983). Dyadic deontic detachment. Synthese 54, pp. 295–318.
McNamara, P. (2010). Deontic Logic. In E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
Meyer, J.-J. C. (1988). A Different Approach to Deontic Logic: Deontic Logic Viewed as a Variant of Dynamic Logic. Notre Dame Journal of Formal Logic, Vol. 29, No. 1.
Montague, R. (1968). Pragmatics. In R. Klibansky (ed.) Contemporary Philosophy: Vol. 1: Logic and the Foundations of Mathematics, pp. 102–122. La Nuova Italia Editrice, Firenze.
Mott, P. L. (1973). On Chisholm’s paradox. Journal of Philosophical Logic 2, pp. 197–211.
Niles, I. (1997). Rescuing the Counterfactual Solution to Chisholm’s Paradox. Philosophia, Vol. 25, pp. 351–371.
Powers, L. (1967). Some Deontic Logicians. Noûs 1, pp. 361–400.
Prakken, H. and Sergot, M. (1996). Contrary-to-duty obligations. Studia Logica, 57, pp. 91–115.
Rescher, N. (1958). An axiom system for deontic logic. Philosophical studies, Vol. 9, pp. 24–30.
Rönnedal, D. (2009). Dyadic Deontic Logic and Semantic Tableaux. Logic and Logical Philosophy, Vol. 18, No. 3–4, pp. 221–252.
Rönnedal, D. (2012). Extensions of Deontic Logic: An Investigation into some Multi-Modal Systems. Department of Philosophy, Stockholm University.
Rönnedal, D. (2016). Counterfactuals in Temporal Alethic-Deontic Logic. South American Journal of Logic. Vol. 2, n. 1, pp. 57–81.
Rönnedal, D. (2018). Temporal Alethic Dyadic Deontic Logic and the Contrary-to-Duty Obligation Paradox. Logic and Logical Philosophy. Vol. 27, No 1, pp. 3–52.
Rönnedal, D. (2019). Contrary-to-duty paradoxes and counterfactual deontic logic. Philosophia, 47 (4), pp. 1247–1282.
Thomason, R. H. (1981). Deontic Logic as Founded on Tense Logic. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 165–176.
Thomason, R. H. (1981b). Deontic Logic and the Role of Freedom in Moral Deliberation. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 177–186.
Tomberlin, J. E. (1981). Contrary-to-duty imperatives and conditional obligations. Noûs 15, pp. 357–375.
van der Torre, L. W. N. and Tan, Y. H. (1999). Contrary-To-Duty Reasoning with Preference-based Dyadic Obligations. Annals of Mathematics and Artificial Intelligence 27, pp. 49–78.
van Eck, J. (1982). A system of temporally relative modal and deontic predicate logic and its philosophical applications. Logique et Analyse, Vol. 25, No. 99, pp. 249–290, and No. 100, pp. 339–381. Originally published as a dissertation, University of Groningen, 1981.
van Fraassen, B. C. (1972). The Logic of Conditional Obligation. Journal of Philosophical Logic 1, pp. 417–438.
van Fraassen, B. C. (1973). Values and the Heart’s Command. The Journal of Philosophy LXX, pp. 5–19.
von Kutschera, F. (1974). Normative Präferenzen und bedingte Gebote. In Lenk, H., & Berkemann, J. (eds.) (1974), pp. 137–165.
von Wright, G. H. (1964). A new system of deontic logic. Danish Yearbook of Philosophy, Vol. 1, pp. 173–182.
Wieringa, R. J. & Meyer, J.-J. Ch. (1993). Applications of Deontic Logic in Computer Science: A Concise Overview. In J.-J. Meyer and R. Wieringa (eds.), Deontic Logic in Computer Science: Normative System Specification, pp. 17–40. John Wiley & Sons, Chichester, England.
The compactness theorem is a fundamental theorem for the model theory of classical propositional and first-order logic. As well as having importance in several areas of mathematics, such as algebra and combinatorics, it also helps to pinpoint the strength of these logics, which are the standard ones used in mathematics and arguably the most important ones in philosophy.
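For reference, the theorem admits a one-line statement (for both classical propositional and first-order logic):

```latex
% The compactness theorem: a set of sentences is satisfiable
% (has a model) if and only if every finite subset of it is
% satisfiable.
\[
  \Sigma \text{ is satisfiable}
  \iff
  \text{every finite } \Sigma_{0} \subseteq \Sigma \text{ is satisfiable}
\]
% An equivalent formulation in terms of consequence: if
% \Sigma \models \varphi, then \Sigma_{0} \models \varphi for
% some finite \Sigma_{0} \subseteq \Sigma.
```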
The main focus of this article is the many different proofs of the compactness theorem, applying different Choice-like principles before later calibrating the strength of these and the compactness theorems themselves over Zermelo-Fraenkel set theory ZF. Although the article’s focus is mathematical, much of the discussion keeps an eye on philosophical applications and implications.
We first introduce some standard logics, detailing whether the compactness theorem holds or fails for these. We also broach the neglected question of whether natural language is compact. Besides algebra and combinatorics, the compactness theorem also has implications for topology and foundations of mathematics, via its interaction with the Axiom of Choice. We detail these results as well as those of a philosophical nature, such as apparent ‘paradoxes’ and non-standard models of arithmetic and analysis. We then provide several different proofs of the compactness theorem based on different Choice-like principles.
In later sections, we discuss several variations of compactness in logics that allow for infinite conjunctions/disjunctions or generalised quantifiers, and in higher-order logics. The article concludes with a history of the compactness theorem and its many proofs, starting from those that use syntactic proofs before moving to the semantic proofs model theorists are more accustomed to today.
Robert Leek
Email: r.leek@bham.ac.uk
University of Birmingham
United Kingdom
Perspectivism in Science
Perspectivism, or perspectival realism, has been discussed in philosophy for many centuries, but as a view about science, it is a twenty-first-century topic. Although it has taken many forms and there is no agreed definition, perspectivism at its heart uses a visual metaphor to help us understand the scope and character of scientific knowledge. Several interrelated issues surround this central feature of perspectivism. It is typically an epistemic position, although debates about its scope have touched upon metaphysics as well. Modeling, realism, representation, pluralism, and justification are some of the main issues in the philosophy of science that are connected with perspectivism.
Defenders of this view aspire to develop an account of science that has a kind of realist flavor, but typically without the epistemic inaccessibility often accompanying realism. To do this, perspectivists often try, in various ways, to resist social constructivism (sometimes construed by its opponents as endorsing the social dependence of scientific knowledge). The strategy is to endorse, on the one hand, a mind-independent world that our theories track, while at the same time accommodating, on the other, the historical, experimental, and modeling contexts of scientific knowledge. Perspectival realism, therefore, has a realist element as well as a perspective-dependent element, where perspective-dependence is meant to acknowledge the importance of those contexts. The visual metaphor speaks to both of these elements because the character of a visual experience depends upon contributions from human sense organs as well as the environment (in the form of light rays). We can think of the human contribution that affects scientific knowledge as taking two forms: one associated with a historical study of science and the other associated with variation across the contemporary sciences.
This article explores the different ways that defenders of perspectivism have attempted to make use of the visual metaphor to develop a coherent account of realism while also overcoming criticisms. The article examines the visual metaphor and the general philosophical problems motivating perspectivism. Chief among these motivations is model pluralism. Next, the article details how Ronald Giere—the first philosopher in contemporary times to apply the metaphor to science—motivates a representational version of perspectivism. His account has inspired criticism and alternative applications of the metaphor and hence different ways of conceiving perspectivism. The rest of this article explores those criticisms and how philosophers have attempted to reconceive perspectivism more fruitfully and in some cases without relying so heavily on representation.
In visual experience, the human contribution involves the visual system. In applying this metaphor to scientific knowledge, there are a few ways to think about how that knowledge is human-based and what it means for there to be a human contribution. There are two ways in which the literature on perspectivism emphasizes human contribution. Massimi writes of these:
(1) Our scientific knowledge is historically situated, that is, it is the inevitable product of the historical period to which those scientific representations, modeling practices, data gathering, and scientific theories belong. [Diachronic]
And/Or
(2) Our scientific knowledge is culturally situated, that is, it is the inevitable product of the prevailing cultural tradition in which those scientific representations, modeling practices, data gathering, and scientific theories were formulated [Synchronic] (Massimi in Saatsi, 2017, page 164).
“Culturally situated” here does not mean Western Science or nationally affiliated (as in British or Indian Science). Rather, it refers to the particular scientific enterprise, such as phylogeny, ecosystem ecology, or classical field theory, though it may refer to other parts of scientific practice or theory at coarser or finer resolution. These perspectival elements are meant to be concessions to anti-realism that still allow the perspectivist to retain a form of realism. Exactly what sorts of threats realism should confront via perspectivism depends upon the perspectival account.
Strong forms of realism and of anti-realism both face substantial challenges and at the same time offer different insights about scientific knowledge. By taking a middle ground, its defenders hope, perspectivism might take on board the best the anti-realist has to offer while also endorsing key realist commitments. Those commitments, as Psillos (1999, Introduction) defines them, are these:
(1) Metaphysical. There is a mind- and theory-independent world.
(2) Epistemic. We can have justified true beliefs about the mind-independent world.
(3) Semantic. Our scientific terms (and theories) track the mind-independent world through reference.
The most ambitious perspectivist will want to endorse all three of these while at the same time acknowledging that scientific knowledge is human-dependent in some way (which is the concession to anti-realism). To say scientific knowledge is dependent upon humans typically means one of two things:
(1) Our scientific knowledge is historically situated. This is a diachronic version that takes an interest in how science has and has not changed over time.
(2) Our scientific knowledge is situated within different areas of contemporary science, whether it is within different practices, disciplines, families of models, or in some other unit of science. This is a synchronic version of perspectivism.
The synchronic character of scientific knowledge is often one of the central issues motivating debates about perspectivism. The worry perspectivists are responding to is that model pluralism could put pressure on realism as traditionally conceived. As Rueger notes (2020), the realist wants to say that the success of the best theories of science should lead us to treat those theories as true in a metaphysically and epistemically robust sense: success is our guide to a mind-independent reality. One trouble with this realist motivation is that there are several successful scientific theories, but they do not seem to offer the same description of reality; in fact, many theories and models appear to be contradictory. Unless nature is also contradictory, the realist needs a story about the plurality of successful theories, models, and explanations we see in the sciences. This is the problem of inconsistent models. Perspectival realism aspires to address this problem without retreating too far from that mind-independent position; hence, realism motivates perspectivists who take seriously the pluralism of successful theories or models.
This is not the only reason to be a perspectivist. Thinking about perspectivism diachronically may give some resources for acknowledging the success of past science and the historical path that modern science has taken. The thought here is that scientific knowledge emerges within particular contexts and is always evaluated within them. There is no view from outside of our epistemic contexts that allows us to evaluate knowledge independently of the practices that use it. This is a problem of knowledge generation and knowledge evaluation and is sometimes called the God’s-eye-view problem (2006, page 15). Realists often treat truth as epistemically unavailable (see Psillos (1999, Introduction)). Instead, they argue we must use success as a proxy for justifying our knowledge of a mind-independent world. But the trouble here is that what counts as successful is historically situated and many past theories that once seemed successful have been abandoned. The realist, therefore, needs a story about the connection between truth and success if we are to imagine that we can make claims about a mind-independent world and at the same time understand the success of past science. Perspectivism aspires to tell this story in a way that is historically sensitive and not merely instrumentalist.
Different philosophers take different realist commitments more or less seriously. The first contemporary take on scientific perspectivism, which Ronald Giere defends, is committed to both synchronic and diachronic versions, but he has been criticized for not clearly committing to the realist tenets as they are traditionally construed. Other authors, as we will see, try to have stronger endorsements of these tenets with more traditional realist interpretations. The upshot, if they are successful, is to have an epistemically accessible interpretation of science, that is, a view of scientific knowledge that is modest enough to achieve, but still robust (in some realist sense).
2. Philosophical Inspirations
Although distinct from their philosophical predecessors, some of the views discussed in this article have been influenced by Kant, Kuhn, Feyerabend, Sosa, Nietzsche, American pragmatists, and others. Some of that inspiration is vague, and the similarity between a historical figure’s views and contemporary perspectivism may be a similarity only in spirit. Although they do not describe their views as perspectival, several philosophers laid the groundwork that has made the middle ground perspectivists seek an attractive position. Kuhn (1962), Putnam (1981), and Hacking (2002) are particularly noteworthy for their interests in science, in truth, and in rationality – interests that revolve in various ways around the attempt to transcend the dichotomy between realism and antirealism, or even between objectivity and subjectivity. We can see their influence at work in contemporary perspectivism, though the labels and indeed crucial elements of these various accounts differ. Kant, Sosa, and to an extent Kuhn are implicit inspirations for Massimi’s views (see 2014[a]; 2015[a]).

Though some of the crucial philosophical predecessors do not defend accounts they would call perspectival, there is some contemporary work that attempts to more rigorously interpret those historical figures in perspectival terms. The most explicitly discussed influences are Kuhn and Feyerabend (Giere, 2016; Giere, 2013; Giere, 2009). Ronald Giere argues that Feyerabend develops a form of perspectivism, but with minimal realism. He describes how Feyerabend uses theater and perspective painting to show by analogy that science compares constructions with other constructions, but never with reality directly (2001, pages 103ff). Although this shares with Giere’s perspectivism the idea that scientists build things (models) that they then compare with other things they have built (models of data), Giere thinks Feyerabend strays much further from realism than his own account does. Giere has not given an argument for distinguishing his account from Feyerabend’s, however, and admits that he has simply avoided offering a view of truth (2016, pages 137-138). So, although Giere aspires to be a realist, the argument establishing that his account is a realist account is missing. Feyerabend does not have this aspiration; the difference between their accounts may amount to a difference in their vision for science, not the arguments they give in defense of their views.
Giere also likens Kuhn to a perspectival realist. The main differences Giere identifies are (1) that Kuhn tends to think of paradigms in terms of semantics instead of models (see 2013, page 54) and (2) that Kuhn in many places is deeply concerned with truth and metaphysics; scientific revolutions and world change are profoundly connected to metaphysics, and yet this is not a topic Giere addresses in much detail.
Pragmatism may also share some crucial similarities to some forms of perspectivism, especially those forms, such as Giere’s or Mitchell’s, that appeal heavily to the aims of scientists. Brown offers a comparison here (2009). He shows that American pragmatists, especially Peirce and Dewey, provide a more detailed account of scientific aims and how those aims feature in and structure scientific inquiry. Perspectivists (he explicitly directs his remarks at Giere’s views) could benefit from an examination of this more detailed and contextualized view of scientific aims.
3. Representational and Other Forms of Perspectivism
The first application of perspectivism to science comes from Giere (2006). His account is based on representation. Mitchell (2020) and Teller (2020) are also interested in defending perspectivist accounts for broadly similar reasons. The next section examines these representation-based accounts before turning to other characterizations of perspectivism. After that, a section examines the arguments in favor of representational perspectivism.
a. Representational Perspectivism
In his work (2006), Giere argues science should be understood in terms of a hierarchy of models and that modeling works a lot like human vision, hence the perspective metaphor (see pages 60-61 in particular). Models and representation are key features of his account, a kind of representational account. His views do not leave much room for elements of science that are not model-based. It is unclear whether this is because he thought these elements were just part of modeling practice or whether he just considered them less important than modeling practice. Giere’s examples and arguments mostly involve contemporary science, so he is most concerned with understanding perspectivism synchronically, but in places, he takes some interest in applying perspectivism to an understanding of the history of science (see for example 2016; 2013).
Giere argues vision, scientific instruments, and scientific models are all representational, selective, constructed, and all give rise to a product that is jointly determined by a human and an environmental element. In the case of human vision, our visual experience is the product of what the world is like and what our visual system is like (Giere, 2006, chapter 1). Similarly, our scientific models are the product of our modeling practices and what the world that we model is like.
His views turn on an analogy between scientific observation and regular, everyday observation because scientific instruments are selective the way vision is (Giere, 2006, chapter 2). These are analogous to scientific theorizing because theorizing also involves selecting specific features of the environment and then building models that represent those specific features (Giere, 2006, chapter 3).
Selectivity is crucial for making this comparison between vision and science. Human vision is sensitive to specific wavelengths of light while other creatures or even machines are sensitive to other specific wavelengths. Depending on what visual apparatus one has, one will experience different pictures or visual images. These images are still images of a real and external world, but their character does not solely depend upon the external world; it also depends upon the visual system we are considering. Similarly, models selectively represent different features of the world, and which features one selects will determine what model to build out of many different possible models. Depending on what sort of model one wants to build, one will make different choices.

An especially salient illustration Giere uses is brain imaging (2006, page 49ff). CAT (Computer Assisted Tomography) scans and PET (Positron Emission Tomography) scans are two common methods for scanning the human brain. The CAT scan uses a series of X-ray photographs to provide an image of the brain that tells us some of its structural properties. Technicians can alternatively use a PET scan, which uses gamma rays to detect neural activity in different parts of the brain. The activity can then be graphed in images that give us some of the functional, instead of structural, properties of the brain. These two methods of scanning are sensitive to different stimuli (X-rays and gamma rays), allowing scientists to build different images and conclusions about the brain. Depending upon what one’s interests are, one will choose a particular kind of scanning.
Representation lies at the heart of the perspectival metaphor, explains how models are epistemically valuable, and supplants truth as a basis for assessing scientific knowledge claims. Giere calls this an agent-based account of scientific representation and says “an agent represents features of the world for specific purposes” (Giere, 2006; Giere, 2010, pages 60–61).
The representing that the agent does involves some kind of similarity relation between representation and target (Giere, 2006, pages 63–67). In what way they are similar depends upon the specific purposes that the scientists using the representation have. Any model and any real-world phenomenon may have several similar features, but what makes a particular model represent a particular target is the act of using the model to represent specific features. Similarity comes in degrees and how similar is similar enough is also determined by how scientists use models.
Similarity supplants truth. Rather than claim that a successful model is true, or true of a target, Giere thinks we are only entitled to say that the model is similar (for present purposes) to the target in the relevant respects. He argues for this claim by appealing to scientific practice. For any given model, it will fail to be a perfect fit with the target system and scientists only ever specify certain similarities. Claiming the model is true is a metaphysically more committing claim that goes beyond what scientists do and what science can offer. To make the more metaphysically robust claim, we would need a model that fits the world perfectly in every respect and we have no such model.
b. Other Versions of Perspectivism Based on Representation
Teller and Mitchell defend versions of perspectivism and representation features importantly in both of their accounts, just as it did for Giere’s. Teller (2020) argues that traditional forms of realism fail to capture the representational features of scientific terms. Truth, as traditionally conceived, therefore cannot feature in a realist analysis of science. Instead, he argues that we should endorse approximate truth within well-defined contexts and that this can be called perspectivism. Representation features importantly in his account because it is the representational part of science that realism fails to capture and that gives him the argument in favor of partial truth within constrained contexts. Scientific theories and terms do have a connection to the actual world, but that representational relationship is not as rigid as the reference often associated with realism.
The upshot of this account is that we can ascribe truth to scientific claims, but we must specify with respect to what a given claim is true. That is, a claim is not true absolutely, but true relative to the aims or intentions of scientists within specific contexts. Truth is approximate because we need to specify the context when determining whether a claim is true, and even with that specification, a claim is still only true to a degree (2020, pages 58-59). The hope is that this notion of truth allows perspectivism to be a position principally about our representations (and hence it is just epistemic) and not about ontology. Therefore, there is a mind-independent, single world toward which our scientific theories and terms are directed.
Mitchell defends a perspectival account directed at the epistemology of science (2020). Her writings on perspectivism build directly upon her earlier work developing integrative pluralism (1992; 2002; 2003; 2009; 2017). She also argues it is the selectivity of model representation that makes science, especially scientific modeling, perspectival. Despite the emphasis on models and representation, her account is radically unlike Giere’s. She does not conclude that knowledge is situated within perspectives qua models but instead argues that different perspectives (models) can be integrated, and the result of this integration is knowledge of the natural world. Integration is necessary because any model is incomplete and therefore gives us accurate but partial knowledge of the target that it selectively represents.
There are still, however, questions about where and which models can be integrated. Some of the criticisms leveled against Giere’s perspectivism may apply here, especially Morrison’s example of the atomic nucleus models, which she argues show that the inconsistency between some models is quite deep and that the incompatibility cannot readily be accounted for using the selectivity of representation. It is an open question whether these kinds of examples pose a serious problem for integrative pluralism and associated forms of representation.
c. Arguments in Favor of Representational Perspectivism
Having examined key elements of the representational version of perspectivism, this section examines arguments in favor of this position. The discussion focuses on Giere because his arguments are the most influential and the most closely related views (such as those of Mitchell and Teller) find them persuasive.
There is one main argument in Giere for perspectivism and one minor argument that he uses more implicitly. The main argument, leaning on the visual metaphor, is that modeling practices, and subsequently the knowledge we get from them, are irreducibly selective, partial, and hence perspectival. He makes this argument in two stages: an instrument and then a theory stage.
i. Two Arguments for Perspectivism
In the first stage, Giere presumes that science is based upon observation and that contemporary science relies upon instruments for its observations. Because instruments are responsive to limited stimulus, just as vision is, instruments are perspectival in the same way that vision is perspectival (2006, Chapter 3). There are several key claims here.
(1) Science is built upon observation
(2) Observations in contemporary science require instruments
(3) Vision is only sensitive to a limited range of stimuli
(4) Any instrument is also only sensitive to a limited range of stimuli
(5) Different detection systems (either instruments or visual systems) offer different “perspectives” on the same object in virtue of their different sensitivities.
(6) Instruments and observation are perspectival in the same way vision is.
There is little argument in favor of 1, perhaps because it appears to be an innocuous, vague commitment to empiricism. Claim 2 is also unsupported, except that it can perhaps be intuitively seen by considering examples; it seems uncontroversial that much of modern science cannot rely solely on human vision because the subjects are either very small (molecules, atoms, DNA) or very large and distant (galaxies, stars, black holes, and so forth). To observe these things requires instruments. Giere does support claim 3 with an extensive discussion of how vision works (2006, Chapter 2) and claim 4 with case studies, such as the Compton Gamma Ray Observatory, which has a telescope that is only sensitive to gamma rays within a specific energy range (2006, pages 45-48).
In the second stage of the main argument, Giere extends the reasoning about instruments to theories (2006, Chapter 4). He argues that theories are models or sets of models, that scientists use to represent parts of the world. Theories are an extension of vision and instruments because models represent selectively, just as vision and instruments represent selectively.
The more minor argument emerges from considering examples of models that appear to be inconsistent with one another because they appear to ascribe incompatible properties to the same target, but at the same time, two or more such models may be equally valuable in some way, such as the different ways of scanning and modeling the brain. These different types of scans attribute different properties to the brain that seem to be inconsistent, but they are valuable for addressing different kinds of problems and tell us about different aspects of the brain. Other authors give greater credence to this argument.
Giere claims there are upshots to his account (briefly canvassed in 2006, pages 13-15). For one, he can accommodate the constructivist insight that science irreducibly involves a human contribution and that the products of science depend heavily upon this creative and constructive effort. If that is correct, he might be able to do justice to the history of science and the contextual nature of human knowledge, thus avoiding the problem of presupposing a God’s-eye view. Briefly, this problem concerns believing we can transcend human limitations and arrive at infallible and true claims about the world (cf. objective realism, 2006, pages 4 ff.). On the other hand, he thinks he has serious realist commitments that give us a picture of science as a reliable, empirically supported, and authoritative source of knowledge that is safe from serious skepticism. He does not defend those realist commitments explicitly.
ii. Is Giere’s Perspectivism a Kind of Realism?
The answer is not straightforward. Although Giere claims that models do represent the actual, real, mind-independent world, many of his other claims are not compatible with realism. This section explains how Giere’s views fit with the realist commitments to semantics, epistemology, and metaphysics.
The first tenet was metaphysical: there is a mind-independent world. Giere’s views do not conflict with this. Models represent features of this world and what this world is like is independent of modeling practice. So Giere can endorse this one. However, he is not explicit about his metaphysical commitments and if he thought that ontology was perspective relative (or if that relativity is entailed by his other claims), then he could not endorse the metaphysical requirement realists have. He does claim that models are never directly compared with the world, but merely with less abstract models (2006, page 68). Given this epistemic limitation, it is unclear how one could know whether one’s models were models of the real world. It is therefore unclear how Giere could be able to endorse the metaphysical realist tenet, even if it is compatible with his account in principle.
The second tenet, that we can have justified true beliefs about that external world, is less straightforward based on Giere’s writings. On the one hand, models as Giere construes them do represent the world and they allow us to have beliefs about it. So far this seems like a commitment he could endorse. However, Giere also claims that we should think of the success of our models in terms of similarity, not in terms of truth (2006, pages 64-65). Furthermore, Giere claims that if we do make claims about what is true, those claims are only true within a perspective, perhaps best understood as modeling practice (2006, page 81). These considerations strongly suggest that perspectivists should be committed to a claim being true, or not, relative to scientific practice. If truth depends upon practice, then it depends upon what models scientists build and what purposes they have in building and using models. Truth then appears to depend upon the actions of scientists and their purposes. Put this way, it appears very hard to reconcile Giere’s views with the epistemic commitment realists typically require. See Massimi (2018a, page 349) for the relativity of truth and this interpretation of Giere’s perspectivism.
It is doubtful that Giere wants to endorse the third realist tenet: our best successful scientific terms (or theories or models) track the mind-independent world through reference. This may not be a problem, however, because Giere uses representation to link our models to the actual world, and representation may just be able to do the same work as reference. Like reference, representation allows for a kind of correspondence between the world and our structures (whether they be language structures or model structures). Perhaps unlike reference, there is a lot of flexibility in Giere’s account of representation. A model can be more or less similar, depending upon the scientist’s purposes. However, representation does establish a direct link between the world and the model, even if that link lacks the kind of precision that we might associate with reference. Whether this lack of precision prevents Giere from endorsing the last realist tenet may depend upon what account of reference we endorse and what level of precision realists require. Further work would be needed to address these issues. Teller (2020) argues that reference also fails to have very high levels of precision, just as models do. Indeed, Giere elsewhere (2016, page 140) hints that linguistic analyses of science, especially those that appeal to reference, are unlikely to work. He suggests such a project may very well be “logically impossible.” If we accept this view of reference, then perspectivists could probably endorse the third realist tenet in a different form, a form that analyzes how our theories track the world in terms of a kind of flexible and contextual correspondence (either representational or reference-based).
Giere’s defense of perspectivism is realist in spirit and is in principle compatible with realist metaphysics, but it is not a full form of realism because of important deviations. Given that Giere claimed to be developing an account that fell somewhere between realism and constructivism, this may be a satisfactory outcome, though much more work would be needed to spell out more specifically where and how Giere’s views depart from, or align with, realism.
d. Other Forms of Perspectivism
The term “perspectivism” implies visual media which in turn suggests that representation is going to be important for a perspectival analysis. However, some versions of perspectivism do not take representation to be what makes science distinctly perspectival. Some of these accounts, as we will see, are in part motivated by the difficulties that representational perspectivism faces. Massimi (for example 2012; 2018b) argues it is the modality of the knowledge gained from modeling that makes science perspectival. Chang (2020) argues it is the epistemic aims that provide the perspectival element in science. Danks (2020) also takes epistemic aims to be an important part of what makes science perspectival because aims structure the way scientists exercise their conceptual capacities, which gives rise to alternative conceptual systems. Rueger (2005; 2020) argues perspectivism should be understood more metaphysically, particularly in terms of relational properties. Section 5 will examine these views further after considering criticisms leveled against perspectivism, particularly Giere’s version thereof.
4. Criticisms of Perspectivism
Perspectivism aspires to be a middle-of-the-road account that has realist commitments, but which at the same time accommodates the contextual nature of human knowledge. We saw that there was some ambiguity in Giere’s commitments to realism, but that this may not be a problem given the aim of the project. However, two main criticisms cast doubt on whether perspectivism is a unique position that occupies this middle space. Those criticisms pull in opposite directions. One claims that perspectivism is a form of more traditional realism; the other claims that perspectivism just amounts to instrumentalism.
There is also a third criticism that endorses many of perspectivism’s features but denies that the metaphor of a perspective appropriately captures the relevant features of scientific practice. These criticisms are directed at Giere’s views in particular but might apply to any perspectival account based upon representation. These criticisms do not necessarily apply to other forms of perspectivism, though they might.
a. Perspectivism as Dispositional Realism
One charge against perspectivism is that it collapses into a more traditional form of realism, such as dispositional realism. Chakravartty argues this case by undermining the support that he sees perspectivism receiving from three arguments: the argument from detection, from inconsistency, and from reference. These are primarily, but not exclusively, synchronic concerns. Some issues, such as reference, Chakravartty examines with a historical, and thus diachronic, lens. As the first step in this project of reducing perspectivism to realism, Chakravartty argues we can interpret the perspectival thesis in one of two ways, which are:
We have knowledge of perspectival facts only because non-perspectival facts are beyond our epistemic grasp.
We have knowledge of perspectival facts only because there are no non-perspectival facts to be known (2010, page 407).
The first interpretation is epistemic. It makes a claim about what we can and cannot know. Perspectivism under this reading is simply claiming that we can only know things based on the models we make; everything else is beyond our epistemic ability because models are the only means by which we come to have knowledge. The claim that all facts are perspectival does not, on its own, indicate that knowledge is limited. This is because a knowledge limit implies that there is a candidate for knowledge that, because of human nature (or cognition, scientific methods, what have you), we cannot grasp; that is, there are facts we cannot grasp. This is the first interpretation. The second interpretation, however, indicates that there is nothing beyond that which we can know and that there is, therefore, no constraint on knowledge: we can know all the facts; it just turns out that all facts are perspectival. This second reading offers an ontological claim and is an alternative reading of the perspectival view of science, as Chakravartty interprets it. The perspectivist imagines that one or both theses are supported by three different arguments.
The detection argument is that different instruments or experiments are sensitive to a limited range of stimuli and therefore only capture part of a real-world phenomenon; instruments are limited and partial detectors. Different instruments that are sensitive to different stimuli give different perspectives on the same phenomenon. The human visual system, to return to Giere’s example, is like an instrument and is only sensitive to certain wavelengths of light. Other life forms have different visual systems and are sensitive to different wavelengths of light. This consideration of how instruments work appears to suggest that all we can know, via instrumentation, is perspectival (the epistemic thesis listed above). This consideration may also suggest that because we cannot make claims about things beyond our epistemic capacities, that we have no reason to suppose that there are non-perspectival facts to be known (the ontological thesis listed above). The perspectivist hopes, at least, to find support in the limited detection afforded by instruments. Chakravartty argues (Chakravartty, 2010, section 2) that these considerations about how instruments work are still entirely compatible with robust realism. The fact that a detector has limited sensitivity does not need to suggest that a phenomenon does not have other causal features. This is why we use a variety of instruments so that we can understand the complexity of phenomena that are not fully captured by any one instrument. The incompleteness of an instrument’s detection does not require any concessions to anti-realism. So Chakravartty attempts to undermine the support perspectivists seek in representation, especially the selectivity and partiality of scientific instruments, experiments, and models.
The second argument the perspectivist appeals to is the inconsistency argument. Different models offer incompatible descriptions of the same target and if two or more such models are successful, realism is threatened (the problem of inconsistent models). The perspectivist can then say that different models offer different perspectives on the target phenomenon. Chakravartty defuses this argument by suggesting that we think of the models as ascribing different relational properties, rather than straightforward intrinsic properties (Chakravartty, 2010, section 3). Different models capture different non-perspectival dispositional properties, but the same phenomenon can have a variety of relational properties and if that is correct, we also have no need to make any concessions to anti-realism. Salt, for example, has the property of dissolving in water. But of course, salt does not always dissolve in water. The water must be in the right state, such as being unsaturated and being at the right temperature. This shows that salt does not have perspectival properties; the fact that salt sometimes dissolves in water and sometimes does not is not a perspectival fact. What this instead shows is that the property in question is more like a disposition whereby salt always has the disposition to dissolve in water, but whether it in fact does so depends upon having the correct context.
The final argument, about reference, is based on considering the history of science. The idea here is that past sciences had different technical terms or used the same technical terms with different meanings compared with contemporary science. Despite these differences, past science was met with success. The perspectival conclusion is that past science offered a different perspective on the world. Chakravartty attributes this view to Kuhn, especially the later Kuhn (Kuhn, 1990; Kuhn, 1977). Although Kuhn was not a perspectivist, Giere has interpreted him in that light (2013). The past two arguments were synchronic, but this is a diachronic form for supporting perspectivism. Chakravartty interprets this Kuhnian-perspectival hybrid view as committed to the idea that the ontology of the world is causally dependent upon the scientific theories we endorse (2010, page 411). Consequently, whenever theories change, the world literally changes. He finds this view too metaphysically incoherent to take seriously. Whether this is really the right way to interpret Kuhn and Giere is another question (see Hoyningen-Huene (1993) for a thorough examination of Kuhn’s metaphysics). Nevertheless, if one were to take this interpretation of perspectivism seriously, then perspectivism, according to Chakravartty, would be a version of ontological relativity and that would just be a form of anti-realism, not any kind of realism.
These arguments, especially the first two that Chakravartty examines, are brought to bear against a version of perspectivism that appeals to perspectival facts. Such facts do not feature in Giere’s original view and perspectivists may offer a rejoinder here if it turns out that perspectival facts are not a necessary part of perspectivism. Nevertheless, Chakravartty has given several arguments suggesting that a realist, especially a dispositional realist, can accept most of the perspectivist’s claims about the nature of instrumentation and modeling. If his arguments succeed, Chakravartty will have shown that perspectivism is unable to walk the line between realism and anti-realism and instead collapses into a more robust and traditional form of realism, especially dispositional realism.
b. Inconsistent Models and Perspectivism as Instrumentalism
Giere used model pluralism to motivate his version of perspectivism. Model pluralism also proves stimulating for other, epistemic-focused accounts of perspectivism (Rice, Massimi, Mitchell, and others). However, when a plurality of models with the same target conflict with one another, it seems less obvious that model pluralism can be compatible with realism. This is the problem of inconsistent models and it may suggest that perspectivism is just instrumentalism.
Inconsistent models are those that have the same target (represent the same thing) but ascribe to the target different properties that are incompatible with one another (Chakravartty, 2010; Mitchell, 2009; Weisberg, 2007; Longino, 2013; Morrison, 2015; Weisberg, 2012; Morrison, 2011). So, two models are inconsistent when 1) they have the same target but 2) describe the target in incompatible ways. If models give us knowledge, inconsistency poses a problem: how can a target have incompatible properties, presuming the various models representing it are successful?
Morrison argues against a perspectival interpretation of inconsistent models (2011). Her project is entirely synchronic. Using a case study from nuclear physics, she argues that perspectivism is really a form of instrumentalism. The perspectival account she accuses of instrumentalism is Giere's; whether her criticisms apply equally to other versions is less clear. Her argument is this: the nucleus of an atom can be modeled in over two dozen different ways, most of which are incompatible with one another. Take the liquid drop model: it treats the nucleus classically, yet it allows for successful predictions even though scientists know the nucleus is a quantum, not a classical, object (2011, page 350). Perspectivists would claim that each model offers a different perspective on the target. So, from the perspective of the liquid drop model the nucleus is a classical object, whereas from the perspective of, say, the shell model, the nucleus is a quantum object. This amounts to giving each model a realistic interpretation while also denying the possibility of comparing them. However, as Morrison points out, we know the liquid drop model cannot be right: the nucleus is a quantum object. So the liquid drop model is instrumentally useful but cannot be given a realistic interpretation. Meanwhile, it is unclear, according to our current best theories, why a model like the shell model, which correctly treats the nucleus as a quantum object, does not always allow for successful predictions or explanations. This case suggests that at best we can evaluate each model instrumentally, that is, assess a given model's success based on the successful predictions scientists can use it to make. It further suggests that if we want a realistic understanding of the atomic nucleus, that is, to know its essential properties, then our scientific understanding is deficient and perspectivism does not offer a viable philosophical analysis. Morrison concludes:
In this case, perspectivism is simply a re-branded version of instrumentalism. Given that we assume there is an object called ‘‘the atomic nucleus’’ that has a particular structure and dynamics it becomes difficult to see how to interpret any of these models realistically since each is successful in accounting only for particular kinds of experimental evidence and provides very little in the way of theoretical understanding. In other words, the inference from predictive success to realism is blocked due not only to the extensive underdetermination but also the internal problems that beset each of the individual models (Morrison, 2011, page 350).
Morrison argues here that the connection realists typically presume between truth and success cannot be established for her case study: none of the many models of the nucleus can plausibly be regarded as true, since each has substantive problems and affords only some kinds of predictive success. So any realism, including perspectival realism, is going to fail here.
A perspectivist like Giere might want to say of these models of the nucleus that each offers a different perspective affording different predictive successes. Each model offers true claims about the nucleus, but that truth is only evaluable from within particular models. This interpretation, Morrison thinks, does not work because we take there to be a phenomenon, the atomic nucleus, that has definite properties irrespective of any particular model. The various models scientists use ascribe incompatible properties to the nucleus. Therefore, these models cannot be given a realistic interpretation: they conflict, and conflicting models cannot tell us what the true structure of the nucleus is. How successful this criticism of perspectivism is will depend heavily upon whether one thinks the nucleus has a definite set of properties, an assumption Giere is unlikely to endorse. Morrison's argument hinges upon this assumption, and it is therefore doubtful that she has really taken on board the perspectivist's central commitment, namely that there is no such thing as understanding a phenomenon independently of some model.
Nevertheless, a perspectivist may still need to account more thoroughly for the intuition that phenomena do exist and have definite properties independently of any given model. This intuition may not seem pressing when we examine different models of the same phenomenon that do not obviously conflict with one another. In such cases, it is easy to make sense of model-dependent knowledge because we might claim, as Giere does, that different models select different properties to represent. This implies that there is a single phenomenon with a variety of properties, and depending upon which we select, we get different models, even though those models have the same target.
However, if the properties associated with the different models do conflict, then the partial selectivity of models does not make as much sense, and the perspectivist ought to have an explanation for what is going on in such cases. This is because the models presumably cannot just be selecting a subset of the properties a target has; such selection is impossible because no target can have inconsistent properties. A realistic interpretation of the various models that partially represent the same target is therefore impossible when those models ascribe inconsistent properties to the target. Inconsistent models that are successful thus seem to pose a serious threat to perspectivism, unless one were willing to hold that a target can have inconsistent properties, which would be difficult to endorse.
Inconsistent models that are successful pose a threat not just to perspectivism, but to realism more generally because most forms of realism seek to infer truth based on success. Inconsistent models seem to show that one can have success without truth and if that is correct, they strike realism at its core.
c. Suitability of the Perspective Metaphor
Chirimuuta criticizes the suitability of the metaphor of the perspective (2016). She argues that philosophers gloss over the fact that the metaphor of a perspective can be interpreted (or used) in three distinct ways, each of which offers different implications for the relationship between scientific knowledge and metaphysics. Those three features of the metaphor are partiality, interestedness, and interaction. Giere uses all three when discussing how scientists use models to represent. A given model is only selectively used to represent parts of the natural world (partiality) (2006, page 35). Which parts of the natural world a model represents is determined in part by the interests of the scientists building and using the model (interestedness) (2006, page 29). Finally, a model is able to represent because it is the product of an interaction with the natural world (interaction) (2006, pages 31-32). These are all logically distinct features of modeling, even though each plays a crucial part in Giere’s account.
Chirimuuta interprets Giere's model-based account as endorsing all three without clearly distinguishing between them. Therefore, some criticisms of Giere are misapplied because they only target a single understanding of the perspective metaphor. She argues that distinguishing the various features of modeling practice would be easier with a haptic metaphor, that is, by considering touch instead of vision. This metaphor is better, she argues (2016, pages 752 ff.), because our sense of touch more obviously requires the three features listed above than does vision. Particularly important for other criticisms of Giere is the emphasis on interaction. Morrison's and Chakravartty's claims that perspectivism is just instrumentalism or realism presume that perspectivism only involves partiality or interestedness, but not interaction (2016, page 754). If we presumed that scientific modeling provides an objective mirroring or morphism of some kind, then perspectivism would look a lot like realism for some successful cases and a lot like instrumentalism for cases where models do not appear to give us an objective picture. But, Chirimuuta argues, this is not how models work, not how vision works, and not how touch works. The model is the result of an interaction, and the interaction is not a mirroring but a physical manipulation that changes the world in addition to allowing scientists to achieve given ends. There is a strong parallel with Chang's active realism (2017b; 2017a), whereby realism is understood in more practical terms.
There are two worries with this proposal. One is that some kinds of modeling practice, such as some kinds of astronomy, do not seem to involve the kind of interaction Chirimuuta describes (Cretu, 2020). The other is that active realism, and probably Chirimuuta's haptic realism too, require a radical re-conception of what realism entails. How reasonable it is to call these views realist will therefore depend upon what one thinks a realist account should provide.
5. Defenses of Perspectivism
A representational form of perspectivism is ambitious. Giere, for instance, attempts, on the basis of a theory of representation, to give an epistemic and ontological treatment of science. He hopes that in so doing we can reject both strong forms of objective realism that presuppose a God's-eye view and relativism. This is a difficult balance to strike, and the main criticisms of his view focus on the instability of walking a path between realism and relativism. The charge his critics make is that his weak realist commitments open the door to ontological relativity or instrumentalism; at the same time, if those realist commitments are made more robust, then perspectivism looks like a realist position with some interesting methodological commitments but few new insights. Either way, philosophers would be back to square one with a dichotomy between realism and some form of anti-realism. To make perspectivism more robust, several philosophers have attempted to restrict it to the epistemology, and not the metaphysics, of science. The hope is that perspectivism so restricted can avoid the issues Giere faced while remaining true to the original project of mediating between realism and other views.
There are several ways one can attempt to restrict perspectivism to epistemology. One can do so while sharing with Giere a foundation laid in representation or one can develop perspectivism as a view about modal knowledge. These approaches are not without issues.
a. Overcoming the Problem of Inconsistent Models
Rueger (2005) and Massimi have, in different ways, attempted to use perspectivism to defuse the threat to realism posed by the problem of inconsistent models. Rueger argues that in cases where different models appear to ascribe multiple, incompatible intrinsic properties to a target system, we should instead understand each model as offering a perspective. Because the properties in question are relativized to a perspective, he thinks we should understand them not as intrinsic properties, as Morrison does, but as relational properties. So construed, models that appear inconsistent are not, because each model ascribes different relational properties to the same target. This defuses the problem of inconsistent models: relational properties do not conflict with one another the way intrinsic properties might, because they only manifest in certain conditions.
Rueger's approach is similar in spirit to Chakravartty's in that both take different models to ascribe non-intrinsic properties to target systems (in Chakravartty's case, dispositional properties). Both views can be realist because there can be a fact of the matter about whether a given target system has the property under consideration, regardless of the perspective in question. But the way we study a given property is model-relative (and thus dependent on perspective).
There is an epistemic as well as an ontological feature of this general approach to defusing the problem of inconsistent models. The epistemic element commits us to the idea that perspectival modeling is required for the examination of properties in the actual world, that is, different properties require different modeling approaches in order for us to study them. The ontological element commits us to regarding a specific class of properties as the real, actual properties. That class does not include intrinsic properties, only dispositional properties (Chakravartty) or relational properties (Rueger). The success of these accounts will of course depend upon what kinds of properties there actually are and whether we can know them. There is also a question about whether this kind of realism should be considered perspectival, that is, whether there is anything distinctly perspectival about this form of realism.
b. Perspectivism and Modality
The criticisms of Giere’s account suggested that a perspectival account that is ontological may not be able to walk the middle path between realism and constructivism or instrumentalism. Focusing perspectivism on the epistemology of science might allow for a middle path without the issues that Giere’s account faced. One way to do this was to build a perspectival account using the selectivity of representation. Such an approach follows Giere not only in emphasizing the representational parts of science; it also uses the same type of vision metaphor. However, there are other ways of developing this metaphor and there are other ways to develop an account of perspectivism that do not have their logical origin in representation. Massimi has developed an account of perspectivism along these lines. Although representation plays a much smaller role, models are still central.
i. Modality and Inconsistent Models
Inconsistent models are a motivation for perspectivism, but also a problem. Massimi deflates the problem they pose and argues for a perspectival interpretation (2018b) that is epistemic but not based on representation in the way that Mitchell's and Teller's accounts are. The deflationary argument is this: critics charge that perspectivism has a weak commitment to realism because, upon critical examination, perspectivism yields a version of metaphysical inconsistency if given a strong ontological reading, a version of dispositional realism if given a weaker reading (Chakravartty), or a version of instrumentalism (Morrison). Therefore, perspectivism is not a middle ground between realism and anti-realism. This criticism, Massimi argues, presumes that perspectivism should be understood as a position about representation (2018b, sections 3-4).
Her argument is that although models are about target systems, the aboutness does not stem from a mapping between elements of the model and elements of the target system. This is a key assumption underpinning representational accounts of perspectivism: there is a relation between parts of the model and selected features of the world and that relation is mapping or correspondence of some kind. Instead, the aboutness is associated with the modal knowledge we get from what she calls perspectival models (2018b, section 5).
To deflate the inconsistent-models argument against perspectivism, it is helpful to understand how Massimi characterizes it. She suggests it rests on two assumptions: the representationalist assumption (scientific models are epistemically valuable because they are used primarily to veridically represent a target system) and the pluralist assumption (there is more than one model that represents the same target system) (2018b, pages 335-336). If we take these models to be veridical (and not just instrumental), then we have a problem for realism because a collection of models of the same target appears to ascribe conflicting properties to it. Perspectivism was supposed to help with this issue, but as we saw from Morrison and Chakravartty, if perspectivism maintains the representationalist assumption, one seems forced down one of two roads: either interpret the models instrumentally to avoid inconsistently ascribing properties to a target system, or endorse a very strange ontology whereby one target system can have incompatible properties (or accept dispositional realism). However, we are only forced into this choice if we commit to the representationalist assumption. Massimi argues we should do away with this assumption.
The representationalist assumption can be broken down into two more specific commitments. One is that a model, in representing a target, offers a mapping that involves a correspondence between elements in the model and elements of the target system. It might be (indeed, likely must be) the case that only a subset of elements have this mapping. This is consistent with thinking that models, through abstraction and idealization, are partial and selective representations of their targets.
The other commitment is that the target system is a state of affairs. We should understand these states of affairs as the ontological grounds for the success of a given model. According to this Armstrong-inspired picture, models are (approximately) true or false, and what makes them true or false are states of affairs. States of affairs are composed of particulars and properties (Armstrong, 1993). Within this framework, a model is (approximately) true if the properties it ascribes to particulars are in fact properties of those particulars in the actual world. The appeal to 'approximately' here indicates that a model may not ascribe to a particular all of the properties it has; but if the model is approximately true, then it must at least correctly ascribe a subset of those properties.
There are two problems with this picture of modeling, Massimi argues. The first is that the Armstrongian assumptions underlying representationalism are too strict and, in being too strict, cannot account for falsehoods (Massimi, 2018b, pages 345–347). The second is that mapping is a poor criterion for distinguishing scientific from non-scientific models: too many things that are just not scientific models could count as scientific models (Massimi, 2018b, pages 347–348).
Toward building a positive account of perspectivism (2018b, section 5), Massimi argues that we should do away with the representationalist commitment: models do not give us knowledge by ascribing essential properties to particulars. Instead, models give us modal knowledge by allowing scientists to ascribe modal properties to particulars. She writes:
I clarify the sui generis representational nature of these perspectival models not as ‘mapping onto’ relevant partial—actual or fictional—states of affairs of the target system but instead as having a modal component. Perspectival models are still representational in that they have a representational content (i.e., they are about X). But their being about X is being about possibilities (2018b, pages 348-349).
This modal account of perspectivism does not do away with representation, but representation should not be understood as mapping, nor should it be understood as allowing for establishing truth via states of affairs. Instead, perspectivism indicates the knowledge we get from modeling is modal knowledge: knowledge about what is possible, impossible, and necessary. This knowledge applies to actual-world systems (hence there is a loose sense in which modal models are representational), but the notion of representation at play here is much weaker than accounts committed to representationalism.
Some open questions remain. What is this weaker notion of representation such that scientists can use models to make modal claims about the actual world? If this representation does not involve mapping states of affairs, what does it involve? We might also wonder what class of models this account covers. If it applies only to models that scientists use to eliminate possible explanations, then the scope is quite narrow. If it applies more generally to models that we might more intuitively think of as representational in the traditional sense (that is, as involving some kind of mapping), how do we characterize modal knowledge in such a way that the result of the analysis matches what scientists appear to be doing with their models? And finally, what kind of realism is perspectival realism?
ii. Perspectival Truth
Giere's perspectival account took a deflationary, even anti-realist, stance toward truth. He argued that claims are true only within a perspective, that is, it makes no sense to ask whether a claim is true simpliciter. Instead, we can only assess the truth of a claim using the resources of specific models or families of models (2006, page 81). As we saw earlier, if one wants to develop a robust conception of perspectivism with realist bite, Giere's account may feel unsatisfactory because it fails to endorse strong metaphysical commitments. Indeed, without metaphysical commitments, perspectivism may stray too close to instrumentalism. Massimi has attempted to develop a more robust conception of perspectival truth that avoids instrumentalist readings.
The main points of this more robust conception of truth within a perspectival account are articulated in “Four Kinds of Perspectival Truth” (Massimi, 2018a). The general aim of such an account is to avoid antirealism, especially its constructivist forms, while also avoiding reliance on a God's-eye view for evaluating science, that is, the view that we can make inferences from success to truth as though we could evaluate science from a privileged epistemic position. To achieve this, Massimi defends the idea that scientific knowledge claims can be ontologically grounded while also perspective-relative. Overcoming this apparent dichotomy rests on a distinction between the context of use and the context of assessment, a distinction originally motivated by MacFarlane (2005) in general epistemology but here adapted for problems in the philosophy of science. This view is heavily motivated by the diachronic character of perspectivism but is relevant for the synchronic character as well. Massimi says:
Each scientific perspective—I suggest—functions then both as a context of use (for its own knowledge claims) and as a context of assessments (for evaluating the ongoing performance-adequacy of knowledge claims of other scientific perspectives). Qua contexts of use, scientific perspectives lay out truth-conditions intended as standards of performance-adequacy for their own scientific knowledge claims. Qua contexts of assessments, scientific perspectives offer standpoints from which knowledge claims of other scientific perspectives can be evaluated. [emphasis in original] (Massimi, 2018a, pages 356-357).
Two crucial concepts in this passage are the context of use and the context of assessment. The context of use is straightforward; it is the context in which knowledge claims are developed or used. In using the knowledge claims, scientists are not necessarily evaluating them. Evaluation is the task of the second context (the context of assessment), which gives us the means for evaluating scientific claims, both the claims used in the current perspective as well as the claims of past or different perspectives. The evaluation requires standards of performance-adequacy, which amount to the truth-conditions for the scientific claims under consideration. It is this element of truth, that is, truth evaluation, that is perspective-relative; whether a claim is true is not relative to anything. Massimi argues that:
Knowledge claims in science are perspective-dependent when their truth-conditions (understood as rules for determining truth-values based on features of the context of use) depend on the scientific perspective in which such claims are made. Yet such knowledge claims must also be assessable from the point of view of other (subsequent or rival) scientific perspectives (2018a, page 354).
The idea here is that how we determine the truth of a given knowledge claim is perspective-sensitive, that is, the rules we use for determining truth-values depend upon the particular modeling practices in which those claims are evaluated. At the same time, whether those claims are actually true does not depend upon those rules, nor indeed upon any other features of scientific practice. So the distinction amounts to a difference between how we come to recognize a claim as true and whether that claim is in fact true. We can understand the rules as standards of performance-adequacy (2018a, page 354), and they are various. They can include the traditional epistemic values as Kuhn articulated them: values such as empirical adequacy, consistency, and fruitfulness (1977, chapter 13).
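The use/assessment machinery that Massimi adapts can be glossed in MacFarlane's double-indexed form (a simplified schematic of MacFarlane's (2005) apparatus, not Massimi's own notation):

\[
\llbracket S \rrbracket^{\,c_u,\,c_a} = \text{True} \;\iff\; S \text{ meets the standards of performance-adequacy laid out by } c_u \text{, as evaluated from the standpoint of } c_a
\]

Here the context of use $c_u$ fixes the truth-conditions under which the claim is advanced, while the context of assessment $c_a$ supplies the standpoint from which its ongoing adequacy is evaluated. Varying $c_a$ across perspectives is what makes cross-perspectival assessment possible without a view from nowhere.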
But how are we to establish whether a claim is true and not merely assessed as true? This issue is addressed by the caveat, mentioned in the quotation above, that knowledge claims must be the kinds of claims that can be assessed from other perspectives. Now, this might appear to be the thesis that a knowledge claim considered true in two or more perspectives is more true than one considered true in just one perspective. The problem with this thesis is that it commits us to a view from nowhere, and such a perspective-free position is not only impossible according to Massimi's account; it is also in general unclear how one could make true evaluations from such a position anyway, because we have not specified what standards of assessment we should be using.
Instead, we should understand the standards of performance-adequacy as cross-perspectival and as such, a knowledge claim must satisfy them regardless of the perspective we are using. What counts as an instance of a given standard will vary. For example, what is counted as precise for Newton will not necessarily count as precise in 21st-century high energy physics; how scientists determine what is precise depends upon the experimental tools, theoretical constraints, the questions driving research, and other features of the scientific practice. Despite these deep differences, precision is still a standard that both Newton and contemporary physicists use to evaluate scientific claims.
It is important that these standards are not relative to perspectives. Otherwise, a given perspective would license its own truth and consequently place weak constraints on what claims count as true. Instead, when scientists advance a scientific claim, it is with the hope and expectation that it will satisfy epistemic standards not only as they are currently understood in that particular context, but in future contexts as well.
6. Ontological Commitments of Perspectivism
Most of the authors discussed so far, with the possible exception of Giere, try to apply perspectivism only to the epistemic elements of science, that is, to knowledge claims and how we assess them, but not to their truth. By restricting perspectivism in this way, some of its defenders hope that robust metaphysical commitments remain possible, that is, commitments that are not relative to or dependent upon human activity or cognition. We saw this with Massimi's conception of perspectival truth. This was not the case, however, with Giere, who attempted to recast realism such that the very idea of strong metaphysical commitments is not tenable. His account is not strongly realist, even though he did not endorse antirealism. There is, however, an argument that not only Giere's account but perspectivism more generally cannot avoid a form of antirealism: once the door is open to epistemic forms of perspectivism, ontological perspectivism is a possible or perhaps necessary consequence.
Chang argues (Chang, 2020) that perspectivism cannot be applied only to epistemic elements of science but must also include ontology. This pushes the view firmly away from traditional characterizations of realism. He likens a perspective to a conceptual framework and argues that 1) perspectives are typically incommensurable and 2) each perspective offers its own true account of a domain (2020, page 22):
Any phenomenon of nature that we can think or talk about at all is couched in concepts, and we choose from different conceptual frameworks (as C. I. Lewis emphasized), which are liable to be incommensurable with each other. If we take “perspective” to mean a conceptual framework in this sense, then we can see that ontology itself is perspectival.
Chang’s argument here is that the only access we have to the world is via concepts and there is a plurality of conceptual systems for describing the world (multiple perspectives). There is no trans-perspectival method for deciding amongst conceptual systems. The choice of perspective is pragmatic in that it depends upon the interests of the scientists. A consequence of this is that ontological claims can only be made from within a system, and it is only within that system that the claims are true or false. Ontology, therefore, depends upon perspective.
Note, however, that the perspectival metaphor is not, at this point, doing much work. Recall that part of the metaphorical workings involved the idea that our visual experience depends in part upon what the world is like and upon our visual system. Chang is either denying, using the metaphor, that visual experience depends upon what the world is like, or (more charitably) he is denying that we can give an account of what contributes to our visual experience that is independent of that experience. Either way, we cannot clearly distinguish between what the world is contributing to science and what human cognition and activity contributes. Any account of science, even those making use of a visual metaphor, will have to confront metaphysics.
We can also interpret Danks as giving a version of perspectivism that must confront scientific ontology. He argues that our concepts are shaped by the goals of those using them and that, so shaped, those concepts structure human perception and language (Danks, 2020). Initially, this seems like an epistemic position about human cognition and how that cognition is shaped by goals.
However, it may not be a stable position when interpreted epistemically: to remain purely epistemic, one would need a method for demarcating the ontology from the epistemology, and this seems impossible if human perception, and cognition more generally, are shaped in the way Danks suggests. How could we ever be in a position to judge how our concepts and perceptions deviate from reality? For specifically perceptual cases, which are Danks' primary examples, scientists have instruments that can be used to evaluate human perception in specific experimental contexts: there is a tool independent of our perceptions that allows us to make claims about some of our perceptual concepts. But this strategy is quite specific. Can it be employed more generally for human concepts, or even scientific concepts? What would be the instruments we would use to evaluate such concepts?
7. Constructive Empiricism and Perspectivism
Massimi is not the only philosopher to use perspectivism as a way of distancing our understanding of science from strongly representationalist accounts, that is, accounts that treat scientific representations as mirrorings, isomorphisms, mimeses, or some other relation involving veridical mapping. Van Fraassen also uses the metaphor of perspective to this end (2008), although his account is directed toward representation in general and is not directly concerned with inconsistent models, realism, and the other problems motivating the rest of the perspectival literature.
He argues that instrumentation, as well as theorizing, is a form of representation (2008, see in particular chapters 6 and 7), though not mimetic representation, as he calls it (2008, page 3). Instead, the act of representing situates a measured object or theoretical model in a particular epistemic context. Scientific representations are therefore indexical because they show us what an object is like from within a particular perspective (or epistemic context; he uses these expressions interchangeably in many places).
Van Fraassen continually appeals to the visual as he develops his account of representation, both as a foil and as an inspiration. He argues that pictorial images can lead us to think that representation is a mimetic relationship. This persuasive idea was famously condemned by Goodman, and van Fraassen introduces both the idea and Goodman's critique of it at the very beginning of Scientific Representation (2008, page 13). The alternative, he argues, is that a representation is an artifact that people (such as scientists) use in context to represent an object as something else (2008, page 21). His view resembles Giere's in a few respects: van Fraassen appeals to use, context, and agents. There are also crucial differences. Van Fraassen includes the notion of representation as (for example, the representation of the atomic nucleus as a quantum object); his account is also much more detailed and is not so tightly designed to address modeling practice. Indeed, van Fraassen is concerned with representation in general and therefore expects his account to apply not just to science, but also to art, photography, and other forms of representation.
8. References and Further Reading
Armstrong, David M (1993). “A world of states of affairs”. In: Philosophical Perspectives 7, pp. 429–440.
Brown, Matthew J (2009). “Models and perspectives on stage: remarks on Giere’s scientific perspectivism”. In: Studies in History and Philosophy of Science Part A 40.2, pp. 213–220.
Chakravartty, Anjan (2010). “Perspectivism, inconsistent models, and contrastive explanation”. In: Studies in History and Philosophy of Science Part A 41.4, pp. 405–412.
Chang, Hasok (2017a). “Is Pluralism Compatible with Scientific Realism?” In: The Routledge Handbook of Scientific Realism. Ed. by Juha Saatsi. London and New York: Routledge, pp. 176–186.
Chang, Hasok (2017b). “Pragmatist coherence as the source of truth and reality”. In: Proceedings of the Aristotelian Society 117.2, pp. 103–122. issn: 14679264. doi: 10.1093/arisoc/aox004.
Chang, Hasok (2020). “Pragmatism, perspectivism, and the historicity of science”. In: Understanding Perspectivism. Ed. by Michela Massimi and Casey D McCoy. New York: Taylor and Francis, pp. 10–27.
Chirimuuta, Mazviita (2016). “Vision, perspectivism, and haptic realism”. In: Philosophy of Science 83.5, pp. 746–756.
Cretu, A (2020). “Perspectival realism”. In: Encyclopedia of Educational Philosophy and Theory. Springer, Singapore.
Danks, David (2020). “Safe-and-Substantive Perspectivism”. In: Understanding Perspectivism. Ed. by Michela Massimi and Casey D McCoy. New York: Taylor and Francis, pp. 127–140.
Feyerabend, Paul (2001). Conquest of abundance: A tale of abstraction versus the richness of being. University of Chicago Press.
Giere, Ronald N (2006). Scientific Perspectivism. University of Chicago Press.
Giere, Ronald N (2009). “Scientific perspectivism: behind the stage door”. In: Studies in History and Philosophy of Science Part A 40.2, pp. 221–223.
Giere, Ronald N (2010). “An agent-based conception of models and scientific representation”. In: Synthese 172.2, p. 269.
Giere, Ronald N (2013). “Kuhn as perspectival realist”. In: Topoi 32.1, pp. 53–57.
Giere, Ronald N (2016). “Feyerabend’s perspectivism”. In: Studies in History and Philosophy of Science Part A 57, pp. 137–141. issn: 18792510. doi: 10.1016/j.shpsa.2015.11.008.
Hacking, Ian (2002). “Historical Ontology”. In: In the Scope of Logic, Methodology and Philosophy of Science: Volume Two of the 11th International Congress of Logic, Methodology and Philosophy of Science, Cracow, August 1999. Ed. by Peter Gärdenfors, Jan Woleński, and Katarzyna Kijania-Placek. Dordrecht: Springer Netherlands, pp. 583–600. isbn: 978-94-017-0475-5. doi: 10.1007/978-94-017-0475-5_13. url: https://doi.org/10.1007/978-94-017-0475-5_13.
Hoyningen-Huene, Paul (1993). Reconstructing scientific revolutions: Thomas S. Kuhn’s philosophy of science. University of Chicago Press.
Kuhn, Thomas S (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Kuhn, Thomas S (1977). The essential tension: Selected studies in scientific tradition and change. Chicago and London: University of Chicago Press. isbn: 022621723x.
Kuhn, Thomas S (1990). “The Road Since Structure”. In: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1990, pp. 3–13.
Longino, Helen E (2013). Studying human behavior: How scientists investigate aggression and sexuality. University of Chicago Press.
MacFarlane, John (2005). “Making sense of relative truth”. In: Proceedings of the Aristotelian Society 105.1, pp. 305–323.
Massimi, Michela (2012). “Scientific Perspectivism and its Foes”. In: Philosophica 84, pp. 25–52.
Massimi, Michela (2014). “Natural Kinds and Naturalised Kantianism”. In: Nous 48.3, pp. 416–449.
Massimi, Michela (2015). “Walking the line: Kuhn between realism and relativism”. In: Kuhn’s Structure of Scientific Revolutions-50 Years On. Ed. by William J. Devlin and Alisa Bokulich. Springer, pp. 135–152.
Massimi, Michela (2018a). “Four kinds of perspectival truth”. In: Philosophy and Phenomenological Research 96.2, pp. 342–359.
Massimi, Michela (2018b). “Perspectival modeling”. In: Philosophy of Science 85.3, pp. 335–359. issn: 00318248. doi: 10.1086/697745. url: https://www.journals.uchicago.edu/doi/abs/10.1086/697745.
Mitchell, Sandra D (1992). “On pluralism and competition in evolutionary explanations”. In: American Zoologist 32.1, pp. 135–144.
Mitchell, Sandra D (2002). “Integrative pluralism”. In: Biology and Philosophy 17.1, pp. 55–70.
Mitchell, Sandra D (2003). Biological complexity and integrative pluralism. Cambridge University Press.
Mitchell, Sandra D (2009). Unsimple truths: Science, complexity, and policy. University of Chicago Press.
Mitchell, Sandra D (2020). In: Understanding Perspectivism: Scientific Challenges and Methodological Prospects. Ed. by Michela Massimi and Casey D McCoy. New York: Taylor and Francis, pp. 178–193.
Mitchell, Sandra D and Angela M Gronenborn (2017). “After Fifty Years, Why Are Protein X-ray Crystallographers Still in Business?” In: The British Journal for the Philosophy of Science 68.3, pp. 703–723.
Morrison, Margaret (2011). “One phenomenon, many models: Inconsistency and complementarity”. In: Studies In History and Philosophy of Science Part A 42.2, pp. 342–351.
Morrison, Margaret (2015). Reconstructing Reality: Models, Mathematics, and Simulations. Oxford University Press.
Psillos, Stathis (1999). Scientific realism: How Science Tracks Truth. London and New York: Routledge.
Putnam, Hilary (1981). Reason, Truth and History. Cambridge: Cambridge University Press. isbn: 0-521-23035-7.
Rueger, Alexander (2005). “Perspectival models and theory unification”. In: The British journal for the philosophy of science 56.3, pp. 579–594.
Rueger, Alexander (2020). “Some perspective on perspectivism”. In: Metascience 29.2, pp. 193–196. issn: 1467-9981. doi: 10.1007/s11016-020-00501-7. url: https://doi.org/10.1007/s11016-020-00501-7.
Saatsi, Juha, ed. (2017). The Routledge Handbook of Scientific Realism. New York: Routledge. isbn: 9780203712498. doi: 10.4324/9780203712498. url: https://www.taylorfrancis.com/books/9781351362917.
Teller, Paul (2020). “What is Perspectivism, and does it count as realism?” In: Understanding Perspectivism. Ed. by Michela Massimi and Casey D McCoy. New York: Taylor and Francis, pp. 49–64.
van Fraassen, Bas (2008). Scientific Representation. Oxford: Oxford University Press.
Weisberg, Michael (2007). “Who is a Modeler?” In: The British journal for the philosophy of science 58.2, pp. 207–233.
Weisberg, Michael (2012). Simulation and similarity: Using models to understand the world. Oxford University Press.
Anti-natalism is the extremely provocative view that it is either always or usually impermissible to procreate. Some find the view so offensive that they do not think it should be discussed. Others think their strong intuitive disagreement with it is enough, by itself, to reject all arguments for anti-natalism. In the first twenty years of the twenty-first century, however, a distinct literature emerged that addressed anti-natalism, and sophisticated arguments both in favour of and against it have been developed and defended. Philanthropic arguments for anti-natalism, that is, arguments motivated by concern for the human beings who would be created (as opposed to misanthropic arguments), focus on the harm done to individuals who are brought into existence. For example, David Benatar's Asymmetry Argument says that it is wrong to procreate because of an asymmetry between pleasure and pain: the absence of pain is good even if no one experiences that good, whereas the absence of pleasure is not bad unless someone is deprived of it. Since everyone who comes into existence will inevitably experience nontrivial harm, it is better that they not be brought into existence, for no one is harmed by their non-existence. Other philanthropic arguments include the idea that individuals cannot consent to their creation, that procreating necessarily involves creating victims, and that procreation involves exploiting babies in order to get fully formed adults. Misanthropic arguments for anti-natalism, on the other hand, appeal to the harm that individuals who are brought into existence will cause, including the harms humans inflict upon each other, other animals, and the environment. Finally, it has also been recognized that if we have a duty to relieve extreme poverty when possible, there may be a corresponding duty for both the rich and the poor to cease procreating.
There are numerous ways to expand the debate about anti-natalism. For instance, scholars of religion have had little to say about anti-natalism, but it is unclear that they can completely dismiss certain of these arguments out of hand. Additionally, the debate about anti-natalism has primarily been conducted within the context of Western philosophy. It is an open question how the arguments for anti-natalism would be evaluated by various non-Western ethical theories. Finally, environmental ethics and population ethics have had little to say about anti-natalism, and as such there are many avenues for further exploration.
This section outlines important philanthropic arguments for anti-natalism, which focus on the harm done to individuals who are created. Philanthropic arguments are particularly controversial because they tend to conclude that it is always all-things-considered impermissible to procreate. The specific arguments outlined in this article include the Asymmetry Argument, the Deluded Gladness Argument, the Hypothetical Consent Argument, the No Victim Argument, and the Exploitation Argument. This section concludes by briefly examining the broader implications of philanthropic arguments.
a. Benatar’s Asymmetry Argument
The South African philosopher David Benatar is probably the most influential contemporary proponent of anti-natalism. Although, as we will see later, he has also offered a misanthropic argument for anti-natalism, he is best known for defending a strong philanthropic argument which says that it is always impermissible to procreate.
Benatar’s main defense of philanthropic arguments is to be found in his book, Better Never to Have Been: The Harm of Coming into Existence (2006). Since its publication, he has defended the main lines of argument in this book from various critiques and appears not to have wavered from his initial conclusions. Benatar explains that “[t]he central idea of [his] book is that coming into existence is always a serious harm” (2006, 1; emphasis added). He is well aware that the strong evolutionary tendency towards optimism means that many will find such a conclusion repulsive. Finally, while Benatar focuses most of his discussion on human procreation, he is clear from the beginning that his argument applies to all sentient beings because they are capable of experiencing harm (2006, 2).
How does Benatar arrive at such a controversial conclusion? Consider that many people hold that procreation is often permissible because most individuals who come into existence believe that their lives are worth living. In other words, many of us think our lives are worth living despite facing a certain number of obstacles and difficulties throughout our lives. Moreover, a problem about personal identity raised by the twentieth-century moral philosopher Derek Parfit complicates matters further. This problem, the non-identity problem, raises questions about whether it is even possible for an individual with an extremely low quality of life to coherently wish that their life had gone differently (1996). For example, if Sally had been born to different parents or in different circumstances, it is doubtful that she would really be the same person at all, rather than some other, different person, Sally*. Benatar argues that all of this is the result of a simple mistake. He suggests that the non-identity problem only arises because people frequently conflate a life worth continuing with a life worth starting. According to Benatar, these are hardly the same, because the former judgment is one that a person who already exists makes about their own life, while the latter is a judgment about a potential though non-existent being (Benatar 2006, 19-23). Benatar's thesis is that no lives are worth starting, even though many lives are worth continuing once they have been started.
One of the main ways that Benatar defends this view is by appealing to important asymmetries between non-existence and existence. For Benatar, “there is a crucial difference between harms (such as pains) and benefits (such as pleasures) which entails that existence has no advantage over, but does have disadvantages relative to, non-existence” (Benatar 2006, 30). Here is a key distinction that Benatar needs to establish the Asymmetry Argument: the absence of pain is good even if no one experiences that good while the absence of pleasure is not bad unless someone is deprived of it. Consider:
(1) the presence of pain is bad,
and that
(2) the presence of pleasure is good (Benatar 2006, 30).
However, Benatar claims that this sort of symmetry does not exist when applied to the absence of pain:
(3) the absence of pain is good, even if that good is not enjoyed by anyone,
whereas
(4) the absence of pleasure is not bad unless there is somebody for whom this absence is a deprivation (Benatar 2006, 30).
One reason for holding the asymmetry between (3) and (4) is that it enjoys great explanatory power. According to Benatar, it explains four different asymmetries better than competing alternatives. The first asymmetry it explains is probably the most obvious one: we have a strong duty not to intentionally bring into existence someone who will suffer, but we do not have a corresponding duty to bring happy persons into existence (Benatar 2006, 32). The second asymmetry is between the strangeness of citing the benefit to a potential child as the reason for bringing them into existence and the coherence of citing the harms to a potential child as the reason for not bringing them into existence (Benatar 2006, 34). The third asymmetry involves our retrospective judgments. While we can regret both bringing an individual into existence and not bringing an individual into existence, it is only possible to regret bringing an individual into existence for the sake of that individual. If that individual had not been brought into existence, they would not exist and hence nothing could be regretted for their sake (Benatar 2006, 34). The fourth asymmetry is between our judgments about distant suffering and uninhabited regions. We should rightly be sad about and regret the former, but we should not be sad or regretful that some faraway planet (or island in our own world) is uninhabited (Benatar 2006, 35).
Here is a chart Benatar uses to further explain his view (Benatar 2006, 38):
If X exists:
(1) Presence of pain (Bad)
(2) Presence of pleasure (Good)

If X never exists:
(3) Absence of pain (Good)
(4) Absence of pleasure (Not bad)
Thus, the absence of pain is good even if the best or perhaps only way to achieve it is the very absence of the person who would otherwise experience it. This asymmetry between harm and pleasure explains why it is strange to cite the benefit to a potential child as a reason for bringing them into existence, while “it is not strange to cite a potential child’s interests as a basis for avoiding bringing a child into existence” (Benatar 2006, 34). With this asymmetry established, Benatar concludes that coming into existence in our world is always a harm. In sum, “[t]he fact that one enjoys one’s life does not make one’s existence better than non-existence, because if one had not come into existence there would have been nobody to have missed the joy of leading that life and thus the absence of joy would not be bad” (Benatar 2006, 58).
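The shape of the argument can be glossed schematically, writing $x \succ y$ for 'x is better than y' (a reconstruction offered only for illustration; Benatar himself does not use this notation):

\[
\begin{aligned}
(3) \succ (1)&: \quad \text{the absence of pain is better than its presence;}\\
(2) \not\succ (4)&: \quad \text{the presence of pleasure is not better than its absence,}\\
&\qquad \text{since no one exists to be deprived of it.}
\end{aligned}
\]

Non-existence thus has a genuine advantage over existence with respect to pain, while existence has no countervailing advantage with respect to pleasure, so any life containing pain comes out worse, on balance, than never existing. Bradley's objection discussed below targets precisely the second line: if pleasure is intrinsically good, he argues, it must be better than its absence.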
b. Challenges to the Asymmetry Argument
Benatar’s Asymmetry Argument has been challenged in a number of places. Some have suggested that the distinction between a life worth starting and a life worth continuing does not hold up to scrutiny (DeGrazia 2012; Metz 2011, 241). Why think these are two distinct standards? For example, why not hold that a life worth starting just is a life that will be worth continuing? Some have argued that Benatar does not do enough to defend this distinction, which is an important one for the success of his argument. Another objection has been to challenge directly the asymmetries defended by Benatar. While Benatar suggests that an absence of pleasure is not bad unless there is an individual who is deprived of it, perhaps it is better understood as not good (Metz 2011, 242). Likewise, maybe an absence of pain is better understood as not bad (Metz 2011, 242-243). This would modify Benatar’s chart to the following:
If X exists:
(1) Presence of pain (Bad)
(2) Presence of pleasure (Good)

If X never exists:
(3) Absence of pain (Not Bad)
(4) Absence of pleasure (Not Good)
There are at least two reasons to favour this symmetry over the asymmetry posited by Benatar. First “is the fact of symmetry itself. As many physicists, mathematicians and philosophers of science have pointed out, symmetrical principles and explanations are to be preferred, ceteris paribus, to asymmetrical ones” (Metz 2011, 245). Second, the symmetry may better explain “uncontroversial judgments about the relationship between experiences such as pleasure and pain and their degree of dis/value” (Metz 2011, 245).
An alternative understanding of the four procreative asymmetries that Benatar claims are best explained by the basic asymmetry between pain and pleasure is that the four asymmetries are themselves fundamental. As such, they need not rely on a further asymmetry for their explanation (2002, 354-355). DeGrazia offers still another explanation: that “we have much stronger duties not to harm than to benefit and that this difference makes all the difference when we add the value of reproductive liberty. If so, the asymmetry about procreative duties does not favor the fundamental asymmetry between benefit and harm championed by Benatar” (DeGrazia 2010, 322).
Ben Bradley argues that Benatar’s asymmetry fails because “there is a conceptual link between goodness and betterness; but if pleasure were intrinsically good but not better than its absence, there would be no such link” (Bradley 2013, 39; see also Bradley 2010).
Elizabeth Harman claims that Benatar’s Asymmetry Argument “equivocates between impersonal goodness and goodness for a person” (2009, 780). It is true that the presence of pain is bad; it is both personally and impersonally bad. However, the absence of pain is only impersonally good since there is no person who exists to experience its absence (Harman 2009, 780). But for the asymmetry to hold, Benatar would have to show that the absence of pain is also personally good. Not all of the various rejoinders to these claims can be discussed here, but it is noteworthy that Benatar has directly responded to many criticisms of his arguments (for example, Benatar 2013).
c. The Deluded Gladness Argument
Benatar also offers a second argument in support of his anti-natalist conclusion, which can be called the Deluded Gladness Argument. The main thrust of this argument is to show that while typical life assessments are often quite positive, they are almost always mistaken. This serves as a standalone argument for the claim that we should refrain from procreating since all (or almost all) lives are quite bad. It also offers support for the Asymmetry Argument which says that if an individual’s life will contain even the slightest harm, it is impermissible to bring them into existence. This argument aims to show that in the vast majority of cases, the harms contained in human lives are far from slight. Benatar argues that “even the best lives are very bad, and therefore that being brought into existence is always a considerable harm” (2006, 61).
Most people’s self-assessments of their lives are positive. In other words, most people are glad to have been brought into existence and do not think they were seriously harmed by being brought into existence. The Deluded Gladness Argument gives Benatar’s reasons for thinking that such self-assessments are almost always the result of delusion. Benatar explains that “[t]here are a number of well-known features of human psychology that can account for the favourable assessment people usually make of their own life’s quality. It is these psychological phenomena rather than the actual quality of a life that explain (the extent of) the positive assessment” (2006, 64). The most important psychological factor is the Pollyanna Principle, which says that people are strongly inclined towards optimism in their judgments. People recall positive experiences with greater frequency and reliability than negative experiences. This means that when people look back on the past, they tend to inflate its positive aspects while minimizing its negative features. This also affects how people view the future, with a bias towards overestimating how well things will go. Subjective assessments of overall well-being also consistently overstate positive well-being (Benatar 2006, 64-65). Just consider that “most people believe that they are better off than most others or than the average person” (Benatar 2006, 66). People’s own assessments of their health do not correlate with objective assessments of it. The self-assessments of happiness of the poor are (almost) always equivalent to those offered by the rich. Educational and occupational differences tend to make insignificant differences to quality-of-life assessments too (Benatar 2006, 66-67).
Benatar claims that some of this Pollyannaism can be explained by “adaptation, accommodation, or habituation” (2006, 67). If there is a significant downturn in a person’s life, their well-being will suffer. However, people often readjust their expectations to their worse situation, and so their self-assessments do not remain low; they move back towards the original level (Benatar 2006, 66-67). Reports of subjective well-being thus track changes in well-being more accurately than they track actual levels of well-being. People also often assess their own well-being by making comparisons to others. This means that self-assessments are frequently comparisons of relative well-being rather than assessments of actual well-being (Benatar 2006, 68). Benatar further argues that on the three main theories of how well a life is going, namely hedonistic theories, desire-fulfillment theories, and objective list theories, assessments of how well one’s life is going are almost always too positive. He consistently points out that there is a distinction between the well-being that an individual ascribes to their own life and the actual well-being of that individual’s life. Benatar’s point is that these often do not align. Once we have a more accurate picture of how bad our lives really are, we should ask whether we would inflict the harms that occur in any ordinary life on a person who already exists. The answer, according to Benatar, is clearly ‘no’ (2006, 87-88). While it is possible to have a life that avoids most harms, we are not in a good epistemic position to identify whether this will apply to our own offspring. Given that the odds of avoidance are slim to begin with, procreation is akin to a rather nasty game of Russian roulette.
d. Challenges to Deluded Gladness
Regardless of the status of Benatar’s asymmetry thesis, he has also urged that our lives are far worse than we ordinarily judge them to be. If it turns out that most lives are actually not worth living, then this is a reason in itself not to procreate. But many have suggested that Benatar is mistaken about this. For instance, the fact that so many people are glad for their existence might be evidence in itself that such gladness is not deluded (DeGrazia 2012, 164). Furthermore, any plausible moral theory must be able to account for the fact that most people are glad to be alive and think that their lives are going well (DeGrazia 2012, 158).
Another objection is that the Deluded Gladness Argument fails to distinguish between higher-order pleasures and minor pains. Being tired or hungry is a harm, but one that is outweighed by more valuable goods such as loving relationships. Many of the negative features that Benatar associates with existence can be overridden in this way (Harman 2009, 783).
Alan Gewirth has comprehensively defended the concept of self-fulfillment as key to a meaningful life (1998). Although special relationships like the one between parents and children violate egalitarian norms, having a family does not violate anyone else’s human rights. This forms part of the basis for Gewirth’s claim that while “children have not themselves voluntarily participated in setting up the family, their special concern for their parents and siblings is appropriately viewed as derivative, both morally and psychologically, from the parents’ special concern both for one another and for each of their children and, in this way, for the family as a whole” (1998, 143). At least for some people, procreating and the family unit are an important part of self-fulfillment. If Gewirth is right about the value of self-fulfillment, and procreating contributes to self-fulfillment (at least for certain individuals), then these ideas constitute a reason to reject the Deluded Gladness Argument. At the very least, Gewirth’s theory of self-fulfillment needs to be considered by Benatar in addition to the hedonistic theories, desire-fulfillment theories, and objective list theories he criticizes for encouraging inaccurate self-assessments of quality of life.
There are also important questions about whether the type of self-deception that seems to be required by the Deluded Gladness Argument is even possible. For example, some theories of deception say that the deceiver knowingly and intentionally deceives another agent. But this makes it difficult to see how self-deception is possible: since deceit is intentional, the deceiver would know they are deceiving themselves. A problem arises because the claim that many people have simply deluded themselves into thinking their lives are better than they really are could plausibly be thought to be a form of self-deception. And yet on the theory of deception just described, we might wonder whether such self-deception is even possible. Connections between arguments for anti-natalism and self-deception are surely worthy of more consideration. As it stands, the literature on anti-natalism in general has not taken into account how different theories of self-deception might affect various arguments.
e. Overall’s Sexism Challenge to Benatar
Christine Overall suggests that Benatar’s arguments, even if true, could have harmful consequences for women. This is thus a moral rather than an epistemic challenge to anti-natalism. First, Overall holds that we do not have a duty to procreate because women have procreative control over their own bodies (Overall 2012, 114). Second, she objects to the idea that there are no costs associated with procreation, especially when one considers the nine-month pregnancy and delivery. Third, she worries that adopting Benatar’s views could actually lead to more female infanticide and violence towards pregnant women. One question here, then, is whether Benatar is sufficiently sensitive to the plight of women and the potential consequences his arguments might have for them. Is anti-natalism ultimately a sexist position?
Benatar responds to Overall by claiming that a right not to reproduce only exists if there is no moral duty to reproduce. This reply closely links rights with duties. He also observes that the costs women incur in procreating are not what is under dispute: the question at stake is whether it is permissible to procreate, not whether there are costs involved in procreating. He further reiterates that his arguments have to do with morality, which in this case is distinct from the law. This is why he holds that “contraception and abortion should not be legally mandatory even though contraception and early abortion are morally required” (Benatar 2019, 366). Finally, Benatar suggests that Overall has not provided specific evidence that anti-natalism would harm women. Turning this objection on its head, he claims that anti-natalism might actually be good for women: if widely adopted, there might be less of a tendency to view women primarily as child bearers and rearers (Benatar 2019, 366-367). Whether Benatar or Overall is correct about the consequences of anti-natalism for women appears to be an empirical question.
f. The Hypothetical Consent Argument
After Benatar’s work, the Hypothetical Consent Argument is probably the most discussed argument for anti-natalism in the literature. The basic idea of the argument is that procreation imposes unjustified harm on an individual to which they did not consent (Shiffrin 1999; Singh 2012; Harrison 2012). But what makes procreation an unjustified harm? For there are clearly certain cases where harming an unconsenting individual is justified. Consider the following oft-discussed case:
Rescue. A man is trapped in a mangled car that apparently will explode within minutes. You alone can help. It appears that the only way of getting him out of the car will break his arm, but there is no time to discuss the matter. You pull him free, breaking his arm, and get him to safety before the car explodes (DeGrazia 2012, 151).
It is permissible in this case to harm the man in a nontrivial way without his consent because doing so clearly prevents the greater harm of his death. We can say that in such a case you have the man’s hypothetical consent because he would (or rationally ought to) consent to the harm if he could. But now consider a different case that is also frequently discussed:
Gold manna. An eccentric millionaire who lives on an island wants to give some money to inhabitants of a nearby island who are comfortably off but not rich. For various reasons, he cannot communicate with these islanders and has only one way of giving them money: by flying in his jet and dropping heavy gold cubes, each worth $1 million, near passers-by. He knows that doing so imposes a risk of injuring one or more of the islanders, a harm he would prefer to avoid. But the only place where he can drop the cubes is very crowded, making significant (but nonlethal and impermanent) injury highly likely. Figuring that anyone who is injured is nevertheless better off for having gained $1 million, he proceeds. An inhabitant of the island suffers a broken arm in receiving her gold manna (DeGrazia 2012, 151-152).
What makes the eccentric millionaire’s actions impermissible in this case is that the benefit imposed does not involve avoiding a greater harm. This is what ethicists refer to as a pure benefit. So, the idea is that it is impermissible to confer a pure benefit on someone who has not consented to it, while it is permissible to impose a harm on someone without their consent when doing so prevents a greater harm to them. In the Rescue case there is hypothetical consent to the harm, whereas in the Gold manna case there is no such consent.
The anti-natalist urges that procreation is analogous to the Gold manna case, not the Rescue case. Procreation imposes a nontrivial and unconsented harm on the created individual for the purpose of bestowing a pure benefit. Those who would procreate, then, do not have the hypothetical consent of the individuals they create. Why is this the case? If an individual does not exist, she can be neither harmed nor benefitted. Language is misleading here because when procreation does not occur there is no ‘individual’ who does not exist. There is simply nothing. There is no person in a burning car, no people on the island, and no free-floating soul waiting to be created. Procreation always involves bestowing a pure benefit, something this argument says is impermissible.
g. Challenges to Hypothetical Consent
Connected to the counterclaim that our lives usually go well is the idea that it is sometimes permissible to bestow a pure benefit on someone without their consent. There are cases where parents are better understood as exposing their child to certain harms rather than imposing such harms on them. And even if the act of procreation is ultimately best understood as imposing harms, it may be justified in light of the pure benefit it bestows on the created individual. Parents often make their children participate in activities where the gain is only a pure benefit; the activity has nothing to do with avoiding a greater harm. Consider parents who encourage excellence in scholarship, music, or athletics (DeGrazia 2012, 156-157). If this is right, then there is also reason to reject the Hypothetical Consent Argument for anti-natalism.
h. The No Victim Argument
Gerald Harrison argues that a moral duty can only coherently exist if there is a possible victim who would be wronged by its violation. In light of this, he suggests that “we have a duty not to create the suffering contained in any prospective life, but we do not have a duty to create the pleasures contained in any prospective life” (2012, 94). It is intuitive to think that we have the following two duties: (1) a duty to prevent suffering; and (2) a duty to promote pleasure (Harrison 2012, 96). Since there would be no victim if one failed to create happy people, this nicely explains why we do not have a duty to procreate even if we are sure our offspring will have very happy lives. However, it also explains why we have a duty not to create suffering people, since if we do so there clearly are victims (that is, the suffering people who were created).
Since all lives contain suffering, there is a duty to never procreate. For in procreating, we always fail to do our duty to prevent suffering because there is an actual victim of suffering. That an individual has an on balance or overall happy life cannot outweigh the duty to not procreate because in failing to procreate there is no victim (Harrison 2012, 97-99).
According to Harrison, the duty not to procreate is therefore underpinned by two prima facie duties. First, we have a duty to prevent harm. Second, we have a “duty not to seriously affect someone else without [their] prior consent” (2012, 100). However, Harrison acknowledges that “[f]ulfilling this duty will mean that no more lives are created and this […] is a bad state of affairs, even if it is not bad for anyone” (2012, 100). The reason we do not have a duty to ensure that this state of affairs does not obtain is that doing so would involve bringing people into existence who will in fact be harmed by their existence. On the other hand, that no more lives are created does not harm anyone. Harrison further notes that though his position entails the strange claim that a person can be happy to have been brought into existence even though they are harmed by it, there is nothing incoherent in this. For example, someone could place a large bet in our name without our consent. Doing so is wrong even if the bet is won and we ultimately benefit from it (Harrison 2012, 100).
i. The Exploitation Argument
The Exploitation Argument for anti-natalism, offered by Christopher Belshaw, involves the idea that procreation fundamentally involves exploitation (Belshaw 2012). Consider that we have the intuition that we should end the lives of animals who are suffering even if there is some chance that they could be happy in the future (Belshaw 2012, 120). Suppose that there are categorical desires, which give us reasons to ensure our future existence. Further suppose that there are also conditional desires, which, assuming we have a future, give us reason to prefer that one state of affairs obtain rather than another (Belshaw 2012, 121). Belshaw goes on to suggest that while a baby is a human animal, it is necessarily not a person in a more robust sense. This is because babies are not psychologically knit together, nor do they have categorical or conditional desires (Belshaw 2012, 124). Likewise, there is no continuity between a baby and the adult it becomes; it is implausible to think these are the same person. For a baby:
[H]as no developed notion of itself, or of time, no desire to live on into the future, no ability to think about the pain and decide to endure it. Further, if we think seriously about a baby’s life we’ll probably agree it experiences pain in more than trivial amounts. Even perfectly healthy babies come into the world screaming, cry a lot, suffer colic and teething pains, keep people awake at night. None of us can remember anything about it (Belshaw 2012, 124).
An important claim of the Exploitation Argument is that such a life is not worth living. Even if it is only through a baby that a person can be brought into existence, this does not compensate the baby for the harm it experiences (Belshaw 2012, 124). This means that we must exploit babies in order for there to be humans. I might be glad that there was a baby who was exploited in order for me to come to exist, but it would still have been better for that baby had it never been born. In procreating, “we are inevitably free-riding on the several misfortunes of small, helpless and shortlived creatures” (Belshaw 2012, 126).
j. Negative Utilitarianism
Two well-known consequentialist ethical theories are act-utilitarianism and rule-utilitarianism. The former evaluates the permissibility of individual actions based on their effects, while the latter seeks a set of rules which, if followed, will maximize positive effects; this strategy involves categorizing different types of action. It has been observed that a different type of utilitarianism, negative utilitarianism, entails anti-natalism (for example, Belshaw 2012; Metz 2011). On this moral theory, the only salient aspect of morality is avoiding pain. When assessing whether a particular action is permissible (or what set of rules to follow), we should only ask whether the effects of that action will be painful. Obtaining pleasure (in any sense) simply does not factor into moral reasoning on this view.
Since every life contains at least some pain, it is best to avoid that pain by simply not starting the life in the first place. According to negative utilitarianism, no amount of pleasure could outweigh even the smallest degree of pain, since pleasure does not count for anything morally. While the connection between negative utilitarianism and anti-natalism has been identified, anti-natalists have hardly been eager to adopt it as an argumentative strategy. Not only is negative utilitarianism a highly controversial moral theory in itself, but it seems to entail pro-mortalism, the view that people should end their lives; this is, after all, seemingly the best way to avoid any future pains. Because many anti-natalists have gone to great lengths to show that their view does not in fact entail pro-mortalism, they largely avoid appealing to negative utilitarianism.
k. Broader Implications
It is important to note that there is a difference between offering theoretical arguments for a particular conclusion and enforcing policies which ensure that conclusion is enacted. The philosophical debate about anti-natalism is almost entirely theoretical. Many authors defending anti-natalism seem well aware that there are strong prudential and moral reasons not to force anti-natalist policies on people. Likewise, though they think anti-natalism is true, there is a general recognition that it will not be widely adopted in practice.
2. Additional Objections to Philanthropic Arguments
a. Procreative Autonomy
One reason that has been offered for rejecting anti-natalist conclusions in general is that procreative autonomy is more important than the considerations anti-natalists raise (for example, Robertson 1994). Procreative autonomy is important because procreation is often central to an individual’s identity, dignity, and life’s meaning. In other words:
Whether or not to have children, when and by what means to have them, how many to have, and similar choices are extremely important to us. These decisions greatly affect our narrative identity—for example, whether or not we become parents, what sort of family we have—and much of the shape of our lives. Few decisions seem as personal and far-reaching as reproductive decisions. That procreative freedom is very important seems too obvious to require further defense (DeGrazia 2012, 155).
Those who are attracted to this type of response could admit that anti-natalists get at important truths about procreation, but simply maintain that procreative autonomy is more important.
b. Pro-Mortalism
Another objection that sometimes gets levelled against anti-natalism is that it entails pro-mortalism, the view that individuals ought to end their lives. As noted above, this is probably one reason why anti-natalists have avoided tying their views to negative utilitarianism. However, it seems doubtful that any of the main arguments for anti-natalism entail pro-mortalism. With respect to Benatar’s work, he consistently states that even though lives are not worth creating, most are worth continuing. The same can be said of the Hypothetical Consent Argument. Once an individual has received the pure benefit of existence, realizing this fact does not imply they should commit suicide, just as the islander whose arm is broken by the gold manna ought not to end her life. The No Victim Argument neatly avoids this worry because once one comes into existence there is an actual victim if one fails to promote one’s own pleasures, so there is a duty to promote one’s own pleasure; presumably, for most people and throughout most of their lives, suicide would not fulfill this duty. Finally, the Exploitation Argument also avoids the objection, for on this argument most adult human lives are indeed worth continuing; the problem is rather the exploitation of the babies needed to get such lives in the first place. Benatar adds that even though he holds that most lives are going poorly, this does not entail that we should commit suicide. We typically each have interests in continuing to live, and our lives would have to be worse than death, which is extremely bad, in order for suicide to be justified. This will only rarely be the case (Benatar 2013, 148).
3. The Misanthropic Argument
The philanthropic arguments which were discussed in the previous section conclude that because of the harm done to the created individual, it is always all things considered wrong to procreate. This section explains what is known as the misanthropic argument for anti-natalism. Unlike the philanthropic arguments, this argument focuses on the harm caused by the individuals who are created. The conclusion of this argument is slightly weaker, claiming that procreation is almost always impermissible, or only impermissible given the current situation in which procreation occurs.
Though Benatar is best known for offering a philanthropic argument for anti-natalism, he has also developed a distinct misanthropic argument. He speculates that misanthropic arguments are even more likely to be met with scorn than philanthropic arguments, because while the latter are in some sense about protecting individuals, the former focus on the bad aspects of humanity (Benatar 2015, 35). Whether Benatar is right about this remains an open question, as most of the anti-natalist literature tends to focus on the philanthropic arguments.
Benatar’s basic misanthropic argument for anti-natalism runs as follows:
(1) We have a (presumptive) duty to desist from bringing into existence new members of species that cause (and will likely continue to cause) vast amounts of pain, suffering and death.
(2) Humans cause vast amounts of pain, suffering and death.
Therefore,
(3) We have a (presumptive) duty to desist from bringing new humans into existence (Benatar 2015, 35).
a. Premise 2 of the Misanthropic Argument
Premise 2 is the one in most need of defense. To defend it, Benatar appeals to humans’ generally poor impulse control, their destructiveness towards one another, the suffering they cause other animals, and the damage they do to the environment.
i. Harm to Humans
Regarding humanity’s generally poor impulse control, Benatar is quick to observe that the vast majority of human achievements are not possible for most humans. We therefore should not judge the human species in general based on the performance of exceptional people. In fact, it is now well-documented that humans exhibit numerous cognitive biases which cause us to both think and act irrationally (Benatar 2015, 36). Consider that:
For all the thinking that we do we are actually an amazingly stupid species. There is much evidence of this stupidity. It is to be found in those who start smoking cigarettes (despite all that is known about their dangers and their addictive content) and in the excessive consumption of alcohol—especially in those who drive while under its influence. It is to be found in the achievements of the advertising industry, which bear ample testament to the credulity of humanity (Benatar 2015, 36).
These cognitive failings often cause humans to harm each other. We exhibit an extreme tendency toward conformity and following authority, even when doing so leads us to hurt each other (Benatar 2015, 37).
Even if one contends that our intelligence compensates for these moral deficiencies, it is difficult to defend this claim in light of human destructiveness:
Many hundreds of millions have been murdered in mass killings. In the twentieth century, the genocides include those against the Herero in German South-West Africa; the Armenians in Turkey; the Jews, Roma, and Sinti in Germany and Nazi-occupied Europe; the Tutsis in Rwanda; and Bosnian Muslims in the former Yugoslavia. Other twentieth-century mass killings were those perpetrated by Mao Zedong, Joseph Stalin, and Pol Pot and his Khmer Rouge. But these mass killings were by no means the first. Genghis Khan, for example, was responsible for killing 11.1% of all human inhabitants of earth during his reign in the thirteenth century […] The gargantuan numbers should not obscure the gruesome details of how these deaths were inflicted and the sorts of suffering the victims endured on their way to death. Humans kill other humans by hacking, knifing, hanging, bludgeoning, decapitating, shooting, starving, freezing, suffocating, drowning, crushing, gassing, poisoning, and bombing them (Benatar 2015, 39).
Humans do not just murder each other; they also “rape, assault, flog, maim, brand, kidnap, enslave, torture, and torment other humans” (Benatar 2015, 40). Though these are the worst harms, humans also frequently “lie, steal, cheat, speak hurtfully, break confidences and promises, violate privacy, and act ungratefully, inconsiderately, duplicitously, impatiently, and unfaithfully” (Benatar 2015, 43). Even when justice is sought, it is hardly ever achieved. Many of the most evil leaders in human history ruled for the course of their natural lives, while others had peaceful retirements or were only exiled (Benatar 2015, 43). In sum, “‘Bad guys’ regularly ‘finish first’. They lack the scruples that provide an inner restraint, and the external restraints are either absent or inadequate” (Benatar 2015, 43).
ii. Harm to Animals
The amount of suffering that humans inflict on animals each year is hard to fathom. Given that the vast majority of humans are not vegetarians or vegans, most of them are complicit in this suffering. Consider that “[o]ver 63 billion sheep, pigs, cattle, horses, goats, camels, buffalo, rabbits, chickens, ducks, geese, turkeys, and other such animals are slaughtered every year for human consumption. In addition, approximately 103.6 billion aquatic animals are killed for human consumption and non-food uses” (Benatar 2015, 44). These numbers exclude the hundreds of millions of male chicks killed every year because they cannot produce eggs, as well as the millions of dogs and cats that are eaten in Asia every year (Benatar 2015, 44). Each year roughly 5 billion more sea animals die as bycatch, caught in nets but not wanted. Finally, at least 115 million animals are experimented on each year (Benatar 2015, 45). Furthermore, “[t]he deaths of the overwhelming majority of these animals are painful and stressful” (Benatar 2015, 44). The average meat eater will consume at least 1690 animals in their lifetime (a rather low estimate), which amounts to an extremely large amount of harm (Benatar 2015, 54-55).
iii. Harm to the Environment
Humans are also incredibly destructive to the environment. The human population is growing exponentially, and the negative environmental effects per person continue to increase too. This is partly due to industrialization and a steady growth in per capita consumption (Benatar 2015, 48). As a result:
The consequences include unprecedented levels of pollution. Filth is spewed in massive quantities into the air, rivers, lakes, and oceans, with obvious effects on those humans and animals who breathe the air, live in or near the water, or who get their water from those sources. The carbon dioxide emissions are having a ‘greenhouse effect,’ leading to global warming. As a result, the icecaps are melting, water levels are rising, and climate patterns are changing. The melting icecaps are depriving some animals of their natural habitat. The rising sea levels endanger coastal communities and threaten to engulf small, low-lying island states, such as Nauru, Tuvalu, and the Maldives. Such an outcome would be an obvious harm to their citizens and other inhabitants. The depletion of the ozone layer is exposing earth’s inhabitants to greater levels of ultraviolet light. Humans are encroaching on the wild, leading to animal (and plant) extinctions. The destruction of the rain forests exacerbates the global warming problem by removing the trees that would help counter the increasing levels of carbon dioxide (Benatar 2015, 48).
Per capita CO2 emissions per year are massive. While they are lower in developing countries, those countries tend to have much higher birthrates than their wealthier counterparts. As the population increases, adding more humans will invariably harm the environment.
b. Premise 1 of the Misanthropic Argument
Notice that premise 1 of this argument does not claim that we should kill members of a dangerous species or stop that dangerous species from procreating. Instead, it merely says “that one should oneself desist from bringing such beings into existence” (Benatar 2015, 49). For this premise to be true it also does not have to be the case that every single member of the species is dangerous. The likelihood that a new member of the species will cause significant harm is enough to make procreation too dangerous to be justified. Also notice that this premise is silent on the species in question. It would be easily accepted if it were about some non-human species: “Imagine, for example, that some people bred a species of non-human animal that was as destructive (to humans and other animals) as humans actually are. There would be widespread condemnation of those who bred these animals” (Benatar 2015, 49).
c. The Presumptive Duty Not to Procreate
Presumptive duties are defeasible. The duty only holds if there are no good reasons to do otherwise (Benatar 2015, 51). One possible way of avoiding the misanthropic argument is to counter that the good that humans do is pervasive enough to often defeat this presumptive duty. If this is right, then procreation will often be permissible (Benatar 2015, 51). However, in light of the vast harms that humans do, meeting the burden of proof regarding the amount of counteracting good that humans do is going to be extremely difficult. Remember that the benefits here do not just have to counter the harms to other humans, but also the harm done to billions of animals every year and to the environment more generally (Benatar 2015, 52). We would also need a clear understanding of what constitutes good outweighing bad. Does saving two lives outweigh the bad of one violent murder? Benatar is doubtful, claiming the number of lives needing to be saved to outweigh the bad is much higher than two (2015, 52). Likewise, offering benefits to future generations cannot count as part of the good that outweighs the bad, because such humans would not exist if the presumptive duty were followed in the first place. Under the current conditions of the world, new humans add more additional harms than they do offsetting benefits (Benatar 2015, 54). Finally, a more modest response is the assertion that the presumptive duty not to procreate can occasionally be defeated. Perhaps the children of a particular individual would do enough offsetting good to justify creating them (Benatar 2015, 54). While this scenario is certainly possible, it is doubtful that those considering procreating will be in a good position to know this about their future offspring.
4. Anti-Natalism and Duties to the Poor
Thus far, very little has been said about how our duties to the poor are impacted by anti-natalism. However, there are important connections between duties to the poor and procreative ethics. Consider the following scenario: Suppose that you are walking to work in the morning. You find yourself walking by a pond and observe a drowning child. If you stop to help the child, you will probably ruin your nice new clothes and also be late for work. There is no tangible risk of you drowning since the pond is not very deep for an adult. There is also no one else around. If you do not help the child now, then they will almost certainly drown. What should you do? This example is modified from Peter Singer, a famous utilitarian who is well-known for defending the idea that we have rather strong obligations to help the poor, particularly those in developing countries. Singer thinks it is obvious what you should do in this case: you should stop to help the drowning child. The value of the child’s life is worth much more than what it costs for you to help them, namely, your new clothes and having to explain to your boss why you were late. The next step is to draw an analogy between the child in the pond and the badly off in developing nations. In fact, Singer suggests that people in wealthier countries are in the very position of walking past ponds with drowning children every day. The people we could help are just a bit farther away, and we do not see them directly in front of us. But this is not a morally significant difference. So, those of us in wealthier countries need to devote a lot more of our resources to helping those in developing countries who are suffering.
Benatar is so far the only one to make connections between Singer’s argument and anti-natalism. Here is his interpretation of Singer’s argument:
Singer’s Poverty Relief argument on our duty to the poor
(1) If we can prevent something bad from happening, without sacrificing anything of comparable moral importance, we ought to do it.
(2) Extreme poverty is bad.
(3) There is some extreme poverty we can prevent without sacrificing anything of comparable moral importance.
Therefore,
(4) We ought to prevent some extreme poverty (Benatar 2020, 416).
This argument nicely avoids disagreement between utilitarians and non-utilitarians because both sides will agree that we should prevent bad things from happening even if there is some disagreement about how to measure bad or just what constitutes a comparable moral sacrifice. Benatar observes that Singer’s argument has clear implications for procreative ethics. We must either accept those implications or give up Singer’s conclusion.
What does Singer’s argument imply about procreative ethics? The first implication has to do with the opportunity costs of having children. His argument “implies that, at least for now, the relatively affluent ought to desist from having children because they could use the resources that would be needed to raise resulting children to prevent extreme poverty” (Benatar 2020, 417). In wealthy countries it costs anywhere from two hundred thousand to three hundred thousand dollars to raise a child from birth to the age of eighteen. Having children should be forgone by the wealthy so that they can spend this money on alleviating extreme poverty. The argument even implies that adoption may be impermissible for the wealthy if their resources are still more effectively spent elsewhere (Benatar 2020, 418).
Another implication of Singer’s argument has to do with natality costs. It implies that many people, especially the poor, should refrain from procreating because of the bad things that will inevitably happen to those they bring into existence (Benatar 2020, 420). Sacrificing procreation is not of comparable moral importance since no one is harmed by not being brought into existence. So, it is more important that the poor refrain from procreating to prevent their children from experiencing extreme poverty. Benatar suggests that if Singer is right, we might even have a duty to prevent the poor from procreating, though this would be the prevention of a bad thing, not relief from it. Of course, Benatar is well aware that this conclusion is unlikely to be met with approval by many. According to Benatar at least, no one would be harmed if people refrained from procreating for these reasons, and much suffering would be prevented. While these ideas are far from uncontroversial, and likely even to cause offense to some, it is clear that more work needs to be done exploring how our duties to the poor are connected to anti-natalism.
5. Future Directions
This section identifies three potential areas for future research into anti-natalism. The first regards a lack of direct interaction between religious perspectives and arguments for anti-natalism. The second involves the need for more interaction between anti-natalism and non-Western approaches to ethics. The third is about the surprising dearth of engagement with anti-natalism in the philosophical literature on population ethics.
a. Religious Perspectives on Anti-Natalism
Perhaps more than any other group, religious believers cringe when they hear defenses of anti-natalism. In the classical monotheistic tradition, for example, existence is held to be intrinsically good. This has played out in the prizing of the nuclear family. It might seem rather unsurprising, then, that religious thinkers have had little to say about the anti-natalist debate. However, the rejection of anti-natalism out of hand by the religious believer might turn out to be short-lived. First, theists who are committed to the claim that existence is intrinsically good are committed to the further claim that there are literally no lives not worth continuing. Some might find this conclusion difficult to accept. For even if it were to apply to all actual lives, it is easy to think of possible lives that are so awful they are not worth continuing. Second, in holding that existence is intrinsically good, theists are under pressure to explain why they are not obligated to procreate as much as possible. They need to explain this because an obligation to procreate as much as possible is absurd.
If theists can coherently explain why existence is intrinsically good while avoiding the problematic results just mentioned, they may well have an answer to philanthropic arguments for anti-natalism. They can acknowledge that procreation is a weighty moral decision that ought to be taken more seriously than prospective couples often take it. They can even concede that certain cases of procreation probably are impermissible. However, if procreating really is to bring about an individual whose existence is intrinsically valuable, then many instances of procreation will indeed be permissible. Yet this does not necessarily let the theist off the hook when it comes to the misanthropic arguments for anti-natalism. The harm that most human lives will do seems hard to deny. One possibility for the theist is to say that this type of concern reduces to the problem of evil. Therefore, solutions to the problem of evil can be used as resources to show why procreation is permissible even in light of the harm humans do. But there are many questions for such a strategy. It is one thing to say that once humans are brought into existence God allows them to commit a great deal of evil because of the value of morally significant freedom, to name just one theistic response to evil. However, it is another to say that such solutions justify the act of bringing humans into existence who do not already exist.
Another underexplored connection between a theistic perspective and misanthropic arguments for anti-natalism regards humanity’s treatment of the environment. In the Judeo-Christian tradition, for example, the planet is a gift from God to humans. We are supposed to cherish, protect, and look after the environment and the non-human animals that it contains. Clearly, just the opposite has happened. In light of the fact that population increase is directly tied to the climate crisis, might those in religious traditions who hold that the planet is a gift be obligated to cease procreating? These and related ideas are at least worthy of exploration by scholars of religion.
b. Anti-Natalism and Non-Western Ethics
The philosophical literature on anti-natalism is dominated by those working in Western philosophy. It is worth briefly considering ways in which debates about procreative ethics could be advanced by including non-Western ethical perspectives, African philosophy, for example. This literature emerged (professionally) in the 1960s, with the end of colonization and the rise of literacy rates on the African continent. There are three main branches of African ethics. First, African thinkers distinguish the normative conception of personhood from the metaphysical or biological conceptions of the person (Menkiti 1984). On this understanding of ethics, the most central feature of morality is for individuals to develop their personhood. This is typically done by exercising other-regarding virtues and hence can only be accomplished within the context of community. On this view, personhood is a success term, such that one could fail to be a person (in the normative sense) (Ikuenobe 2006; Molefe 2019). Second, harmony has been postulated as the most important aspect of morality in indigenous African societies. Harmony is about establishing a balance or equilibrium amongst humans, and between humans and all else, including the natural world. Disrupting the harmony of the community is one of the worst things an individual can do. That personhood and harmony are both understood within the context of relating to others shows why, in part, community is of supreme importance in the African tradition (Metz forthcoming; Paris 1995; Ramose 1999). Third, vitalist approaches to morality say that everything, both animate and inanimate, is imbued with life force, a kind of imperceptible energy. On this approach, the goal of morality is to increase life force in oneself and others. Procreation is valuable because it creates a being with life force (Magesa 1997).
On African personhood accounts of morality, an individual can only develop and exercise moral virtue in the context of the community, traditionally including not merely present generations but also future ones, often called the ‘not-yet-born’. To deny the importance of the continuance of the community through procreation seems to fly in the face of such an ethic. Likewise, in African relational ethics, harmony amongst individuals is of the utmost importance. Again, the continuance of the community through procreation appears vital to the existence and promotion of harmony in the community. In other words, there can be no personhood or harmony without a community of persons. Given its importance, the community ought to continue via procreation. Finally, on vitality accounts, anti-natalism appears to deny the importance of creating beings with life force. There is thus a rather apparent tension between anti-natalism and African communitarian ethics.
However, despite initial appearances, it could be argued that misanthropic arguments for anti-natalism are not in conflict with African ethics. Furthermore, it is plausible that philanthropic arguments are consistent with at least some lines of thought in African ethics. While tensions may remain between the two views, much more exploration of a possible synthesis between anti-natalism and African ethics is needed. There are at least five possible reasons why these two views might be consistent with each other (and in some cases mutually supportive). First, African ethics emphasizes doing no harm to the community. Procreation right now harms many communities, given that creating more people means making valuable resources even more scarce, for example. Second, procreation harms the global community and environment. An important line in African thought is that humans should strive to be in harmony with the natural environment, not just with each other. Until we find ways to greatly reduce our carbon footprints, procreating harms the environment and thereby produces disharmony. Third, even if one follows strong philanthropic versions of anti-natalism, which do not rely on resource-distribution or environmental considerations, there would still be a human community for many years (the next 80 or so) even if everyone refrained from procreating, right up until the very last person existed. The opportunity to develop one’s personhood and seek harmonious relationships would remain. Fourth, not procreating arguably allows one to better contribute to the common good because one has more time, energy, money, and other resources available that are not spent on one’s children. Again, this remains so on the African understanding because developing personhood and harmonious relationships are viewed as essential parts of the common good. Fifth, adoption is a viable alternative for satisfying the interests of the community. This raises interesting questions about whether it is creating a child itself, rather than merely rearing one, that is meaningful, morally significant, or otherwise of importance to the African tradition.
c. Anti-Natalism and Population Ethics
Anti-natalism appears to have relatively underappreciated connections to topics in population and environmental ethics. This is surprising particularly regarding the misanthropic arguments, which focus, in part, on the harm that humans do to the environment. Trevor Hedberg (2020) explains that after a long silence beginning in the 1980s and 1990s, only very recently have theorists begun to explicitly discuss the population problem in connection with the environment. The continued growth of the planet’s population is a fact. It took all of human history until 1804 to reach one billion people. A century ago, the world had approximately 1.8 billion people. The current population, however, sits at approximately 7.8 billion people (Hedberg 2020, 3). Hedberg contends that “population is a serious contributor to our environmental problems, we are morally obligated to pursue the swift deceleration of population growth, and there are morally permissible means of achieving this outcome—means that avoid the coercive measures employed in the past” (Hedberg 2020, 3). Indeed, such past coercive measures are probably part of the reason that many environmental organizations and governments which claim to care deeply about the climate crisis virtually never mention population. Yet it does not matter if humans become more efficient and individually have a less bad impact on the environment if such improvements are outpaced by population growth, and so far any improvements in individual impact have been greatly outpaced by population growth. On the very plausible (if not obvious) assumptions that (1) climate change poses a significant and existential threat to the human species and (2) population growth contributes to climate change, environmental ethicists need to start contending with misanthropic arguments for anti-natalism. Remember that these arguments leave open the possibility (however small) that humans may not cause so much damage in the future, in which case it might not be impermissible to bring more humans into existence.
d. Human Extinction as the Goal of Anti-Natalism
At the beginning of Better Never to Have Been: The Harm of Coming into Existence, Benatar acknowledges that his work will likely have no (or almost no) impact on people’s procreative choices. He writes:
Given the deep resistance to the views I shall be defending, I have no expectation that this book or its arguments will have any impact on baby-making. Procreation will continue undeterred, causing a vast amount of harm. I have written this book, then, not under the illusion that it will make (much) difference to the number of people there will be but rather from the opinion that what I have to say needs to be said whether or not it is accepted (2006, vii).
But what would happen if everyone accepted Benatar’s arguments and put them into practice? The current generation of humans (that is, everyone alive right now), would be the last generation of humans. Benatar’s recommendation is that the human species should voluntarily opt to go extinct. Yet if this course of action were followed, life would very likely be quite difficult for the last few remaining humans. An anti-natalist policy, then, would actually increase the harm experienced by at least some humans. Deciding if this is worth it may well come down to whether we agree that the suffering that will inevitably occur by continuing to propagate the human species outweighs the harm done to those whose lives would be made quite difficult by being part of the last few remaining humans. Benatar and those who agree with him appear to believe that the harm of continuing to bring more persons into existence drastically outweighs the harm done to the last remaining humans. Anti-natalists are well aware that they are recommending human extinction.
6. References and Further Reading
Belshaw, Christopher. 2012. “A New Argument for Anti-Natalism.” South African Journal of Philosophy 31(1): 117-127.
Defends the exploitation argument.
Benatar, David. 2006. Better Never to Have Been: The Harm of Coming into Existence. Oxford: Oxford University Press.
Seminal work containing an extensive defense of the Asymmetry and Deluded Gladness Arguments for anti-natalism.
Benatar, David. 2013. “Still Better Never to Have Been: A Reply to (More of) My Critics.” Journal of Ethics 17: 121-151.
Replies to Bradley, DeGrazia, and Harman, among others.
Benatar, David. 2015. “The Misanthropic Argument for Anti-natalism.” In Permissible Progeny?: The Morality of Procreation and Parenting. Sarah Hannan, Samantha Brennan, and Richard Vernon (eds.). Oxford: Oxford University Press. pp. 34-59.
Defends anti-natalism on the basis of the harm that those who are created will cause.
Benatar, David. 2020. “Famine, Affluence, and Procreation: Peter Singer and Anti-Natalism Lite.” Ethical Theory and Moral Practice 23: 415-431.
Connects anti-natalism to Peter Singer’s claims about duties to the poor.
Bradley, Ben. 2010. “Benatar and the Logic of Betterness.” Journal of Ethics & Social Philosophy.
Critical notice on Benatar 2006.
Bradley, Ben. 2013. “Asymmetries in Benefiting, Harming and Creating.” Journal of Ethics 17:37-49.
Discussion of the asymmetries between pain and pleasure.
DeGrazia, David. 2010. “Is It Wrong to Impose the Harms of Human Life? A Reply to Benatar.” Theoretical Medicine and Bioethics 31: 317-331.
Reply to Benatar 2006.
DeGrazia, David. 2012. Creation Ethics: Reproduction, Genetics, and Quality of Life. Oxford: Oxford University Press.
Chapter 5 critically discusses Benatar 2006.
Gewirth, Alan. 1998. Self-Fulfillment. Princeton University Press.
Defends a theory of self-fulfillment that includes the goods of family and children.
Harman, Elizabeth. 2009. “Critical Study of David Benatar. Better Never To Have Been: The Harm of Coming into Existence (Oxford: Oxford University Press, 2006).” Noûs 43 (4): 776-785.
Critical notice on Benatar 2006.
Harrison, Gerald. 2012. “Antinatalism, Asymmetry, and an Ethic of Prima Facie Duties.” South African Journal of Philosophy 31 (1): 94-103.
Statement of the No Victim Argument.
Hedberg, Trevor. 2020. The Environmental Impact of Overpopulation. New York: Routledge.
One of the few places to discuss the impact of population on the environment, while also making explicit connections to anti-natalism.
Ikuenobe, Polycarp. 2006. Philosophical Perspectives on Communitarianism and Morality in African Traditions. Lexington Books.
An in-depth analysis of the African conception of personhood.
Magesa, Laurenti. 1997. African Religion: The Moral Traditions of Abundant Life. Maryknoll, NY: Orbis Books.
Claims that life force is the most important feature of African ethics.
Menkiti, Ifeanyi. 1984. “Person and Community in African Traditional Thought.” In African Philosophy: An Introduction. 3rd ed. Richard A. Wright (ed.). Lanham, MD: University Press of America. pp. 171-181.
The most influential statement of the African conception of personhood.
Metz, Thaddeus. forthcoming. A Relational Moral Theory: African Ethics in and beyond the Continent. Oxford: Oxford University Press.
Offers a new ethical theory heavily influenced by African conceptions of morality focused on promoting harmony.
Molefe, Motsamai. 2019. An African Philosophy of Personhood, Morality, and Politics. Palgrave Macmillan.
Applies the African conception of personhood to various issues in political philosophy.
Overall, Christine. 2012. Why Have Children? The Ethical Debate. MIT Press.
Chapter 6 criticizes Benatar 2006.
Parfit, Derek. 1984. Reasons and Persons. Oxford: Clarendon Press.
Contains the most influential discussion of the non-identity problem.
Paris, Peter J. 1995. The Spirituality of African Peoples: The Search for a Common Moral Discourse. Fortress Press.
Emphasizes promoting life force as salient to African ethics.
Ramose, Mogobe. 1999. African Philosophy through Ubuntu. Harare: Mond Books.
An African-based ethic focused on harmony.
Robertson, John. 1994. Children of Choice. Princeton University Press.
An early defense of procreative freedom.
Shiffrin, Seana. 1999. “Wrongful Life, Procreative Responsibility, and the Significance of Harm.” Legal Theory 5: 117-148.
An influential account focusing on issues of consent and harm in procreation.
Singh, Asheel. 2012. “Furthering the Case for Anti-Natalism: Seana Shiffrin and the Limits of Permissible Harm.” South African Journal of Philosophy 31 (1): 104-116.
Develops considerations from Shiffrin 1999 into an explicit argument for anti-natalism.
African Philosophical Perspectives on the Meaning of Life
The question of life’s meaning is a perennial one. It can be claimed that all other questions, whether philosophical, scientific, or religious, are attempts to offer some glimpse into the meaning (in this sense, the purpose) of human existence. In philosophical circles, the question of life’s meaning has been given some intense attention, from the works of Qoheleth, the supposed writer of the Biblical book Ecclesiastes, to the works of pessimists such as Schopenhauer, down to the philosophies of existentialist thinkers, especially Albert Camus and Søren Kierkegaard, and to twenty-first century thinkers on the topic such as John Cottingham and Thaddeus Metz. African scholars are not left out, and this article provides a brief overview of some of the major theories of meaning that African scholars have proposed. This is done by tying together ideas from African philosophical literature in a bid to present a brief systematic summary of African views about meaningfulness. From these ideas, one can identify seven theories of meaning in African philosophical literature: the African God-purpose theory of meaning, the vital force theory of meaning, the communal normative theory of meaning, the love theory of meaning, the (Yoruba) cultural cluster theory of meaning, the personhood-based theory of meaningfulness, and the conversational theory of meaning. Examining all these begins by explaining the meaning of “meaning” and the distinction between meaning in life and meaning of life.
What is meant by the terms meaning and meaningfulness? The concept of meaning is what all competing conceptions of meaning are conceptions of. In the literature, meaning is thought of in terms of purpose (received or determined teleological ends that are worth pursuing for their own sake), transcendence (things beyond our animal nature), normative reasons for actions, and so forth. These singular or monistic views, while interesting, also have their flaws: they barely capture all and only the competing intuitions about meaning. This is why Thaddeus Metz proposed a pluralistic account of meaning, on which meaning consists of:
roughly, a cluster of ideas that overlap with one another. To ask about meaning … is to pose questions such as: which ends, besides one’s own pleasure as such are most worth pursuing for their own sake; how to transcend one’s animal nature; and what in life merits great esteem or admiration (Metz, 2013, p. 34).
It is easy to agree with Metz’s family resemblance theory, since the pluralism he employs allows one to accommodate most theories or conceptions of meaning while rejecting peripheral ideas (like pleasure and happiness) that do not, on their own, possess the quality of ‘final value’ needed in the usual understanding of what meaning entails. But while this is so, subjective accounts of meaning appear missing from Metz’s concept. In addition, what about the meaning of life? These questions have led Attoe (2021) to add two extra variables to Metz’s family of values. The first is the subjective pursuit of those ends that an individual finds worth pursuing (insofar as the individual considers those things/values as ends in themselves). A cursory glance at Metz’s approach shows that although it is tenable, it is more objectivist than it is all-encompassing. By inserting subjectivity into the approach, subjectivist views about meaning are immediately accommodated, views that are often found to be incompatible with objectivist ideas about meaning (see Nagel, 1979, p. 16). The second variable that Attoe (2021) proposes is coherence. By coherence, he means the identification of a narrative that ties together an individual’s life (perhaps those moments of meaningfulness or those actions that dot her life) such that the individual can adjudge her whole life as meaningful. This feature bears more on ideas about the meaning of life.
With this in mind, it is expedient to make a further distinction between meaning in life and the meaning of life, as these concepts mean two different things and shall be used in different ways later in this article. By meaning in life, what is meant are those instances of meaning that may dot an individual’s life. Thus, a marriage to a loved one or the successful completion of a degree may subsist as a meaningful act or a moment of meaningfulness. With regard to the meaning of life, there are some, like Richard Taylor, who describe it as involving the meaning of existence, especially with regard to cosmic life, biological life, or human life/existence specifically. One can also use the term ‘meaning of’ in a narrower sense, where it delineates a judgment on what makes the life of a human person, considered as a whole (especially within the confines of an individual’s lifetime), meaningful (Attoe, 2021). This understanding is similar to Nagel’s understanding of the meaning of life. The distinction is important to note because instances of meaning do not always pre-judge the meaning of the entire life of an individual. So, an individual obtaining a degree can be a moment of meaningfulness (meaning in), but it would be hard to consider that individual’s life as a whole to be ultimately meaningful (meaning of) if that individual spent his time killing others for no reason, despite gaining a degree or marrying a loved one.
2. African Philosophy and the Meaning of Life
With these distinctions in place, the following section delves into what could be considered African conceptions of the meaning of life. To fully understand the ideas that shall be put forth, a short detour is necessary to describe the metaphysics undergirding African thought, as this avails the reader the proper lenses with which to see the African view(s). Those new to African metaphysics might imagine that any talk of it is predicated on unsophisticated talk about fantastic religious myths and, perhaps, some witchcraft or voodoo. In fact, African metaphysics involves something deeper, and it is this metaphysics that usually guides the traditional African worldview.
African metaphysics is grounded in an interesting version of empiricism that allows a monistic-cum-harmonious relationship between the material and spiritual aspects of reality. It is empirical because most African thinkers are willing to grant the possibility of both material and spiritual manifestations in everyday life. Indeed, it is because one can point to certain acts as manifestations of spiritual realities that metaphysicians of this type are quick to pronounce the existence of those realities and to recognise said acts as spiritual. In other words, knowledge of the spiritual develops from certain manifestations that are verifiable by the senses. That something is described as spiritual does not diminish its empirical worth, since empiricism holds that knowledge – of whatever type – is gotten from experience.
Unlike much of Western metaphysics, which, as Innocent Asouzu states, is inundated with all sorts of bifurcations, disjunctions and essentialisations, African metaphysics considers the fragmentations we see in reality as evidence of a harmonious, complementary relationship between and among realities. There is a tacit acknowledgment of the interplay between matter and spirit, or between realities from any and all spectrums, such that each facet of reality is seen as equally important and the supposedly artificial divide between material and spiritual objects as non-existent. Ifeanyi Menkiti captures this idea:
[T]he looseness or ambiguity regarding what constitutes the domain of the physical, and what the domain of the mental, does not necessarily stem from a kind of an ingrown limitation of the village mind, a crudeness or ignorance, unschooled, regarding the necessity of properly differentiating things, one from the other, but is rather an attitude that is well considered given the ambiguous nature of the physical universe, especially that part of it which is the domain of sentient biological organisms, within which include persons described as constituted by their bodies, their minds, and whatever else the post-Cartesian elucidators believe persons are made of or can ultimately be reduced to. My view on the matter is that the looseness or ambiguity in question is not necessarily a sign of indifference to applicable distinctions demanded by an epistemology, but is itself an epistemic stance, namely: do not make distinctions when the situation does not call for the distinctions that you make. (Menkiti, 2004b, pp. 124-125)
This picture trickles down into the African socio-ethical space where, for the most part, achieving the common good or attaining one’s humanity generally involves communal living or a deep form of mutual coexistence – one that has a metaphysical backing. It is no wonder, then, that Africa is known for, and has provided the world with, a series of philosophies that reflect harmonious co-existence – from Ubuntu (Ramose, 1999; Metz, 2017) to Ukama (Murove, 2007) to Ibuanyidanda philosophy (Asouzu, 2004; Asouzu, 2007) to harmonious monism (Ijiomah, 2014) to integrative humanism (Ozumba & Chimakonam, 2014); the list goes on.
What this slight but important detour seeks to show is that within the traditional African metaphysical space, most thinkers are inclined to believe that spiritual entities are, indeed, existent realities. These spiritual realities are thought to relate with other aspects of reality; they are not removed from our everyday reality – at least not in the way Descartes divided mind from matter – but are an important part of our understanding of reality as a whole, a unity even tighter than Spinoza’s parallelism allows. These ideas should be kept in mind as they help guide any exploration of African views of the meaning of life.
a. The African God-Purpose Theory of Meaning
To the question of what constitutes African conceptions of the meaning of life, one can give a few answers. The first is the African God-purpose theory. Although the God-purpose theory is neither new – scholars from other philosophical traditions have written about it (see: Metz, 2007; Metz, 2013; Mulgan, 2015; Poettcker, 2015; Metz, 2019) – nor uniquely African, the arguments contained in the view possess salient features that are African.
For some philosophers, a belief in the existence of God is unnecessary when discussing the God-purpose theory: the question, it is argued, is not whether God exists but what conditions are necessary for a God-purpose theory to subsist as a viable theory of meaning (Metz, 2013, p. 80). While this is a much-appreciated argument, it is hard to agree with its logic, for if we were to presume that the existence of a God was inconceivable, then we would be forced to admit that a theory of meaning based on an inconceivable God is itself inconceivable – indeed, one can imagine it to be nothing more than wishful thinking. If, for instance, it were argued that capturing something as inconceivable as a three-winged leopard would grant one meaning, it would be odd to take such a theory as a plausible theory of meaning, since three-winged leopards do not exist. The same argument can be made with regard to the God-purpose view. One must allow for some rational belief in a conceivable God before one can make claims about a God-purpose theory of meaning. For most traditional Africans, this is precisely the case. The African God-purpose theory begins with an all-pervading belief in God or the Supreme Being (the two terms will be used interchangeably). The belief in the Supreme Being features in the everyday life of the traditional African, and Pantaleon Iroegbu makes this point clear:
So far, nobody to our knowledge, has disputed the claim that in African traditional societies, there were no atheists. The existence of God is not taught to children, the saying goes. This means that the existence of God is not learnt, for it is innate and obvious to all. God is ubiquitously involved in the life and practices of the people. (Iroegbu, 1995, p. 359)
This belief is not far-fetched, and the ideas that govern it are plausible enough to grant the African God the mantle of conceivability. The reason is simple: what immediately stands out from African philosophical literature is the fact that, for most traditional Africans, nothingness is impossible. What is suggested instead is the idea of being-alone (the African metaphysical equivalent of nothingness) and being-with-others as the full expression of reality (one immediately sees the communal metaphysics at play here). The African rejection of nothingness in favour of being-alone comes from the African understanding of God as necessarily eternal (at least regressively speaking) and as the progenitor of the universe. Thus, the term being-alone not only encapsulates a necessarily eternal God, but also underscores a God that necessarily existed without the universe.
However, being-alone also implies an unattractive mode of living, one that does not tally with the more attractive communal ontology and/or mode of living. It is for this reason that one can plausibly speculate that the existence of the universe presupposes a supreme rejection by God of Its being-alone (the term “It” is used as a pronoun to denote a genderless God) in exchange for a more communal relationship with the other (the universe, understood as encapsulating all other existent realities) – one that legitimises Its existence. And so the first overarching purpose of the universe is encountered: the legitimisation of God’s existence via a communal relationship with the universe (that is, created reality). Since God – the source from which the universe and other realities presumably sprang – existed prior to other forms of existence, being-alone must have been a reality at some point. In the African view, the Cartesian cogito, acknowledging one’s own existence, is not enough. Existence ought to be expressed via a relationship with another. It is in this way that other realities, which emerge from God, legitimise God’s existence as a being-with-others.
With this in mind, deciphering God’s purpose for man, and how that translates to meaningfulness, becomes a much easier affair. Given the ultimate goal of sustaining the harmony that preserves the universe and, in turn, legitimises the existence of God, living a meaningful life would involve living a life that ensures harmony. Perhaps this is another reason why complementarity is widespread in most communities in traditional Africa. As far as living a meaningful life by doing God’s will is concerned, traditional African thinkers would agree that two methods stand out – fulfilling one’s destiny and obeying the divine law.
With regard to the destiny view, what is immediately clear is that the Supreme Being is responsible for the creation of destiny, as Segun Gbadegesin tells us. Whether such a destiny is chosen by the individual or imposed on the individual is unclear, but what is important is that such a destiny emanates from God. It must be reiterated that destiny, as understood here, should be distinguished from what may be termed “fate(fulness)”. That one receives her destiny from God does not imply that the individual’s life follows a hard (pre)deterministic path, such that whatever role one plays is devoid of free will or of the ability to control the trajectory of one’s life. One can still choose to pursue a certain destiny, choose to alter a given destiny, or choose not to pursue her destiny. Destiny is then thought of as an end that is specific to each individual and one which the individual can choose whether or not to pursue.
Normative progression suggests that one attains personhood as time progresses and as the individual continues to gain moral experience. In this way, the older and more morally experienced an individual is, the closer the individual is to becoming a moral genius and a person. It is therefore plausible to assume (even though there is no consensus on the matter) that destinies are handed to the individual by God, since it is hard to imagine – if one considers the African view of the normative progression of personhood – that an “it” (the yet-to-be-developed human person) possesses the raw rational capacity to choose something as complex as its destiny. Gbadegesin reminds us that good destinies exist just as bad destinies do, and that destinies are alterable, since one can choose to ignore a bad destiny and do good instead (Gbadegesin, 2004, p. 316). Yet one can argue that ignoring a bad destiny to do good is no different from ignoring a good destiny to do bad things. In both cases, what is being discussed is not the alteration of one’s destiny but simply the neglect of it. One can go as far as to assume that by ignoring one’s assigned destiny in such a manner, what is expressed is an inability to understand how one’s destiny ties to God’s overarching purpose and/or a willingness to live a meaningless life. Thus, the pursuit of even a bad destiny (as assigned by God) can lead to meaningfulness – much as the Christian gospel of salvation is predicated on the betrayal of Jesus by Judas Iscariot. Hence, within the African God-purpose view, meaningfulness readily involves the pursuit and/or fulfilment of one’s God-assigned destiny. Such a pursuit/fulfilment would be considered a source of great admiration and esteem, both by the individual who has done the fulfilling and by the members of his/her society who understand that s/he has fulfilled his/her destiny; conversely, life would be meaningless if one fails to pursue one’s destiny.
Another way in which one can think of the God-purpose theory as conferring meaning is through divine laws. Divine laws are known to the individual via different conduits that serve as representatives or messengers of the Supreme Being – usually lesser gods, spirits, ancestors, or priests (Idowu, 2005, pp. 186-187). What these laws are varies from culture to culture, but the general idea is that one must avoid certain taboos or acts that breed discord in the community, and that one must engage in certain rites, customs, or rituals to flourish as an individual and obtain meaning. Indeed, as Mbiti reminds us, failing to adhere to divine law not only ensures meaninglessness, it also undermines the grand purpose of sustaining the harmony that holds the universe together. This is why acts of reparation – commensurate with the offence – are often advised once such discord is noticed.
One can immediately notice that the African God-purpose theory bears on both aspects of meaningfulness – that is, meaning in life and the meaning of life. In the first instance, insofar as the individual performs those actions that are directly tied to his/her destiny, those acts constitute for him/her moments of meaningfulness. With regard to the latter, the narrative that ties the individual’s actions together and gives them their coherence is his/her destiny – or at least the pursuit of it. It is this narrative that allows one to sit back and adjudge a whole life as meaningful or meaningless.
While the African God-purpose theory of meaning offers an interesting approach to the question of meaning, it is bound to face two major criticisms: the instrumentality that attends God’s purpose, and the narrowness of the view. These criticisms can be levelled against most God-purpose theories – especially those of the extreme kind. By locating meaning in what God wants the individual to do, one inadvertently admits that the individual only plays a functional role in the grand scheme of things. The imposition of God’s will – through destiny and/or divine law – disregards individual autonomy (whether or not one has the free will to choose) since meaning (especially in extreme versions of the God-purpose theory) resides only in doing God’s will. The second criticism lies in the fact that the African God-purpose theory fails to capture those instances of meaning that spring neither from one’s destiny nor from divine law. Thus, the individual who strives to become a musical virtuoso would fail to achieve meaningfulness if that achievement or pursuit does not tally with his assigned destiny. Yet it can be intuited that such a pursuit counts as a moment of meaningfulness.
b. The Vital Force Theory of Meaning
The second theory of meaning that can be gleaned from African philosophical literature is the vital force theory of meaning. To understand what this theory entails, it is important to first understand what is meant by “vital force”. Vital force or life force can be described as some sort of ethereal/spiritual force emanating from God and present in all created realities. Wilfred Lajul, in explaining Maduabuchi Dukor’s views, expresses these claims quite succinctly:
Africans believe that behind every human being or object there is a vital power or soul (1989: 369). Africans personify nature because they believe that there is a spiritual force residing in every object of nature. (2017, p. 28)
This is why African religious practices, feasts, and ceremonies cannot simply be equated with magical and idolatrous practices or fetishism. Within the hierarchy of being, the vital force expresses itself in different ways, with those in humans and ancestors possessing an animating and rational character – unlike those in plants (which are supposedly inanimate and without rationality) and those in animals (which possess animation without the sort of rationality found in man). Indeed, Deogratias Bikopo and Louis-Jacques van Bogaert opine that:
All beings are endowed with varying levels of energy. The highest levels characterise the Supreme Being (God), the ‘Strong One’; the muntu (person, intelligent being), participates in God’s force, and so do the non-human animals but to a lesser degree…Life has its origin in Ashé, power, the creative source of all that is. This power gives vitality to life and dynamism to being. Ashé is the creative word, the logos; it is: ‘A rational and spiritual principle that confers identity and destiny to humans.’…What subsists after death is the ‘self’ that was hidden behind the body during life. The process of dying is not static; it goes through progressive stages of energy loss. To be dead means to have a diminished life because of a reduced level of energy. When the level of energy falls to zero, one is completely dead. (Bikopo & van Bogaert, 2010, pp. 44-45)
If one takes the idea of a vital force seriously, then it must be admitted that the vital force forms an important part of the individual, and that it can be either diminished or augmented in several ways. Illness, suffering, depression, fatigue, disappointment, injustice, failure, and other negative occurrences contribute to the diminution of one’s vital force. In the same vein, one can posit conversely that good health, certain rituals, justice, happiness, engaging positively with others, and so forth contribute to the augmentation and fortification of vital force. These ideas lead us to vitalism as a theory of meaning.
As to what can constitute a vital force theory of meaning, it should be kept in mind that great importance is placed on augmenting one’s vital force as opposed to diminishing it. The vital force being of paramount importance, the individual must continually fortify it – by engaging in certain rituals and prayers and by immersing oneself in morally uplifting acts and positive, harmonious relations with one’s community and environment. Thus, meaningfulness is obtained by the continuous augmentation of one’s vital force and/or those of others via the processes outlined above. Indeed, the much-criticised Tempels alludes to this when he states that ‘Supreme happiness, the only kind of blessing, is, to the Bantu, to possess the greatest vital force: the worst misfortune and, in very truth, the only misfortune, is, he thinks, the diminution of this power’ (Tempels, 1959, p. 22). On the other hand, meaninglessness, and indeed death, would involve the inability to augment one’s life force and/or actively engaging in acts that seek to diminish one’s vital force or those of others. This theory of meaning focuses on a transcendental goal whose mode of achievement usually involves acts that garner much esteem and admiration. Thus, by enhancing her vital force, the individual engages in something that is inherently meaningful and valuable.
Beyond this traditional view of vitalism, some scholars of African philosophy have also put forward a more naturalistic account of meaning that avoids the problems (mainly of proof) associated with theories dealing with spiritual entities (see: Dzobo 1992; Kasenene 1994; Mkhize 2008; Metz 2012). On this naturalistic understanding, what is referred to as vital force is wellbeing and creative power, rather than the spiritual force of Tempels’ Bantu ontology. Meaningfulness would then involve engaging in those acts that constantly improve one’s wellbeing and exercising one’s creative power freely.
Some criticisms can be levelled against the vital force theory. The first is the straightforward denial of the existence of spiritual essences within the human body – especially since the brain and the nervous system are thought to be responsible for animating the human body and for the cognitive abilities of a human person (see: Chimakonam, et al., 2019). The second criticism focuses on the naturalist account and argues that ideas about wellbeing and creative power need not bear the moniker of vitalism to make sense. Indeed, one can refer to the pursuit of wellbeing or the expression of creative genius as separate paths to meaningfulness that need not be seen as vitalist.
c. The Communal Normative Theory of Meaning
The third African theory of meaning has been termed “the communal normative function theory” (Attoe, 2020). This theory of meaning is based on one of the most widespread views in African philosophy – communalism. The idea has been discussed by various African philosophers such as Mbiti, Khoza, Mabogo Ramose, Menkiti, Asouzu, Murove, Ozumba & Chimakonam, Metz, and so forth, in various guises and with reference to several branches of African philosophy, ranging from African metaphysics and African logic to African ethics and African socio-political philosophy. An understanding of communalism is necessary to see how this view speaks to conceptions of meaning. Communalism is founded on a metaphysics that understands various realities as missing links of an interconnected, complementary whole (Asouzu, 2004). This ontology then flows down to human communities and social relationships, where the attainment of the common good and of personhood is invariably tied to how best one expresses herself as that missing link. Within this framework, interconnectedness – as encapsulated in ideas such as harmony, solidarity, inclusivity, welfarism, familyhood, and so forth – plays a prominent role. This is why Mbiti’s famous dictum ‘I am because we are and since we are, therefore I am’ (Mbiti, 1990, p. 106), the Ubuntu mantra “a person is a person through other persons”, and expressions like “one finger cannot pick up a grain” (Khoza, 1994, p. 3) are commonplace in explaining communalism. Scholars like Menkiti have therefore gone on to tie individual personhood to how well the individual lives communally and engages with the community.
From this understanding of communalism, a theory of meaning emerges on which meaning involves engaging harmoniously with others. By engaging positively with others, the individual seeks to acquire humanity in its most potent form, and it is by acquiring and enhancing this humanity or personhood that the individual also acquires meaning. By engaging harmoniously with others, the individual sheds petty animal desires, especially those that spring from selfishness, and instead focuses on moral/normative goals that transcend the individual and centre on communal flourishing. Within this framework, the lives of individuals such as Nelson Mandela or Mother Teresa would count as meaningful because of their constant striving to ensure harmony and uplift the lives of others. While meaningfulness is gained by performing one’s communal normative function, meaninglessness would subsist either in not engaging positively with others or in performing those acts that ensure disharmony or discord – which, in turn, leads to the loss of one’s humanity.
While it is an attractive and plausible theory of meaningfulness from the African space, the major shortcoming of the communal normative function theory is that it does not accommodate meaningful acts that are not designed for, or may not contribute to, communal upliftment. Thus, if our music enthusiast were to pursue virtuoso status without seeking to engage others with her music, that achievement would not count as a meaningful act.
d. The Love Theory of Meaning
In an earlier paper titled “On Pursuit of the Purpose of Life: The Shona Metaphysical Perspective”, Munyaradzi Mawere articulates a love theory of meaning. Love (which, according to Mawere, is similar to the Greek concept of agape) is understood in this context as the “unconditional affection to do and promote goodness for oneself and others, even to strangers” (Mawere, 2010, p. 280).
A few things can be noted from the above. The first point is that love is an emotion from which the desire to do good emanates, and one can speculate that it is available to all human beings in the same way that anger and happiness are available to every human being. This point is easily countered by the various heinous acts that many human beings have perpetrated throughout history; genocides of all kinds portray a hateful instinct rather than a loving one. One response would be that love is not the only emotion that the human being is born with, hence the expression of other, unpleasant emotions. A second response would be that love is a capacity that is nurtured. Mawere points to this fact:
However, one may wonder why some human beings do not love if love is a natural gift and the sole purpose of life. It is the contention of this paper that the virtuous quality of love though natural is nurtured by free will. (Mawere, 2010, p. 281)
Mawere is vague about what he means by “free will”, and one can only speculate. However, the preferable route to take in describing how love/agape is nurtured is to think of it in terms of the deliberate cultivation and/or expression of love. When one actively seeks to promote goodness for one’s self and others, one nurtures a habit that takes advantage of our presumably innate capacity to love. By ridding one’s self of the blockades to unconditional love – such as self-interest, nepotistic attitudes, unforgiveness, and so on – the individual begins to express love in the way Mawere envisions, viz. “unconditional affection to do and promote goodness for oneself and others, even to strangers” (Mawere, 2010, p. 280).
It is from this framework that Yolanda Mlungwana (2020) draws her notion of love. Mlungwana tells us that Mawere’s theory of rudo (love) differs from Susan Wolf’s version of the love theory. According to her, “Insofar as people are the only objects of love for Mawere, his sense of ‘love’ differs from Susan Wolf’s influential account, according to which it is logically possible to love certain activities, things or ideals”. So, while the love theory of meaning is, for Mawere, people-centred or anthropocentric, the love view for Wolf is much more encompassing and may feature a variety of objects that are not human: one can show love to the environment by advocating for and trying to perpetuate a greener planet Earth, or love for abandoned animals or an endangered species. One can, of course, wonder here about the narrowness of the traditional African love theory of meaning, and it is a valid critique; the point, however, is that the scope of the African love view encapsulates only human beings.
African love theorists agree that love is the sole purpose of human existence. While this might seem a strange claim (since one can think of certain acts that are prima facie meaningful – say, becoming a musical virtuoso – without necessarily being acts of love), one must first understand what the claim means before settling on conclusions. According to Mawere, love permeates all aspects of human relations with others and with society:
The Shona consider Agape as the basis of all good relations in society, and therefore as the purpose of everyone’s life. In fact, for the Shonas, all other duties of man on earth such as reproducing, sharing, promoting peace, respecting others (including the ancestors and God), among others have their roots in love. Had it not been love which is the basis of all relationships, it was impossible to promote peace, respect others. In fact, meaningful life on earth would have been impossible. (Mawere, 2010, p. 280)
This point immediately demonstrates that the idea of love is not as one-dimensional as one might think it – doing good to others and one’s self. In our everyday lives and practices, insofar as we relate with others in some way, such a relation can be rooted in love. Love, in this sense, is not thought of solely as a direct show of altruistic behaviour towards a person, community, or thing; it is manifested in many different activities – judgments and reparation, business dealings, governance/leadership, teaching/learning, sportsmanship, self-improvement, and so forth. In this way, our musician, who aims to become a virtuoso (whether or not he plays for an audience), does so because he wishes to improve himself – a manifestation of self-love.
When human beings fail to nurture love and begin to manifest hate, problems begin to arise. Mlungwana alludes to this point when she states “In the absence of love, which is the foundation of all relationships, there is no encouragement of peace, respect, etc.” Since love is the purpose of human existence (for people like Mawere), an existence that shows an absence of love is one that is simply meaningless. This meaninglessness further degenerates into anti-meaning when the individual not only fails to show love but actively seeks to pursue hate.
It is hard to fault the love view, but one point stands out: suppose one’s attempt at love causes harm to some other person; could such an act be judged as meaningful? For instance, suppose a person trains to become a brilliant special forces soldier (self-love) in order to serve his country (love towards his/her society). Let us further suppose that, in service to his country, this individual is responsible for the death and destruction of other communities. While his/her dedicated service to his/her country can be seen as an act of love, the destruction of lives and communities constitutes an act of hate (at least from the viewpoint of the communities he has negatively affected). The problem is compounded by the fact that, for Mawere, this love must be unconditional. One response would be that meaningfulness is not as objective a value as one might like to think. By extension, since meaningfulness is subjective and love/hate is context-dependent, the individual’s life is both meaningful and meaningless, depending on the context involved. While this might seem problematic within a two-valued logical system, such a view is well captured in the dominant trivalent logical systems in African philosophy, such as Chimakonam’s Ezumezu logic.
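To make the appeal to trivalence concrete, the sketch below implements a generic three-valued logic in Python, using the truth tables shared by Kleene’s K3 and Priest’s LP. It is offered purely as an illustration of how a single claim can coherently take a third value besides true and false; it is not a rendering of Chimakonam’s actual Ezumezu system, whose truth values and operators are its own, so the value names and tables here should be read as illustrative assumptions only.

```python
# A minimal sketch of a generic three-valued logic, offered only to
# illustrate what trivalent systems permit. The value names and truth
# tables below (shared by Kleene's K3 and Priest's LP) are illustrative
# assumptions; they are NOT a rendering of Chimakonam's Ezumezu logic.

from enum import Enum

class V(Enum):
    F = 0  # false only
    B = 1  # intermediate value, read here as context-dependent (both)
    T = 2  # true only

def neg(a: V) -> V:
    """Negation swaps T and F; the intermediate value is its own negation."""
    return V(2 - a.value)

def conj(a: V, b: V) -> V:
    """Conjunction is the minimum under the ordering F < B < T."""
    return V(min(a.value, b.value))

# "This soldier's life is meaningful" is true from one context (service to
# country) and false from another (the communities destroyed). A trivalent
# system can assign the single claim the intermediate value directly:
meaningful = V.B

# Its negation, "this life is meaningless", then holds to the same degree.
# In a two-valued logic, a claim and its negation are simply inconsistent;
# here their conjunction is well-formed and also takes the middle value.
print(neg(meaningful))                    # V.B
print(conj(meaningful, neg(meaningful)))  # V.B
```

On such a scheme, “both meaningful and meaningless” is not a breakdown of logic but a legitimate third verdict – the formal room that trivalent systems such as Ezumezu are being credited with in the paragraph above.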
e. The Yoruba Cluster Theory of Meaning
Another theory of meaning emanating from African philosophy is what may be described as the “Yoruba Cluster View” (YCV). It is so-called because the cultural values which cluster to form this particular view emanate from the dominant views that are found in traditional Yoruba thought. This view was first systematically articulated by Oladele Balogun (2020) and, to some extent, Benjamin Olujohungbe (2020).
The Yoruba conception of meaningfulness is anchored on what Balogun refers to as holism. This holism represents a theory of meaningfulness that is not based on single, isolated paths to meaningfulness (call that monism) but on a conglomeration of different, complementary, “interwoven and harmonious accounts of a meaningful life considered as a whole and not in isolation” (Balogun, 2020, p. 171). Metz made a similar move when defining the concept “meaning”, but Balogun’s pluralism (or holism, as he prefers) differs in an important way: the paths to meaningfulness are, for Balogun, necessarily dependent on each other, since he is quick to conclude that “isolating one condition from the other alters the constitutive whole” (Balogun, 2020, p. 171).
These interwoven paths to meaningfulness are drawn from a series of normative values referred to as “life goods”. These life goods mainly encompass certain normative values that reflect the spiritual, social, ethical and epistemological experiences of the Yoruba. According to Balogun:
The term “life goods” refers to material comfort symbolised with monetary possession, a long healthy life, children, a peaceful spouse and victory over the vicissitudes of existence. The fulfilment of such “life goods” at any stage of human existence is accompanied by the remarks “X has lived a meaningful life” or “X is living a meaningful life”, where X is a relational agent in a social network. The “life goods”, though materialistic and humanistic, are factual goals providing reasons for how the Yorùbá ought to act in daily life. To the extent that such “life goods” ground and guide human actions, and humans are urged to strive towards them in deeds and acts; they are normative prescriptions in the Yorùbá cultural milieu. (Balogun, 2020, p. 171)
These life goods are positive values, and in desiring to acquire this cluster of values, the individual ipso facto desires meaningfulness. What this invariably means is that acquiring meaningfulness involves a subjective pursuit of these seemingly objective normative values. On the other hand, acquiring just one of these values would not translate to living a meaningful life, since the values are thought of as means (not ends) to an end (meaningfulness), and since it is the acquisition of all the values in the cluster that gives life meaning.
Drawing from William Bascom, Balogun identifies this cluster of values, ranked in order of importance, as: long life or “not-death” (aiku), money (aje, owo), marriage or wives (aya, iyawo), children (omo), and victory (isegun) over life’s vicissitudes. These values can serve as a yardstick by which one can measure whether one’s life is meaningful. Furthermore, the judgment call about whether a life is meaningful is not merely tied to subjective valuing; it is also subject to external valuing. In this way, even when one is dead, that individual’s life can still be adjudged meaningful by those external to the individual (that is, other living persons). What this means is that the view that death undercuts meaningfulness in some way does not hold for friends of the Yoruba cluster view. This is because death is merely a transition to more life, either as an ancestor or as another person via reincarnation, and because even in death, individuals external to the dead individual can still make judgments about the meaning of his/her life.
The firm belief in life after death also allows ancestors to attempt to find meaningfulness themselves, since they are very much alive. It is safe to assume that the cluster view does not apply in this particular instance, since aje, aya/iyawo and omo are not achievable goals for ancestors. However, ancestors are expected to intervene in the lives of the living by “[protecting] the clan and sanctioning of moral norms among the living” (Balogun 2020, 172).
The cluster view fully expresses itself as a theory of meaning when one realises that the values that make up the cluster intertwine with, and complement, each other. At the base is aiku, which undergirds any claim to meaningfulness, for a short-lived life is essentially deprived of the opportunity to pursue those means that allow for a meaningful life. Menkiti’s suggestion that children cannot attain personhood is instructive here. Taking good care of one’s health and living a long life provides the individual with the time to achieve the other values believed to constitute a meaningful life. Aje/owo offers the sort of material comfort that enables the individual to lead a comfortable life and to take care of others. It is by possessing aje that the individual acquires the financial capacity to marry a wife/wives and take care of children. Without an iyawo, on the other hand, children are not possible, except if one decides to bear children outside wedlock (which is frowned upon). As Balogun puts it, “A life without a marital relationship and children is culturally held among the Yorùbá as meaningless. Given the pro-communitarian attitude of the Yorùbá, procreation within a network of peaceful spousal relationships is considered necessary for expanding the family lineage and clan.” All this, combined with isegun, comes together to form a life that can be looked at and branded as meaningful.
Balogun goes further to augment these cluster values with morality. For the Yoruba, according to Balogun, “a meaningful life is a moral life. Within the Yorùbá cultural milieu, there is no clear demarcation between living a meaningful life and a moral life, for both are associated with each other” (Balogun 2020, 173). Olujohungbe points to this fact when he concludes that a life filled with quality (one might say, a life that has achieved the cluster values Balogun alludes to) must also be a morally good life:
A virtuous character thus trumps all other values such as long life, health, wealth, children and those other attributes making up the purported elements of well-being. In this connection, a distinction is thus made between quantity and quality of life. For a vicious person who dies at a “ripe” old age leaving behind children, wealth and other material resources, society often (though clandestinely) says akutunku e l’ona orun – which literally means “may you die severally on your way to heaven” and actually implies good riddance to bad rubbish. (Olujohungbe, 2020, p. 225)
So when one directs the cluster values towards positively engaging with others, such an individual is living a meaningful life. Thus, when one purposes to alleviate the poverty of others with the aje/wealth that s/he has acquired, s/he is living a meaningful life. If one procreates and bears children, and guides those children into becoming good members of society, that individual is living a meaningful life. The same applies to all the other cluster values used in tandem with each other.
f. Personhood and a Meaningful Life
Flowing from the views of scholars like Ifeanyi Menkiti and Kwame Gyekye, and systematised into a theory of meaningfulness by Motsamai Molefe, is the idea that attaining personhood can lead to a meaningful life. Personhood, in African philosophical thought, is tied to more than mere existence: merely being an existing human being is not a sufficient condition for personhood. One must exhibit personhood in one’s relations with others to be called a person.
There has been some debate in African philosophical thought, mainly between Menkiti and Gyekye, about the status of babies and young children with regards to personhood. Menkiti takes the more radical stand that children possess no personhood, and so cannot be persons. Gyekye, on the other hand, supposes that children are born with some level of personhood, and that this personhood, being in its nascent form, can be augmented by one’s level of normative function. What they both agree on, however, is that personhood is achievable and that one can strive to attain the highest form of personhood through positive relationships with the people in one’s community and with one’s culture.
Based on this general framework, Motsamai Molefe provides a systematic account of meaning grounded in the African idea of personhood. For him, meaningfulness begins when the individual develops those capacities and virtues that allow him/her to become the best of his/her kind. Life becomes meaningful with the acquisition of these virtues and “the conversion of these raw capacities to be bearers of moral excellence” (Molefe, 2020, p. 202).
Because these virtues are bearers of moral excellence, a meaningful life would, according to this theory, be construed in terms of the agent achieving the moral end of moral perfection or excellence. Moral excellence is not automatic: as Menkiti opines, unlike other types of geniuses, moral geniuses (who have acquired moral excellence) only acquire that status after a long period of time, for the passage of time serves to enhance one’s moral experience.
Molefe also ties his idea of personhood and theory of meaning to dignity. Echoing Gyekye’s ideas, Molefe asserts that every individual has the capacity to be virtuous and that every individual ought to build that capacity to a reasonable level. In his words:
Remember, on the ethics of personhood, we have status or intrinsic dignity because we possess the capacity for moral virtue. The agent’s development of the capacity for virtue translates to moral perfection, which we can also think of in terms of a dignified existence. This kind of dignity is the one that we achieve relative to our efforts to attain moral perfection – achievement dignity. As such, a meaningful life is a function of a dignified human existence qua the development of the distinctive human capacity for virtue. I also emphasise that the agent is not required to live the best possible human life. The requirement is that the agent ought to reach satisfactory levels of moral excellence for her life to count as meaningful. (2020, p. 202)
The requirement to have satisfactory levels of moral excellence does not preclude the individual from aiming to live the best possible life. In this way, the requirement to have satisfactory levels of moral excellence stands as a bare minimum for one’s life to be considered meaningful. This allows us to think about the meaningfulness of life in terms of degrees of meaningfulness. In this way, if I live a meaningful life by sufficiently exuding some moral virtues, my life would be meaningful, but not to the degree that one may consider Nelson Mandela’s life to be meaningful.
g. The Conversational Theory
According to Chimakonam (2021), Conversationalism is a theory of meaning-making that strives to improve two main significists – the nwa-nsa and the nwa-nju – through a process of intellectual/creative struggle (a process that conversationalists call “arumaristics”), anchored by the construction, deconstruction and reconstruction of seemingly contrary viewpoints. While Conversationalism is focused on conceptual forms of meaning-making, its implications for life are also apparent.
Meaning-making is a matter of conversations within one’s self and between one’s self and the objective values of the various contexts that s/he encounters in life. Within one’s self, meaning lies in self-improvement, achieved through the interrogation of one’s okwu (values, viewpoints, prejudices, and so forth), as Chimakonam (2019) calls it. It is not just that this okwu, which forms the content of the individual’s life, is improved, but the ability to ask new questions in a life-long dialogue is also improved. By questioning himself/herself and finding answers to those questions, the individual improves his/her okwu, each time at higher levels of sophistication. This positive augmentation of one’s okwu is precisely what makes life meaningful for the conversationalist, at least from a subjective point of view.
But the individual also exists within a community, and, for the conversationalists, the ideas often called objective are mainly the intellectual contributions of subjective individuals who belong to a particular communal context. What, then, counts as objective meaning? Objective meaning, in Conversationalism, would involve the individual’s ability either to imbibe (as the nwa-nsa) or to interrogate (as the nwa-nju) the ideas, actions or values that a communal context considers worthwhile. By imbibing those values or performing those actions that one’s communal context considers valuable, the individual pursues ends that are worthy for their own sake, merits esteem and admiration, and transcends his/her animal nature – s/he identifies with ends that are beyond him/her. By assuming the role of nwa-nju (or questioner), s/he makes his/her life meaningful by enabling the improvement of the very values on which the communal context relies as purveyors of meaningfulness. In this way, the individual’s life becomes meaningful through becoming a meaning-maker or meaning-curator.
3. Conclusion
What has been presented above are seven plausible theories of life’s meaning that can be gleaned from traditional African philosophical thought. While these accounts may not exhaust the possible theories of meaning that can be hewn from the African worldview, they are a good start, inviting contributions and critical engagement from philosophers interested in this subject matter. How attractive are these theories of meaningfulness? Are there contemporary African alternatives to the more traditional views of meaningfulness? Are there more pessimistic accounts within the corpus of African thought that embrace a more nihilistic approach to meaningfulness?
4. References and Further Reading
Agada, A., 2020. The African vital force theory of meaning in life. South African Journal of Philosophy, 39(2), pp. 100-112.
The article articulates and discusses the African vital force theory.
Asouzu, I., 2004. Methods and Principles of Complementary Reflection in and Beyond African Philosophy. Calabar: University of Calabar Press.
In this book, Innocent Asouzu develops his idea of complementary reflection into a full-fledged philosophical system.
Asouzu, I., 2007. Ibuanyidanda: New Complementary Ontology Beyond World Immanentism, Ethnocentric Reduction and Impositions. Zurich: LIT VERLAG GmbH.
This book exposes some of the problems in Aristotelian metaphysics and builds a new complementary ontology.
Attoe, A., 2020. Guest Editor’s Introduction: African Perspectives to the question of Life’s Meaning. South African Journal of Philosophy, 39(2), pp. 93-99.
This article offers an introductory overview to the discussions about life’s meaning from an African perspective.
Attoe, A., 2020. A Systematic Account of African Conceptions of the Meaning of/in Life. South African Journal of Philosophy, 39(2), pp. 127-139.
This article curates from available clues, three African conceptions of meaning, namely, the African God’s purpose theory, the vital force theory and the communal normative function theory.
Attoe, A. & Chimakonam, J., 2020. The Covid-19 Pandemic and Meaning in Life. Phronimon, 21, pp. 1-12.
This article considers the impact of the COVID-19 pandemic on the question of life’s meaning.
Balogun, O., 2007. The Concepts of Ori and Human Destiny in Traditional Yoruba Thought: A Soft-Deterministic Interpretation. Nordic Journal of African Studies, 16(1), pp. 116-130.
This article looks at the concept of destiny in traditional Yoruba thought.
Balogun, O., 2020. The Traditional Yoruba Conception of a Meaningful Life. South African Journal of Philosophy, 39(2), pp. 166-178.
This article examines the account of what makes life meaningful in traditional Yoruba thought.
Bikopo, D. & van Bogaert, L.-J., 2010. Reflection on Euthanasia: Western and African Ntomba Perspectives on the Death of a King. Developing World Bioethics, 10(1), pp. 42-48.
The focus of this article is the Ntomba belief about vitality and ritual euthanasia and its implications for the idea of euthanasia in African thought.
Chimakonam, J., Uti, E., Segun, S. & Attoe, A., 2019. New Conversations on the Problems of Identity, Consciousness and Mind. Cham: Springer Nature.
This book attempts to provide answers to the age-old problem of identity, mind-body problem, qualia, and so forth.
Chimakonam, J., 2019. Ezumezu: A System of Logic for African Philosophy and Studies. Cham: Springer.
This book is a novel attempt at curating and systematising African logic.
Chimakonam, J., 2021. On the System of Conversational Thinking: An Overview. Arumaruka: Journal of Conversational Thinking, 1(1), pp. 1-46.
This article presents a survey of the concept of conversationalism.
Descartes, R., 1641. Meditations on first philosophy. Cambridge: Cambridge University Press (1996).
This famous book discusses issues like the existence of God, the existence of the soul/self, and so forth.
Dzobo, N., 1992. Values in a Changing Society: Man, Ancestors and God. In: K. Wiredu & K. Gyekye, eds. Person and Community: Ghanaian Philosophical Studies. Washington: Center for Research in Values and Philosophy, pp. 223-240.
In this chapter, Noah Dzobo discusses some African values, the sanctity/value of life, ancestorhood and the idea of vital force.
Gbadegesin, S., 2004. Towards A Theory of Destiny. In: K. Wiredu, ed. A Companion to African Philosophy. Oxford: Blackwell Publishing, pp. 313-323.
This article provides an in-depth discussion of the idea of destiny in African (Yoruba) thought.
Gyekye, K., 1992. Person and Community in Akan Thought. In: K. Wiredu & K. Gyekye, eds. Person and Community. Washington D.C.: The Council for Research in Values and Philosophy, pp. 101-122.
Gyekye, in this chapter, challenges Ifeanyi Menkiti’s radical communitarianism and discusses the idea of moderate communitarianism from the Akan perspective.
Idowu, W., 2005. Law, Morality and the African Cultural Heritage: The Jurisprudential Significance of the Ogboni Institution. Nordic Journal of African Studies, 14(2), pp. 175-192.
This paper examines the nature of the concepts of law and morality from the Yoruba (Ogboni group) perspective.
Ijiomah, C., 2014. Harmonious Monism: A philosophical Logic of Explanation for Ontological Issues in Supernaturalism in African Thought. Calabar: Jochrisam Publishers.
This book provides the first real look into the logic and ontology of harmonious monism.
Iroegbu, P., 1995. Metaphysics: The Kpim of Philosophy. Owerri: International University Press.
This book provides an overview of metaphysics, and introduces its own novel uwa ontology, which is based on the African view.
Khoza, R., 1994. Ubuntu, African Humanism. Diepkloof: Ekhaya Promotions.
This book provides a critical exposition of the Southern African notion of Ubuntu.
Lajul, W., 2017. African Metaphysics: Traditional and Modern Discussions. In: I. Ukpokolo, ed. Themes, Issues and Problems in African Philosophy. Cham: Palgrave Macmillian, pp. 19-48.
This chapter provides a brief survey of some of the issues discussed in African Metaphysics.
Mawere, M., 2010. On Pursuit of the Purpose of Life: The Shona Metaphysical Perspective. The Journal of Pan African Studies, 3(6), pp. 269-284.
This article seeks to establish the idea of “love” as the purpose of human existence.
Mbiti, J., 1990. African Religions and Philosophy. London: Heinemann.
This famous book provides an overview of some of the religious and philosophical beliefs of some societies in Africa.
Mbiti, J., 2012. Concepts of God in Africa. Nairobi: Acton Press.
This book provides an overview of the various ideas about God in some African societies.
Menkiti, I., 2004a. On the Normative Conception of a Person. In: K. Wiredu, ed. A Companion to African Philosophy. Oxford: Blackwell Publishing.
This chapter outlines Menkiti’s radical views about personhood in African thought.
Menkiti, I., 2004b. Physical and Metaphysical Understanding: Nature, Agency, and Causation in African Traditional Thought. In: L. Brown, ed. African Philosophy: New and Traditional Perspectives. Oxford: Oxford University Press, pp. 107-135.
This article focuses on the idea of causation in African metaphysics.
Metz, T., 2007. God’s Purpose as Irrelevant to Life’s Meaning: Reply to Affolter. Religious Studies, Volume 43, pp. 457-464.
In this article, Metz responds to Jacob Affolter’s claim that an extensionless God could ground/grant the type of purpose that makes life meaningful.
Metz, T., 2012. African Conceptions of Human Dignity: Vitality and Community as the Ground of Human Rights. Human Rights Review, 13(1), pp. 19-37.
In this article, Metz argues for a more naturalistic account of vitality, based on creativity and wellbeing, that could ground human dignity.
Metz, T., 2013. Meaning in Life. Oxford: Oxford University Press.
In this book, Metz provides an analytic discussion of the question of meaning in life.
Metz, T., 2017. Towards an African Moral Theory (Revised Edition). In: I. Ukpokolo, ed. Themes, Issues and Problems in African Philosophy. Cham: Palgrave Macmillan, pp. 97-119.
In this article, Metz provides a slightly revised version of an earlier article (with the same name), outlining his African moral theory.
Metz, T., 2019. God, Soul and the Meaning of Life. Cambridge: Cambridge University Press.
This short book provides an overview of supernaturalistic accounts of life’s meaning.
Metz, T., 2020. African Theories of Meaning in Life: A Critical Assessment. South African Journal of Philosophy, 39(2), pp. 113-126.
In this article, Metz discusses vitalist and communalistic accounts of meaning.
Mlungwana, Yolanda, 2020. An African Approach to the Meaning of Life. South African Journal of Philosophy, 39(2), pp. 153-165.
In this article, Yolanda Mlungwana provides an examination of some African accounts of meaning such as the life, love and destiny theories of meaning.
Mlungwana, Yoliswa, 2020. An African Response to Absurdism. South African Journal of Philosophy, 39(2), pp. 140-152.
In this article, Yoliswa Mlungwana revisits Albert Camus’ absurdism in the light of African religions and philosophy.
Molefe, M., 2020. Personhood and a Meaningful Life in African Philosophy. South African Journal of Philosophy, 39(2), pp. 194-207.
This article provides an account of meaning that is based on African views on personhood.
Mulgan, T., 2015. Purpose in the Universe: The Moral and Metaphysical Case for Ananthropocentric Purposivism. Oxford: Oxford University Press.
The book argues for a cosmic purpose, but one for which human beings are irrelevant.
Murove, M., 2007. The Shona Ethic of Ukama with Reference to the Immortality of Values. The Mankind Quarterly, Volume XLVIII, pp. 179-189.
This article examines the Shona relational ethics of Ukama.
Nagel, T., 1979. Mortal Questions. Cambridge: Cambridge University Press.
The book explores issues related to the question of life’s meaning, nature, and so forth.
Nagel, T., 1987. What Does It All Mean? A Very Short Introduction to Philosophy. Oxford: Oxford University Press.
This book discusses some of the central problems in Western philosophy.
Nalwamba, K. & Buitendag, J., 2017. Vital Force as a Triangulated Concept of Nature and s(S)pirit. HTS Teologiese Studies/Theological Studies, 73(3), pp. 1-10.
This article examines the concept of vitality in African thought.
Okolie, C., 2020. Living as a Person until Death: An African Ethical Perspective on Meaning in Life. South African Journal of Philosophy, 39(2), pp. 208-218.
This article discusses attaining personhood as a possible route to meaningfulness.
Olujohungbe, B., 2020. Situational Ambivalence of the Meaning of Life in Yorùbá Thought. South African Journal of Philosophy, 39(2), pp. 219-227.
This article provides an account of Yoruba conception of meaning.
Ozumba, G. & Chimakonam, J., 2014. Njikoka Amaka: Further Discussions on the Philosophy of Integrative Humanism: A Contribution to African and Intercultural Philosophy. Calabar: 3rd Logic Option.
This book introduces the idea of Njikoka Amaka or integrative humanism.
Poettcker, J., 2015. Defending the Purpose Theory of Meaning in Life. Journal of Philosophy of Life, 5, pp. 180-207.
The article provides an interesting defence of the purpose theory.
Ramose, M., 1999. African Philosophy through Ubuntu. Harare: Mond Books.
In this book, Mogobe Ramose provides a systematic account of Ubuntu and the ontology that undergirds it.
Taylor, R., 1970. The Meaning of Life. In: R. Taylor, ed. Good and Evil: A New Direction. New York: Macmillan, pp. 319-334.
This chapter is an honest discussion on the reality of meaninglessness in relation to the question of life’s meaning.
Tempels, P., 1959. Bantu Philosophy. Paris: Présence Africaine.
This much-criticised classic presents Tempels’ account of Bantu ontology and vital force, quoted in the discussion of vitalism above.
Sentences such as “Galileo believes that the earth moves” and “Pia hopes that it will rain” are used to report what philosophers, psychologists, and other cognitive scientists call propositional attitudes—for example, the belief that the earth moves and the hope that it will rain. Just what propositional attitudes are is a matter of controversy. In fact, there is some controversy as to whether there are any propositional attitudes. But it is at least widely accepted that there are propositional attitudes, that they are mental phenomena of some kind, and that they figure centrally in our everyday practice of explaining, predicting, and rationalizing one another and ourselves.
For example, if you believe that Jay desires to avoid Sally and has just heard that she will be at the party this evening, you may infer that he has formed the belief that she will be at the party and so will act in light of this belief so as to satisfy his desire to avoid Sally. That is, you will predict that he will not attend the party. Similarly, if I believe that you have these beliefs and that you wish to keep tabs on Jay’s whereabouts, I may predict that you will have made the prediction that he will not attend the party. We effortlessly engage in this sort of reasoning, and we do it all the time.
If we take our social practices at face value, it is difficult to overstate the importance of the attitudes. It would seem that, without the attitudes and our capacity to recognize and ascribe them, as Daniel Dennett colorfully puts it, “we could have no interpersonal projects or relations at all; human activity would be just so much Brownian motion; we would be baffling ciphers to each other and to ourselves—we could not even conceptualize our own flailings”.
In fact, if we follow this line of thought, it seems right to say that we would not even be baffled. Nor would we have selves to call our own. So central, it seems, are the attitudes to our self-conception and so effortlessly do we recognize and ascribe them that one could be forgiven for not realizing that there are any philosophical issues here. Still, there are many. They concern not just the propositional attitudes themselves but, relatedly: propositional attitude reports, propositions, folk psychology, and the place of the propositional attitudes in the cognitive sciences. Although the main focus of this article is the propositional attitudes themselves, these other topics must also be addressed.
The article is organized as follows. Section 1 provides a general characterization of the propositional attitudes. Section 2 describes three influential views of the propositional attitudes. Section 3 describes the primary method deployed in theorizing about the propositional attitudes. Section 4 describes several views of the nature of folk psychology and the question of whether there are in fact any propositional attitudes. Section 5 briefly surveys work on a range of particular mental phenomena traditionally classified as propositional attitudes that might raise difficulties for the general characterization of propositional attitudes provided in Section 1.
1. General Characterization of the Propositional Attitudes
This section provides a general characterization of the propositional attitudes.
a. Intentionality and Direction of Fit
The propositional attitudes are often thought to include not only believing, hoping, desiring, predicting, and wishing, but also fearing, loving, suspecting, expecting, and many other attitudes besides. For example: fearing that you will die alone, loving that your favorite director has a new movie coming out, suspecting that foreign powers are meddling in the election, and expecting that another recession is on the horizon. Generally, these and the rest of the propositional attitudes are thought to divide into two broad camps: the belief-like and the desire-like, or the cognitive ones and the conative ones. Among the cognitive ones are included believing, suspecting, and expecting; among the conative ones are included desiring, wishing, and hoping.
It is common to distinguish these two camps by their direction of fit, whether mind-to-world or world-to-mind (Anscombe 1957[1963], Searle 1983, 2001, Humberstone 1992). If an attitude has a mind-to-world direction of fit, it is supposed to fit or conform to the world; whereas if it has a world-to-mind direction of fit, it is the world that is supposed to conform to the attitude. The distinction can be put in terms of truth conditions or satisfaction conditions. A belief is true if and only if the world is the way it is believed to be, and is otherwise false; a desire is satisfied if and only if the world comes to be the way it is desired to be, and is otherwise unsatisfied. (In this respect, beliefs are akin to assertions or declarative sentences and desires to commands or imperative sentences.) In both cases, the attitude is in some sense directed at the world.
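The contrast can be made concrete with a toy model. The sketch below is a minimal illustration only (all function and variable names are invented, and world-states are crudely modeled as dictionaries): the same content is checked against the world as it is for belief, and against the world as it comes to be for desire.

```python
# A minimal sketch of direction of fit, under invented assumptions:
# a content is modeled as a predicate over world-states (dicts).

def it_is_raining(world):
    return world.get("raining", False)

# Mind-to-world fit: a belief is true or false of the world as it stands.
def belief_is_true(content, world):
    return content(world)

# World-to-mind fit: a desire is satisfied only if the world comes to
# be the way it is desired to be.
def desire_is_satisfied(content, world_as_it_turns_out):
    return content(world_as_it_turns_out)

world_now = {"raining": False}
world_later = {"raining": True}

print(belief_is_true(it_is_raining, world_now))         # False
print(desire_is_satisfied(it_is_raining, world_later))  # True
```

The evaluation is the same in both cases; what differs is which way a mismatch is supposed to be corrected: the belief should be revised, while the world should be changed.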
Accordingly, the propositional attitudes are said to be intentional states, that is, mental states which are directed at or about something. Take belief. If you believe that the earth moves, you have a belief about the earth, to the effect that it moves. More generally, if you believe that a is F, you have a belief about a, to the effect that it is F. The state of affairs a being F might also be construed as what one’s belief is about. That a is F is then the content of one’s belief, to wit, the proposition that a is F. One’s belief is true if and only if a is indeed F, that is, if the state of affairs a being F obtains. On common usage, an obtaining state of affairs is a fact.
(On various deflationary theories of truth, either there cannot be or there is no need for a substantive theory of facts. However, all that is required here is a very thin sense of fact. To admit the existence of facts, in this relevant thin sense, one needs only to accept that which propositions are true depends on what the world is like. To be sure, the recognition of true modal, mathematical, and moral claims, among others, raises many vexing questions for any attempt to provide a substantive theory of facts; but we can set these aside. No particular metaphysics of facts is here required. For further discussion, see the articles on truth, truthmaker theory, and the prosentential theory of truth.)
Of course, not every proposition is about some particular object. Some are instead general: for example, the proposition that whales are mammals, or the proposition that at least one person is mortal. All the same, there are some conditions that must obtain if these general propositions are to be true. If these conditions do not obtain, the propositions are false. If one believes that whales are mammals, one’s belief is true if and only if it is a fact that whales are mammals; and if one believes that at least one person is mortal, one’s belief is true if and only if it is a fact that at least one person is mortal.
According to many theorists, it is constitutive of belief that one intends to form true beliefs and does not hold a belief unless one takes it to be true (Shah and Velleman 2005). That is, on such a view, if one’s mental state is not, so to speak, governed or regulated by the norm of truth, it is not a belief state. Indeed, the idea is sometimes put in explicitly normative terms: one ought to form true beliefs, and one ought not to hold a belief unless it is true (Wedgwood 2002). It is thus sometimes said that belief aims at truth.
Desire is often said to work similarly. If you desire, for example, that you be recognized by your teammates for your contributions to the team, this desire will go unsatisfied unless and until the world becomes such that you are recognized by your teammates for your contributions to the team. Often, if not always, one desires what one perceives or believes to be good. Thus, it is sometimes said that while belief aims at what is (believed to be) true, desire aims at what is (perceived or believed to be) good. In one form or another, this view has been held by major figures like Plato, Aristotle, and Immanuel Kant.
b. Conscious and Unconscious Attitudes
Famously, Franz Brentano characterized intentionality as the “mark of the mental,” that is, as a necessary and sufficient condition for mentality. On some views, all intentional states are propositional attitudes (Crane 2001). Putting these two views together, it would follow that all mental states are propositional attitudes (Sterelny 1990). Other philosophers hold that there are intentional states that are not propositional attitudes (see Section 5). On still other views, there are non-intentional, qualitative mental states. Candidates include sensations, bodily feels, moods, emotions, and so forth. What is distinctive of these latter mental states is that there is something it is like to be in them, a property widely considered as characteristic of phenomenally conscious states (Nagel 1974). Most theorists have written as if the propositional attitudes do not have such qualitative properties. But others claim that the attitudes, when conscious, have a qualitative character, or a phenomenology—to wit, a cognitive phenomenology.
Some theorists have claimed that there is a constitutive connection between consciousness and mentality: mental states must be at least potentially conscious (Searle 1992). Other theorists, including those working in computational psychology, allow that some mental states might never be conscious. For example, the sequences of mental states involved in processing visual information or producing sentences may not be consciously accessible, even if the end products are. Perhaps, similarly, some propositional attitudes (possessed by some subject) are—and will always be—unconscious.
For example, if linguistic competence requires knowledge of the grammar of the language in question, and knowledge is (at least) true belief, then linguistic competence involves certain propositional attitudes. (This conditional is controversial, but it will still serve as an illustration. See Chomsky 1980.) Manifestly, however, being linguistically competent does not require conscious knowledge of the grammar of one’s language; otherwise, linguistics would be much easier than it is. Consider, for another example, the kind of desires postulated by Freudian psychoanalysis. Suppose, at least for the sake of argument, that this theory is approximately correct. Then, some attitudes might never be conscious without the help of a therapist.
Other attitudes might sometimes be conscious, other times not. For example, you might desire to propose to your partner for some months before acting on that desire. You have the desire during this time, even if you are not always conscious of it. Similarly, you might believe for most of your life that the thrice-great grandson of Georg Wilhelm Friedrich Hegel was born in Louisville, KY, even if the circumstances in which this belief plays any role in your mental life are few and far between.
With these observations in mind, some theorists distinguish between standing or offline attitudes and occurrent or online attitudes. When, for example, your desire to propose to your partner and your belief that now is a good time to do it conspire in your decision to propose to your partner now, they are both online. Occurrent or online attitudes might often be conscious, but not always. Sometimes others are in a better position to recognize your attitudes than you are. This seems especially likely when it would embarrass or pain you to realize what attitudes you have.
c. Reasons and Causes
Talk of combinations of beliefs and desires leading to action or behavior might suggest that propositional attitudes are causes of behavior; and this is, in fact, the dominant view in the philosophy of mind and action at the beginning of the 21st century. One common way of construing the notion of online attitudes is in causal terms: to say that an attitude is online is to say that it is constraining, controlling, directing, or in some other way exerting a causal influence on one's behavior and other mental states. But standing attitudes might also be construed as causes of a kind. For example, Fred Dretske (1988, 1989, 1993) speaks of attitudes generally as "structuring causes", in contrast to "triggering causes". I might, for example, have a desire, presumably innately wired into me, to quench my thirst when thirsty. This desire, alongside a belief about how to go about quenching my thirst, might serve as a structuring cause which, when I become thirsty (the triggering cause), leads me to go about quenching my thirst.
Of course, sometimes when I am thirsty, have a desire to quench my thirst, and have a belief about how I might go about doing that, I remain fixed to the couch. In this case, barring some physical impediment, it is likely that I have some other desire stronger than the desire to quench my thirst. Just how much influence an attitude has on one’s behavior and other mental states may vary with its strength. Someone might, for another example, desire to lose weight, but not as much as they desire to eat ice cream. In this case, when presented with the opportunity to eat ice cream, they will, all else being equal, be more likely to engage in ice-cream eating than not. If we have information about the relative strengths of their desires, our predictions will reflect this. If our predictions prove correct, we have reason to think that we have in hand an explanation of their behavior—to wit, a causal explanation.
At least, this is how a causalist will put it. But belief-desire combinations are also said to constitute reasons for action, and on some views, reasons are not causes. Suppose, for example, that Mr. Xi starts to go to the fridge for a drink. We ask why. The reply: Because I am thirsty and there is just the fix in the fridge. It is generally agreed that what Xi supplies is a reason for doing what was done. He cites a desire to quench his thirst and a belief about how he might go about doing that. The causalist claims that this reason is also the cause of the behavior. The anti-causalists deny that this reason is a cause, and they therefore deny that rationalization is a species of causal explanation. Instead, rationalizations are for making sense of or justifying behaviors. Although this view was widely held in the first half of the 20th century, largely under the influence of Ludwig Wittgenstein and Elizabeth Anscombe, the dominant view at the beginning of the 21st century—owing largely to Donald Davidson—is that rationalizations are a species of causal explanation. Where one sides in this dispute may depend in part on the position one takes on folk psychology, which supplies the framework for rationalizations. We return to this in Section 4.
2. Three Influential Views of the Propositional Attitudes
This section describes three influential views of the propositional attitudes.
a. The Classical View
Gottlob Frege and Bertrand Russell did the most to put the propositional attitudes on the map in analytic philosophy. In fact, Russell is often credited with coining the term, and they both articulate what we might call the Classical View of propositional attitudes (although Russell, whose views on these matters changed many times, does not everywhere endorse it). (See, for example, Frege 1892 [1997], 1918 [1997], Russell 1903.) On this view, attitudes are mental states in which a subject is related to a proposition. They are, therefore, psychological relations between subjects and propositions. It follows from this that propositions are objects of attitudes, that is, they are what one believes, desires, and so on.
Propositions are also the contents of one’s attitudes. For example, when Galileo asserts that the earth moves, he expresses a belief (assuming, of course, that he is sincere, understands what he says, and so forth), namely the belief that the earth moves. What Galileo believes is that the earth moves. So, it is said that the content of Galileo’s belief, which may be true or false, is that the earth moves. This is precisely the proposition that the earth moves, reference to which we secure not only with the expression “the proposition that the earth moves” but also with “that the earth moves”.
Propositions, on this view, are the primary truth-bearers. If a sentence or belief is true, this is because the sentence expresses (or is used to express) a true proposition or because the belief has as its object and content a true proposition. It is in virtue of being related to a proposition that one's belief can be true or false, as the case may be. As they may be true or false, propositions have truth conditions: the conditions or states of affairs that must obtain if the proposition is to be true—or, to put it in still another way, what the facts must be.
Propositions have their truth conditions absolutely, in the sense that their truth conditions are not relativized to the sentences used to express them. In using the sentence “the earth moves” to assert that the earth moves, we express the proposition that the earth moves and thus our belief that the earth moves. Galileo expressed this belief, too. Thus, we and Galileo believe the same thing and may therefore be said to have the same belief. Presumably, however, Galileo did not express his belief in English. He might have instead used the Italian sentence, “La terra si muove”. There are, of course, indefinitely many sentences, both within and across languages, that may be used to express one and the same proposition. (It might be said that sentences are inter-translatable if and only if they express the same proposition.) No matter which sentence is used to express a proposition, its truth conditions remain the same.
Propositions also have their truth conditions essentially, in the sense that they have them necessarily. Necessarily, the proposition that the earth moves (which can, again, be expressed with indefinitely many sentences) is true if and only if the earth moves. The proposition does not have these truth conditions contingently or accidentally; it is not the case that it might have been true if and only if, say, snow is white. (On some views, the sentence “the earth moves” might have expressed the proposition that snow is white, or some other proposition; but that is a different matter.) In the language of possible worlds, often employed in the discussion of modal notions like necessity and possibility: there is no possible world in which the proposition that the earth moves has truth conditions other than those it has in the actual world.
(Incidentally, though this is not something discussed by Frege and Russell, propositions are often said to be the primary bearers of modal properties, too. It is said, for example, that if it is necessary that 7 + 5 = 12, the proposition that 7 + 5 = 12 is necessary. It is, in other words, a necessary truth, where a truth is understood to be a true proposition. In the language of possible worlds: it is true in every possible world.)
We can, as already mentioned, share beliefs, and this means, on the going view, that one and the same proposition may be the object of our individual beliefs. This raises the question of what propositions could be, such that individuals as spatiotemporally separated as we and Galileo could be said to stand in relation to one and the same proposition. Frege’s answer, as well as Russell’s in some places, is that they must be mind- and language-independent abstract objects residing in what Frege called the “third realm”, that is, neither a psychological realm nor the physical realm (the realm of space and time). In other words, Frege (and again, Russell in some places) adopted a form of Platonism about propositions, or thoughts (Gedanken) as Frege called them.
To be sure, this view invites difficult questions about how we could come into contact with or know anything about these objects. (Being outside space, it is not even clear in what sense propositions could be objects.) It is generally assumed that whatever can have causal effects must be concrete, that is, non-abstract. It follows from this assumption that propositions, as abstract objects, are not just imperceptible but causally inefficacious. That is, they can themselves have no causal effects, whether on material objects or minds (even if minds are non-physical, as Frege thinks). Frege acknowledges this last observation but insists that, somehow, we do in some sense grasp or apprehend propositions.
In the philosophy of mathematics, serious worries have been raised about how we might gain knowledge of mathematical objects if they are as the Platonist conceives of them (see, for example, Benacerraf 1973), and the same would seem to go for propositions. The difficulty is compounded if, following Frege, we conceive of the mind as non-physical or non-concrete. The concrete is generally conceptualized as the spatiotemporally located. Thus to define the ‘abstract’ as the non-concrete is to define the abstract as the non-spatiotemporally located. On Frege’s view, the mind is not spatiotemporally located, concrete, or physical. And yet, presumably, it is not abstract. What’s more, there seem to be mental causes and effects, for one idea leads to another. In fact, as Frege acknowledges, ideas can bring about the acceleration of masses.
Frege nowhere presents a detailed view of these matters, so it is not clear that he had one. In general, it is not clear what a satisfactory view of propositions as abstract objects would look like. So perhaps it can be understood why, as early as 1918, Russell (despite his important role in developing the theory of propositions) would take the position that "obviously propositions are nothing", adding that no one with a "vivid sense of reality" can think that there exist "these curious shadowy things going about". Nevertheless, it is not clear that Russell manages to do without them. In fact, few have so much as attempted to do without them—the nature of propositions remaining a lively area of research. We return to a discussion of propositions in Section 3b.
b. Dispositionalism
Dispositionalism, most broadly construed, is the view that having an attitude, for example the belief that it is raining, is nothing more than having a certain disposition, or set of dispositions, or dispositional property or properties. On the simplest dispositionalist view—held by many philosophers when behaviorism was the dominant paradigm in psychology and logical positivism was the reigning philosophy of science (roughly, the first half of the 20th century)—the relevant dispositions are dispositions to overt, observable behavior (Carnap 1959). On this view, to lay stress on the point, to believe that it is raining just is to have certain behavioral dispositions, as exhibited in certain patterns of overt, observable behavior.
Dispositionalists and non-dispositionalists alike agree that patterns of behavior, being manifestations of behavioral dispositions, are evidence for particular beliefs and other attitudes an agent has. However, dispositionalists claim that there is nothing more to the phenomenon: if one were to have exhaustively specified the behavioral dispositions associated with the ascription of the belief that it is raining, one would have said everything there is to say about this belief. In this sense, dispositionalism is a superficial view: in ascribing an attitude, we do not commit ourselves to the existence of any particular internal state of the agent in possession of the attitude—whether a state of the mind or brain. Having an attitude is a surface phenomenon, a matter of how one conducts oneself in the world (Schwitzgebel 2013, Quilty-Dunn and Mandelbaum 2018).
Notoriously, it is very difficult to provide any informative general dispositional characterization of an attitude, such as the belief that it is raining. To take a stock example, we might say that if one believes that it is raining, one will be disposed to carry an umbrella when one leaves the house. Evidently, this will require not only that one has an umbrella on hand but that one desires not to get wet, remembers where the umbrella is located, believes that it will help one to stay dry, and so on. In general, as many theorists have observed, it seems that the behavioral dispositions associated with a particular attitude are not specifiable except by reference to other attitudes (Chisholm 1957, Geach 1957). This is sometimes referred to as the holism of the mental.
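The point can be made vivid in code. In the hypothetical sketch below (every key name is invented for the example), any attempt to spell out the umbrella-carrying disposition immediately drags in further attitude ascriptions:

```python
# Hypothetical illustration of the holism of the mental: the behavioral
# disposition tied to one belief is specifiable only via other attitudes.
# All keys below are invented for the example.

def disposed_to_carry_umbrella(agent):
    return (agent["believes it is raining"]
            and agent["desires not to get wet"]               # another attitude
            and agent["believes the umbrella keeps one dry"]  # and another
            and agent["remembers where the umbrella is"]      # and another
            and agent["has an umbrella on hand"])
```

Each of the further conditions would in turn call for its own dispositional specification, again in terms of still other attitudes; there is no exit from the mentalistic circle into purely behavioral vocabulary.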
This holism is a problem for the simple dispositional accounts which seek to reduce or analyze away all mental talk into behavioral talk. Such a reductive project was pursued, or at least sketched, by the logical behaviorists, who wished—by analyzing talk of the mind into talk of behavior—to pave the way toward reducing all mental descriptions to physical descriptions. The view was thus a form of physicalism, according to which the mental is physical. Thus, it was sometimes said that mental state attributions are really but shorthand for descriptions of behavioral patterns and dispositions. According to the logical behaviorists, logical analysis was to reveal this. However, it is generally agreed that this project, articulated most influentially by Carl Hempel (1949) and Rudolf Carnap (1959), failed—and precisely on account of the holism of the mental. (There are no prominent logical behaviorists in the 21st century. Carnap and Hempel themselves abandoned the project after having rejected the verificationist criterion of meaning at the heart of logical positivism. According to this criterion, all meaningful empirical statements are verifiable by observation. In the case of psychological statements, the thought was that they should be verifiable by observation of overt behavior. For further discussion, see the articles linked to in the preceding paragraphs of this subsection.)
The holism of the mental is not, however, a problem for every simple dispositionalist account of the propositional attitudes, for not every such account has reductive ambitions. For some, it is enough that we can provide a dispositional characterization of each mental state attribution, albeit one involving reference to other mental states. For example, one might be said to remember where the umbrella is located if one is able to locate it—say, when one wants the umbrella (because, we might add, one desires not to get wet and believes that the umbrella will help one to stay dry). Similarly, one might be said to desire to stay dry if, when one believes that it is raining, one is disposed to adopt some rain-avoiding behavior: say, not leaving the house, or not leaving the house without an umbrella (if one believes that it will help one to stay dry). Despite the difficulty of providing informative general dispositional characterizations of attitudes, everyone semantically competent with the relevant stretches of language is adept at recognizing and ascribing attitudes on the basis of overt, observable behavioral patterns, including (in the case of linguistic beings) patterns of linguistic behavior. The simple dispositionalist view is again just that there is nothing more to know about the attitudes: they are but dispositions to the observed behavioral patterns.
For many dispositionalists, the appeal of dispositionalism is precisely its superficial character. Our everyday practice of ascribing attitudes, and so of explaining, predicting, and rationalizing one another and ourselves with reference to the attitudes, appears to be insensitive to whatever is going on, so to speak, under the hood. In fact, many think that the practice has remained more or less unchanged for millennia, even if there have been indefinitely many changes in views of what the mind is and where it is located, if it has a location. In historical terms, it is only quite recently that we have suspected minds to be brains and their locations to therefore be the interior of the skull. Even still, facts about the brain, specified in cognitive neuroscientific or computational psychological terms, never enter everyday considerations when ascribing attitudes.
Indeed, if an alien being or some cognitively sophisticated descendent of existing non-human animals or some future AI were to seamlessly integrate into human (or post-human) society, forming what are to all appearances nuanced beliefs about, say, the shortcomings of the American constitution, where to invest next year, and how to appease the in-laws this holiday without compromising one’s values, then most of us would be at least strongly inclined to accept this being as a true believer, as really in possession of the attitudes they seem to have—any differences in their physical makeup notwithstanding (see Schwitzgebel 2013, as well as Sehon 1997 for similar examples). Dispositionalism accords well with this.
However, it does seem that one could have an attitude without any associated overt, observable behavioral dispositions—just as one might experience pain without exhibiting or even being disposed to exhibit any pain-related behavior (yelping, wincing, cursing, stamping about, crying, and so forth) (see Putnam 1963 on “super-spartans” or “super-stoics”). A locked-in patient, for example, has beliefs, though no ability to behaviorally exhibit them. (Incidentally, this highlights the implausibility of behaviorism as applied to the mental generally, not just to the attitudes.) If they have the relevant dispositions, this is only in a very attenuated sense.
In addition, it is not clear that the affective or phenomenological should be excluded. For example, if you believe that the earth is flat, might you not be disposed to, say, feel surprised when you see a picture of the earth from space? It is not clear why this should not be among the dispositions characteristic of your belief. What is more, it seems that one might have the associated behavioral dispositions without the attitude. A sycophant to a president, for example, might be disposed to behave as if she thought the president were good and wise, even if she believes the contrary.
Recognizing the force of these and related observations but appreciating the appeal of a superficial account of belief, other more liberal dispositionalists have allowed that the relevant dispositions may include not just dispositions to overt, observable behavior but also dispositions to cognition and affect. Despite his usual mischaracterization as a logical behaviorist, Gilbert Ryle (1949) is a prime example of a dispositionalist of this latter sort. Eric Schwitzgebel (2002, 2010, 2013) is an example of a contemporary theorist who adopts a view similar to Ryle’s.
Like Ryle, Schwitzgebel does not attempt to provide a reductive account. He allows that, when providing a dispositional specification of a particular attitude, we must inevitably make reference to other attitudes. In other words, he countenances the holism of the mental. His view is also like Ryle's in that he allows that the relevant dispositions are not just dispositions to overt, observable behavior. Unlike Ryle, however, Schwitzgebel makes it a point to emphasize this aspect of his view. He also emphasizes the fact that, when we consider whether someone's dispositional profile matches "the dispositional stereotype for believing that P", "what respects and degrees of match are to count as 'appropriate' will vary contextually and so must be left to the ascriber's judgment" (2002, p. 253). He emphasizes, in other words, the vagueness and context-dependency of our ascriptions. Finally, also unlike Ryle, but in line with the dominant view at the beginning of the 21st century, Schwitzgebel is at least inclined to the view that attitudes are causes and belief-desire explanations thus causal explanations.
Combining dispositionalism and a causal picture of the attitudes poses some difficulties. On most views of dispositions developed in the second half of the 20th century and the beginning of the 21st century, dispositions must have categorical bases. Consider, for example, a new rubber band. It has the property of elasticity, and this is a dispositional property: it is disposed to stretch when pulled and to return to its prior shape when released. We can intelligibly ask why, and the answer will tell us something about the categorical basis of this property—something about the material constitution of the rubber band. Similarly, we explain the brittleness of glass and solubility of sugar in terms of their material constitutions (perhaps, a little more specifically, their microphysical structures). In the case of attitudes, construed as dispositions, the most plausible categorical bases would be states of the brain. Now the question arises whether we should identify these dispositions with their categorical bases or not. A dispositionalist attracted to the position for its superficial character is not likely to make this identification. (That attitudes are brain states would seem to be a deep view.) However, without this identification, more work would need to be done to explain how dispositions can be causes.
c. Computational-Representationalism
On the classical picture, described in Section 2a, to believe that the earth moves, Galileo must grasp the proposition that the earth moves in the way distinctive of belief—where, as discussed, the proposition is an abstract mind- and language-independent object with essential and absolute truth conditions, residing somewhere in the so-called third realm (neither the physical nor the mental realms, but somewhere else altogether—Plato's Heaven perhaps). As discussed, one trouble with this view is that, if an object is in neither time nor space, there is no clear sense in which it is anywhere, let alone an object. But even supposing we can make sense of this, it remains to explain how we can grasp this object, as well as the nature of the grasping relation. Frege and Russell are not of much help here.
Many philosophers—and Jerry Fodor is the primary architect here—essentially take the classical picture and psychologize it. Thus, the proposition grasped becomes a mental representation (expressing, meaning, or having this proposition as content)—the mental representation being a physical thing, literally located in one’s head—and the grasping relation becomes a functional role or, still more exactly, a computational role. That is, according to Fodor: if Galileo believes that the earth moves, there is in Galileo’s brain a mental representation that means that the earth moves and that plays the functional (or computational) role appropriate to belief. The details are complex (see the articles just linked to), but the basic idea is simple: it is in virtue of having a certain object moving around in your head in a certain way that you bear a certain relation to it and so may be said to have the attitude you have. So put, Fodor’s computational-representational view, unlike the dispositional views discussed above, is a deep view.
In the vocabulary of computational psychology, the object of an attitude is a computational data structure, over which attitude-appropriate computations are performed and so definable. At the level of description at which this structure is so identified, it is multiply realizable, meaning here that algorithmic and implementation details may vary (Marr 1982). A schematic picture might help. [Figure omitted: a schematic of Marr's three levels of description.] Roughly, the top level (the computational level) provides a formal specification of some function a mechanism might compute; the middle level (the algorithmic level) says how it is computed; and the bottom level (the implementation level) says how all this is physically realized in the brain. As depicted, there is more than one way to execute a function, and more than one way to physically realize its execution. In the end, the idea goes, we can account for the aspects of the mind thus theorized in purely physical terms (thus making the view another form of physicalism). Bridging these levels is not easy, but Fodor thinks that commonsense psychology can help in limning the computational structure of the mind-brain, or our cognitive architecture.
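A small worked example may help with the multiple realizability claim, with ordinary sorting standing in, purely illustratively, for a cognitive capacity: one computational-level specification, two algorithmic-level realizations.

```python
# Illustration only: one computational-level function (return the input's
# elements in non-decreasing order) computed by two different algorithms.

def insertion_sort(xs):
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

def merge_sort(xs):
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged = []
    while left and right:
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

# Same computational-level profile, different algorithmic realizations;
# the implementation level (silicon, neurons) varies independently again.
assert insertion_sort([3, 1, 2]) == merge_sort([3, 1, 2]) == [1, 2, 3]
```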
On this view, again, the proposition of the classical view becomes a mental representation to which the subject stands in a particular attitude relation in virtue of the fact that the mental representation plays a certain computational role in the cognitive architecture of the subject: a belief-role or a desire-role, as the case may be. On Fodor’s view, since the content of this mental representation is propositional, and so is a proposition, the representation must have a linguistic form, that is, it must be syntactically structured—and thus, he reasons, a mental sentence, to wit, a sentence of Mentalese, our language of thought. Since he conceptualizes mental representations as objects, he speaks of their syntactic shapes (see, for example, his 1987). In fact, it is supposed to be in virtue of their shapes that mental representations and so the attitudes with which they are associated have causal powers—that is, the ability to make other things move, including the subject with the mental representation in her head.
Fodor (1987) thinks, in fact, that if you look at attitude reports and our practice of ascribing attitudes (in other words, at commonsense belief-desire psychology, or folk psychology), what you will find is that attitudes have at least the following essential properties: “(i) They are semantically evaluable. (ii) They have causal powers. (iii) The implicit generalizations of common-sense belief/desire psychology are largely true of them.” (p. 10).
As an example generalization of common-sense belief-desire psychology, Fodor (1987) provides the following:
If x wants that P, and x believes that not-P unless Q, and x believes that x can bring it about that Q, then (ceteris paribus) x tries to bring it about that Q. (p. 2)
Generalizations like this are implicit in that they need not be—and often are not—explicitly entertained or represented when they are used to explain and predict behavior with reference to beliefs and desires. Taking an example from Fodor (1978), consider the following instance of the above generalization: if John wants that it rain, and John believes that it will not rain unless he washes his car, and John believes that he can bring it about that he washes his car, then (ceteris paribus) John tries to bring it about that he washes his car. According to Fodor, such explanations are causal, and the attitude ascriptions involved individuate attitudes of a given type (belief, desire) by their contents (that it will not rain unless he washes his car, that it will rain). Moreover, such explanations are largely successful: the predictions pan out more often than not; and, Fodor reasons, we therefore have grounds to think that these ascriptions are often true. If they are true, a scientific account of our cognitive architecture should accord with this.
These generalizations, moreover, are counterfactual-supporting: if a subject’s attitudes are different, we folk psychologists, equipped with these generalizations, will produce different predictions of their behavior. So, the generalizations have the characteristics of the laws that make up scientific theories. Granted, there are exceptions to the generalizations, and so exceptional circumstances. In other words, these generalizations hold ceteris paribus, that is, all else being equal. For example: if one wants it to rain, and believes that it will not rain unless one washes one’s car, and one believes that one can wash one’s car, then one will wash one’s car—unless it suddenly begins to rain, or one is immobilized by fear of the neighbor’s unleashed dog, or one suffers a seizure, and so forth. (As competent folk psychologists, we are very capable of recognizing the exceptions.) However, this does not mean that the generalizations or instances thereof are empty—that is, true unless false—argues Fodor (1987); for this would make the success of folk psychology miraculous. Besides, all the generalizations of the special sciences (that is, all the sciences but basic physics) have exceptions; and that is no obstacle to their having theories. So, in fact, Fodor thinks that folk psychology is or involves a bona fide theory—and that this theory is vindicated by the best cognitive science. As vindicated, the posits of folk psychology, namely beliefs, desires, and the rest, are therefore shown to be real. We return to this in Section 4.
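To see how such a generalization could support prediction while holding only ceteris paribus, consider the following hedged sketch. The encoding of attitudes as tagged strings is invented for illustration and is not Fodor's formalism; the point is only that the generalization fires unless a recognized exception defeats it.

```python
# A toy rendering of the example generalization: if x wants that P,
# believes that not-P unless Q, and believes x can bring it about that Q,
# then (ceteris paribus) x tries to bring it about that Q.
# The representation below is invented for illustration.

def predict_tries_to_bring_about(agent, P, Q, exceptions=()):
    antecedent_holds = (("wants", P) in agent
                        and ("believes", f"not {P} unless {Q}") in agent
                        and ("believes", f"can bring about {Q}") in agent)
    # The ceteris paribus clause: any recognized exception defeats the law.
    return antecedent_holds and not any(exceptions)

john = {("wants", "it rains"),
        ("believes", "not it rains unless he washes his car"),
        ("believes", "can bring about he washes his car")}

print(predict_tries_to_bring_about(john, "it rains", "he washes his car"))
# True: we predict that John tries to bring it about that he washes his car.
print(predict_tries_to_bring_about(john, "it rains", "he washes his car",
                                   exceptions=[True]))
# False: defeated, as when it suddenly begins to rain or a seizure intervenes.
```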
Since folk psychology is a bona fide theory, Fodor reasons, the referents of its theoretical terms are unobservable. Beliefs, desires, and the rest are therefore unobservables and thus inner as opposed to outer mental states. If the ascriptions are largely true, then—on a non-instrumentalist reading—what they refer to must exist and moreover have the properties the truth of the ascriptions requires them to have. The explanations in which these ascriptions figure are again causal, so these inner states must be causally efficacious. Since, according to Fodor (1987), "whatever has causal powers is ipso facto material" (p. x, Preface), it follows that mental states are physically realized (presumably, in the brain). Since, once more, they are individuated by their contents, they are content-bearing. Putting this together, then, the propositional attitudes are neurally realized, causally efficacious, content-bearing internal states. As Fodor (1987) states the view:
For any organism O, and any attitude A toward the proposition P, there is a (‘computational’/‘functional’) relation R and a mental representation MP such that
MP means that P, and
O has A iff O bears R to MP. (p. 17)
Importantly, according to Fodor, computational-representationalism—unlike any other theoretical framework before it—allows us to explain precisely how mental states like propositional attitudes can have both contents and causal powers (so the first two essential properties noted above). Attitudes, indeed mental states more generally, do not just cause behaviors. They also causally interact with one another. For example, believing that if it rains, it will pour, and then coming to believe that it will rain (say, on the basis of a perceptual experience), will typically cause one to believe that it will pour. What is interesting about this is that this causal pattern mirrors certain content-relations: If “it pours, if it rains” and “it rains” are true, “it pours” is true. This fact may be captured formally or syntactically: P → Q, P ⊢ Q. (This indicates that Q is derivable or provable from P → Q and P. This common inference pattern is known as Modus Ponens. See the article on propositional logic.) This, in turn, permits us to build machines—computers—which exhibit this causal-inferential behavior. In fact, not only may computer programs model our cognitive processes, we also are, on this view, computers of a sort ourselves.
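A few lines of code can illustrate the core idea that a purely syntax-driven mechanism can mirror content-relations. The toy engine below is an illustrative stand-in, not a model of Mentalese: it inspects only the shapes of its representations, yet the transition it implements is truth-preserving.

```python
# A toy syntax-driven inference step: the machine matches only the *shape*
# of its representations (tuples and strings), never their meanings, yet
# the transition it implements (modus ponens) preserves truth.
# The encoding is invented for illustration.

def modus_ponens_step(beliefs):
    derived = set(beliefs)
    for b in beliefs:
        # Shape-match: ("if", P, Q) together with P licenses adding Q.
        if isinstance(b, tuple) and len(b) == 3 and b[0] == "if":
            _, antecedent, consequent = b
            if antecedent in beliefs:
                derived.add(consequent)
    return derived

beliefs = {("if", "it rains", "it pours"), "it rains"}
derived = modus_ponens_step(beliefs)
print("it pours" in derived)  # True: the consequent has been tokened
```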
Among those who accept the computational-representational account of propositional attitudes, some deny that the relevant representations are sentences, and some maintain agnosticism on this question (see, for example, Fred Dretske, Tyler Burge, Ruth Millikan). Moreover, not everyone who accepts that they are sentences accepts that they are sentences of a language of thought distinct from any public language (Harman 1973). Still others deny that only language is compositional—arguing, for example, that maps, too, can be compositional (see Braddon-Mitchell and Jackson 1996, Camp 2007). Views also differ on how the relevant mental representations get their content (see the article on conceptual role semantics, as well as the article on Fodor, for some of the views in this area). In any case, Fodor’s view has been the most influential articulation, and the above theoretical identification is general enough to be accepted by any computational-representationalist.
3. Propositions, Propositional Attitude Reports, and the Method of Truth in Metaphysics
Theorizing about propositions, propositional attitudes, and propositional attitude reports has traditionally gone together. The connection is what Davidson (1977) called "the method of truth in metaphysics", or what Robert Matthews (2007) calls the "reading off method"—that is, the method of reading off the metaphysics of the things we talk about from the sentences we use to talk about these things, provided that the logical form and interpretation of the sentences have been settled. This section discusses this method and the metaphysics of propositional attitudes and propositions arrived at by its application.
a. Reading Off the Metaphysics of Propositional Attitudes
Many valid natural language inferences involving propositional attitude reports seem to require that these reports have relational logical forms—the reports thereby reporting the obtaining of a relation between subjects and certain objects to which we seem to be ontologically committed by existential generalization:
Galileo believes that the earth moves.
Bgp
∴ Galileo believes something.
∴ ∃xBgx
(Ontology is the study of what there is. One’s ontological commitments are thus what one must take to exist. This notion of ontological commitment is most famously associated with Quine (1948, 1960); as he put it: “to be is to be the value of a bound variable”.) That is, if a report like
(1) Galileo believes that the earth moves.
has the logical form displayed above, then if Galileo believes that the earth moves, there is something—read: some thing, some object—Galileo believes, to wit, the proposition that the earth moves. If you believe that Galileo believes this, you are committed to the existence of this object.
Some of these inferences, moreover, appear to require that the objects to which subjects are related by such reports are truth-evaluable:
Galileo believes that the earth moves.
Bgp
That the earth moves is true.
Tp
∴ Galileo believes something true.
∴ ∃x(Bgx & Tx)
If Galileo’s belief is true, then the proposition that the earth moves is true. We thus say that the object of Galileo’s belief, the proposition, is also the content of the belief. Still other inferences appear to require that attitudes are shareable:
Galileo believes that the earth moves.
Bgp
Sara believes that the earth moves.
Bsp
∴ There is something they both believe.
∴ ∃x(Bgx & Bsx)
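Treating the reports as relational makes these inferences mechanically checkable. In the minimal sketch below (the encoding is invented for illustration), each report is a subject-proposition pair, and the existential and shared-object conclusions fall out as simple set operations:

```python
# Toy encoding of the relational logical form Bxp: a belief report is a
# (subject, proposition) pair. Encoding invented for illustration.

believes = {("galileo", "that the earth moves"),
            ("sara", "that the earth moves")}

# Galileo believes something: ∃x Bgx
print(any(subj == "galileo" for (subj, _) in believes))  # True

# There is something they both believe: ∃x (Bgx & Bsx)
shared = ({p for (subj, p) in believes if subj == "galileo"}
          & {p for (subj, p) in believes if subj == "sara"})
print(shared)  # {'that the earth moves'}
```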
On the classical view owing to Frege and Russell (see Section 2a), being the objects and contents of belief, being truth-evaluable, being shareable, and, not least, being the referents of that-clauses (for example "that the earth moves") and being expressible by sentences (for example "the earth moves") are among the specs for propositions. Thus, a report like (1) appears to be true just in case Galileo stands in the belief-relation to the proposition that the earth moves, the subject and object being respectively the referents of "Galileo" and "that the earth moves".
Of course, it is not just beliefs that we report. We also report fears and hopes and many other attitudes besides. For example:
(2) Bellarmine fears that the earth moves.
(3) Pia hopes that the earth moves.
So, we see that various attitudes may have the same proposition as their object (and content). Of course, the same type of attitude can be taken towards different propositions, referred to with different that-clauses (for example “that the earth is at the center of the universe”). Generalizing on these data, we therefore seem to be in a position to say the following:
Instances of x V that S are true if and only if x bears the relation expressed by V (the V-relation) to the referent of that S.
with (1)–(3) being instances of this schema.
This analysis is typically extended also to reports of speech acts of various kinds, which are sometimes included under the label propositional attitudes (Richard 2006). For example:
(4) Galileo {said/asserted/proclaimed/hypothesized…} that the earth moves.
In fact, replacing the attitude verb with a verb of saying, inferences like the following appear to be equally valid:
Galileo said that the earth moves.
Sgp
∴ Galileo said something.
∴ ∃xSgx
Galileo said that the earth moves.
Sgp
That the earth moves is true.
Tp
∴ Galileo said something true.
∴ ∃x(Sgx & Tx)
Galileo said that the earth moves.
Sgp
Sara said that the earth moves.
Ssp
∴ There is something they both said.
∴ ∃x(Sgx & Ssx)
As one can believe what is said (asserted, proclaimed, and so forth), inferences like the following likewise appear valid:
Sara believes everything that Galileo says.
∀x(Sgx ⊃ Bsx)
Galileo said that the earth moves.
Sgp
∴ Sara believes that the earth moves.
∴ Bsp
The objects of these reports are again often thought to be propositions. These inferences thereby lend further support to the above view of the form and interpretation of reports. This view, which Stephen Schiffer (2003) calls the Face-Value Theory, has long been the received view.
b. Reading Off the Metaphysics of Propositions
One remaining question concerns the nature of these propositions, taken to be the objects and contents of the attitudes. Getting clear on this would seem crucial to getting clear on what the propositional attitudes are, if they are indeed attitudes taken towards propositions. To this end, the same method has been used.
Provided that that-clauses like “that the earth moves” are replaced by individual constants in the logical translations of reports like (1)–(4), it seems right to construe that-clauses as singular referring terms (similar to proper names) and so their referents—propositions, by the foregoing reasoning—as objects. Moreover, if it is another property of propositions to be expressed by (indicative, declarative) sentences, then provided that these sentences have parts which compose in systematic ways to form wholes, it seems natural to think that the propositions are likewise structured—with the parts of the propositions corresponding to the parts of the sentences. So, propositions are structured objects, though distinct from the sentences used to express them. Moreover, since they are shareable, even countenancing vast spatiotemporal separation between subjects (both we and Galileo can believe that the earth moves), they must be, it is reasonable to think, abstract and mind-independent. Or so Frege (1918) reasoned.
With this granted, we might then ask what the nature of the constituents of propositions is. If, for example, when we use a sentence like
(5) The earth moves.
we refer to the earth and ascribe to it the property of moving, and we express propositions with sentences, then it is natural to think that the constituents of propositions are objects, like the earth, and properties (and relations), like the property of moving. This is the so-called Russellian view of the constituents of propositions, after Russell.
If the propositional contribution of a term just is a certain object, namely the one to which the term refers, then any other term that refers to the same object will have the same propositional contribution. This seems to mirror an observation made concerning sentences like (5). If, for example, “the earth” and “Ertha” are co-referring, then if (5) is true, so is
(6) Ertha moves.
That is, if
(7) The earth is Ertha.
is true, then—holding constant the sentences in which these terms are embedded—the one term should be substitutable for the other without change in truth value of the embedding sentence. The terms are, as it is sometimes put, intersubstitutable salva veritate (saving truth).
This seems, however, not to hold generally, as Frege (1892) famously observed. For suppose that Galileo believes that the earth and Ertha are distinct. Then even if the earth is Ertha and Galileo believes that the earth moves,
(8) Galileo believes that Ertha moves.
is false, or so it might seem. (Some Russellians deny this; see, for example, Salmon 1986.) Such apparent substitution failures are widely known as Frege cases. Frege thought that they cast doubt on the Russellian view of propositional constituents. For if propositional attitudes are relations between subjects and propositions, he reasoned, then provided that the type of attitude ascribed to Galileo is the same (belief, in the running example), there must be a difference in the proposition which accounts for the difference in the truth value of these reports.
Frege cases are related to another puzzle discussed by Frege (1892), widely known as the puzzle of cognitive significance. To take a widely used example from Frege, while the Babylonians would have found
(9) Hesperus is Hesperus.
and
(10) Phosphorus is Phosphorus.
as trivial as anyone else, it would have come as a surprise to them that
(11) Hesperus is Phosphorus.
Establishing the truth of (11) was a non-trivial astronomical discovery. It turns out, contrary to what the Babylonians believed, that Hesperus, the heavenly body which shines in the evening, and Phosphorus, the heavenly body which shines in the morning, are one and the same—not distinct stars, as the Babylonians believed, but the planet Venus. Yet, if (11) is true, and “Hesperus” and “Phosphorus” are two names for one and the same object, there is a sense in which (9), (10), and (11) all say the same thing. Therefore, an explanation of the fact that (9) and (10) are trivial while (11) is cognitively significant seems to be owed.
One possible explanation for the difference in cognitive significance is that we may attach distinct senses to distinct expressions, even if they are co-referring. If propositions are what we grasp when we understand sentences, then perhaps, Frege hypothesized, propositional constituents are not individuals and relations but senses. This is the so-called Fregean view of propositions.
In the first instance, senses are whatever difference accounts for the difference in cognitive significance between (9) and (10), on the one hand, and (11) on the other. More exactly, Frege suggested that we think of senses as ways of thinking about or modes of presenting what we are talking about which are associated with the expressions we use. For example, the sense associated with “Hesperus” by the Babylonians could be at least roughly captured with the description “the star that shines in the evening” and the sense associated with “Phosphorus” by the Babylonians could be at least roughly captured with the description “the star that shines in the morning”. (For further discussion, see the articles on Frege’s philosophy of language and Frege’s Problem.)
Taking this idea on board, and compatibly with the principle that co-referring terms are intersubstitutable salva veritate, Frege suggested that we might account for substitution data like the above by positing a systematic shift in the referents of the expressions embedded in the scope of an attitude verb like "to believe"—in particular, a shift from customary referent to sense. On this view, for example, the referent of "Hesperus", when embedded in
(12) Bab believes that Hesperus shines in the morning.
is not Hesperus (that is, Venus) but the sense of “Hesperus”. Similarly, the semantic contribution of the predicate “shines in the morning” would be the sense of that expression, not the property of shining in the morning. Putting these senses together, we have the proposition (or thought) expressed by “Hesperus shines in the morning”, the referent of the that-clause “that Hesperus shines in the morning”. This way we can see how (12) and
(13) Bab believes that Phosphorus shines in the morning.
may have opposite truth values, even if (11) is true.
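The mechanics of the reference-shift proposal can be sketched schematically. In the hypothetical model below (the lexicon and its two-component entries are invented for illustration), the co-referring names differ in sense, and belief contexts are keyed to senses, so substitution inside the that-clause can change truth value even though (11) is true:

```python
# Hypothetical sketch of Frege's reference-shift idea; the lexicon and
# encoding are invented for illustration.

lexicon = {
    "Hesperus":   {"sense": "the star that shines in the evening", "referent": "Venus"},
    "Phosphorus": {"sense": "the star that shines in the morning", "referent": "Venus"},
}

# Outside attitude contexts, only reference matters, so (11) comes out true:
print(lexicon["Hesperus"]["referent"] == lexicon["Phosphorus"]["referent"])  # True

# Inside a belief context, a name contributes its customary *sense*;
# Bab's belief store is keyed to senses, not referents:
bab_believes = {("the star that shines in the morning", "shines in the morning")}

def bab_believes_shines_in_morning(name):
    return (lexicon[name]["sense"], "shines in the morning") in bab_believes

print(bab_believes_shines_in_morning("Phosphorus"))  # True: report (13)
print(bab_believes_shines_in_morning("Hesperus"))    # False: report (12)
```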
Employing his theory of descriptions, Russell (1905) offered a different solution which is compatible with accepting the Russellian view of propositions. Still other solutions have been proposed, motivated by a variety of Frege cases. Common to almost all positions in this literature is the assumption that attitudes are relations between subjects and certain objects. Not all of the proposed solutions, however, take propositions to be structured objects (see, for example, Stalnaker 1984). In fact, not all of the proposed solutions take the objects to be propositions. Some instead propose different proposition-like objects, including: natural language sentences, mental sentences, interpreted logical forms, and interpreted utterance forms (see, for example, Carnap 1947, Fodor 1975, Larson and Ludlow 1993, and Matthews 2007). On such views, the expression “propositional attitudes” turns out to be something of a misnomer, as they are not, strictly speaking, attitudes toward propositions.
c. A Challenge to the Received View of the Logical Form of Attitude Reports
It should be noted that the received view of the logical form of propositional attitude reports discussed in Section 3a has not gone unchallenged. Much recent work in this area has been motivated by a renewed attention to a puzzle known as Prior’s Substitution Puzzle (after Arthur Prior, who is often credited with introducing the puzzle in his 1971 book).
If “that the earth moves” refers to the proposition that the earth moves, then assuming (as is common) that co-referring terms are intersubstitutable salva veritate, we should expect that substituting “the proposition that the earth moves” for “that the earth moves” in (2) will not change the sentence’s truth value. But here is the result:
(14) Bellarmine fears the proposition that the earth moves.
It seems clear that one may fear that the earth moves without fearing any propositions. We could give up the commonly held substitution principle, but this would be a last resort.
A natural thought is that the problem is peculiar to fear, but the problem is seen with many other attitudes besides. Take (3), for example, and perform the substitution. The result:
(15) Sara hopes the proposition that Galileo is right.
Clearly, something has gone wrong; for (15) is not even grammatical.
At this point, one might begin to question whether propositions are in fact the objects of the attitudes. However, it appears that none of the available alternatives to propositions will do:
Bellarmine fears the {proposition/(mental) sentence/interpreted logical form…} that the earth moves.
Some proponents of the received view of the logical form of attitude reports have provided responses to this problem which are compatible with maintaining the received view (see, for example, King 2002, Schiffer 2003). Others argue that the received view must be abandoned, and on some of these alternative views, that-clauses are not singular referring terms but predicates (see, for example, Moltmann 2017).
Insofar as one accepts the reading off method, different views of the logical form of attitude reports may lead to different views of the metaphysics of the propositional attitudes. Of course, not everyone accepts this method. For some general challenges to the method, see, for example, Chomsky (1981, 1992); for challenges to the method specifically as it applies to theorizing about propositional attitudes, see, for example, Matthews (2007).
4. Folk Psychology and the Realism/Eliminativism Debate
This section discusses how propositional attitudes figure in folk psychology and how the success or lack of success of folk psychology has figured in debates about the reality of propositional attitudes.
a. Folk Psychology as a Theory (Theory-Theory)
The term “folk psychology” is sometimes used to refer to our everyday practice of explaining, predicting, and rationalizing one another and ourselves as minded agents in terms of the attitudes (and other mental constructs, such as sensations, moods, and so forth). Sometimes, it is more specifically used to refer to a particular understanding of this practice, according to which this practice deploys a theory, also referred to (somewhat confusingly) as “folk psychology”. This theory about the practice of folk psychology is sometimes referred to as the Theory-Theory (TT). Wilfrid Sellars (1956) is often credited with providing the first articulation of TT, and Adam Morton (1980) with coining the term.
The idea that folk psychology deploys a theory immediately raises the question of what a theory is. At the time TT was introduced, the dominant view of scientific theories in the philosophy of science was that theories are bodies of laws, that is, sets of counterfactual-supporting generalizations (see Section 2c), generally codifiable in the form:
If___, then ___.
where the first blank is filled by a description of antecedent conditions, and the second blank is filled by a description of consequent conditions. If the law is true, then if the described antecedent conditions obtain, the described consequent conditions will obtain. Thus, the law issues in a prediction and thereby gives us an explanation of the conditions described in its consequent—to wit, a causal explanation, the one condition (event, state of affairs) being the cause of the other.
This idea in turn raises the question of what the laws of folk psychology, understood as a theory (FP, for short), are supposed to be. The following example was provided in Section 2c:
If x wants that P, and x believes that not-P unless Q, and x believes that x can bring it about that Q, then (ceteris paribus) x tries to bring it about that Q.
And another example is the following (see Carruthers 1996):
If x has formed an intention to bring it about that P when R, and x believes that R, then x will act so as to bring it about that P.
Additional laws take a similar form. It is acknowledged that they all admit of exceptions; but this, it is argued, does not undermine their status as laws: after all, all the laws of the special sciences have exceptions (again, see discussion in 2c).
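To make vivid what tacitly “deploying” such a law might amount to, here is a minimal sketch in which the first law above is used to generate a prediction. The encoding of the agent’s states as sentence-like strings, and the ceteris paribus flag, are simplifying assumptions made for the example; they are not a serious proposal about mental representation.

```python
# A minimal sketch of applying the first law above to predict behavior.
from dataclasses import dataclass, field

@dataclass
class Agent:
    wants: set = field(default_factory=set)     # propositions P that x wants
    believes: set = field(default_factory=set)  # propositions that x believes

def predict_tries(x, P, Q, ceteris_paribus=True):
    """If x wants that P, believes that not-P unless Q, and believes that
    x can bring it about that Q, then (ceteris paribus) predict that
    x tries to bring it about that Q."""
    return (P in x.wants
            and f"not-{P} unless {Q}" in x.believes
            and f"I can bring it about that {Q}" in x.believes
            and ceteris_paribus)

P = "I lecture in Arizona on Tuesday"
Q = "I catch the 3 p.m. flight"
x = Agent(wants={P},
          believes={f"not-{P} unless {Q}", f"I can bring it about that {Q}"})

print(predict_tries(x, P, Q))  # True: predict that x tries to catch the flight
```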
Given our competence as folk psychologists, one might expect many more such laws to be readily formalizable. Perhaps surprisingly, however, only a few more putative laws of FP have ever been presented (but see Porot and Mandelbaum 2021 for a report on some recent progress). Considering how rich and sophisticated folk psychology is (we are, after all, remarkably complex beings), one would expect FP to be a very detailed theory, and one might think that the relative dearth of explicitly articulated laws casts doubt on the idea that folk psychology in fact employs a theory. The line that proponents of TT take is that the laws are implicitly or tacitly known and need not be explicitly entertained or represented when the theory is deployed. This position is not ad hoc, since a similar view is taken in many domains of the cognitive sciences—for example, in Chomskyan linguistics, which aims to provide an explicit articulation of natural language grammars. We are all competent speakers of a natural language and so must have mastered the grammar of that language. Manifestly, however, it is quite another thing to have an explicitly articulated grammar of the language in hand. We speak and comprehend our natural languages effortlessly, but coming up with an adequate grammar is devilishly difficult. The same may be true of our competence as folk psychologists.
It is generally agreed that the core of FP would comprise those laws concerning the attitudes (including, for example, the above laws). The key terms of the theory—its theoretical terms—would thus include “belief”, “desire”, “hope”, “fear”, and so forth, or their cognates. If FP is indeed a successful theory, and this is not a miraculous coincidence, then we have reason to think that its theoretical terms succeed in referring—that is, we have reason to think that there are beliefs, desires, hopes, fears, and the rest. If, however, FP is not a successful theory, then the attitudes may have to go the way of the luminiferous aether, phlogiston, and other theoretical posits of abandoned theories.
This understanding of folk psychology and the stakes at hand provide the shared background to the realism/eliminativism debate.
b. Realism vs. Eliminativism
The realist position is represented most forcefully and influentially by Fodor (whose position is described in Section 2c). Fodor takes the success of FP, considered independently of the cognitive sciences, to be obvious. Here is a typical passage from his (1987) book:
Commonsense psychology works so well it disappears… Someone I don’t know phones me at my office in New York from—as it might be—Arizona. ‘Would you like to lecture here next Tuesday?’ are the words that he utters. ‘Yes, thank you. I’ll be at your airport on the 3 p.m. flight’ are the words that I reply. That’s all that happens, but it’s more than enough; the rest of the burden of predicting behavior—of bridging the gap between utterances and actions—is routinely taken up by theory. And the theory works so well that several days later (or weeks later, or months later, or years later; you can vary the example to taste) and several thousand miles away, there I am at the airport, and there he is to meet me. Or if I don’t turn up, it’s less likely that the theory has failed than that something went wrong with the airline. It’s not possible to say, in quantitative terms, just how successfully commonsense psychology allows us to coordinate our behaviors. But I have the impression that we manage pretty well with one another; often rather better than we cope with less complex machines. (p. 3)
In fact, he adds: “If we could do that well with predicting the weather, no one would ever get his feet wet; and yet the etiology of the weather must surely be child’s play compared with the causes of behavior.” (p. 4)
What is more, he argues, signs are that the cognitive sciences—computational psychology, in particular—will vindicate FP by giving its theoretical posits pride of place (see also Fodor 1975). Eliminativists, of course, have a very different view.
Perhaps the most widely discussed and influential argument, or set of arguments, against the realist position and in favor of eliminativism is set forth by Paul Churchland in his (1981) essay “Eliminative Materialism and the Propositional Attitudes.” There, the eliminativist thesis is stated as follows:
Eliminative Materialism is the thesis that our commonsense conception of psychological phenomena constitutes a radically false theory, a theory so fundamentally defective that both the principles and the ontology of that theory will eventually be displaced, rather than smoothly reduced, by completed neuroscience. (p. 67)
He continues:
Our mutual understanding and even our introspection may then be reconstituted within the conceptual framework of completed neuroscience, a theory we may expect to be more powerful by far than the common-sense psychology it displaces, and more substantially integrated within physical science generally. (ibid.)
Churchland argues not only that FP will be shown to be false, but that it will be eliminated—that is, replaced by a more exact and encompassing theory, in terms of which we may then reconceptualize ourselves.
Whereas Churchland welcomes the prospect, Fodor (1990) has this to say:
If it isn’t literally true that my wanting is causally responsible for my reaching, and my itching is causally responsible for my scratching, and my believing is causally responsible for my saying… If none of that is literally true, then practically everything I believe about anything is false and it’s the end of the world. (p. 156)
This might strike some as hyperbolic at first. However, the eliminativist thesis seems to presuppose the very thing it denies; after all, is not the assertion of the thesis itself an expression of belief? So, must Churchland not believe that there are no beliefs? (See Baker 1987.)
Churchland offers three main arguments for his eliminative view. The first is that FP does not explain a wide range of mental phenomena, including “the nature and dynamics of mental illness, the faculty of creative imagination…the nature and psychological function of sleep…the rich variety of perceptual illusions” (1981, p. 73), and so on. The second is that, unlike other theories, folk psychology seems resistant to change, has not shown any development, is “stagnant”. The third is that the kinds of folk psychology (belief, desire, and so on) show no promise of reducing to, or being identified with, the kinds of cognitive science—indeed, no promise of cohering with theories in the physical sciences more generally.
A number of responses have been provided by those who take FP to be a successful theory. Regarding the first argument, one might simply reply that the theory is successful when applied to phenomena within its explanatory scope; FP need not be the theory of everything mental. Regarding the second, one might observe that a remarkably successful theory does not call for revision. The third argument is the strongest. However, at the beginning of the 21st century (and all the more so in the last decades of the 20th, when this debate was an especially hot topic) it turns on little more than an empirical bet, about which there can be reasonable disagreement. Many theorists who appeal to the cognitive sciences in advancing eliminativism appeal in particular to developments in the connectionist paradigm or to other developments in lower-level computational neuroscience (see, besides Churchland 1981, Churchland 1986, Stich 1983, Ramsey et al. 1990), the empirical adequacy of which has been a subject of debate—particularly when it comes to explaining higher-level mental capacities, such as the capacities to produce and comprehend language, which are centrally implicated in folk psychology. (For more on this, see the article on the language of thought.)
There are many other responses to eliminativist arguments besides these, including some which involve rejecting TT (see Section 4c). If folk psychology does not involve a theory, then it cannot involve a false theory; and by the same token, then, beliefs, desires, and the rest cannot be written off as empty posits of a false theory. Even among those who accept that folk psychology involves a theory though, some might reject the idea that the falsity of the theory (namely, FP) entails the nonexistence of beliefs, desires, and the rest.
Stich (1996), a one-time prominent eliminativist, later came to suggest (following Lycan 1988) that the general form of the eliminativist argument—
(1) Attitudes are posits of a theory, namely FP;
(2) FP is defective;
(3) So, attitudes do not exist.
—is enthymematic, if it is not invalid. The suppressed premise Stich identifies is an assumption about the nature of theoretical terms, according to which their meanings are fixed by their relations with other theoretical terms in the theory in which they are embedded (see Lewis 1970, 1972). In other words, a form of descriptivism, according to which the meanings of terms are fixed by associated descriptions, is assumed. On this view, for example, the meaning of “water” is fixed by such descriptions as that “water falls from the sky, fills lakes and oceans, is odorless, colorless, potable, and so forth”. Water, in other words, just is whatever uniquely satisfies these descriptions. Similarly, then, beliefs would be those mental states which are, say, semantically evaluable, have causal powers, a mind-to-world direction-of-fit, and so forth—or in brief, those states of which the laws of FP featuring the term “belief” or its cognates are true. If it turns out that nothing satisfies the relevant descriptions, there are no beliefs. This works the same for desires and the rest. However, one might well reject descriptivism and so block this implication. In fact, the dominant view at the beginning of the 21st century in the theory of reference is not descriptivism but the causal-historical theory of reference.
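In rough outline, and suppressing refinements concerning near-realization and multiple realization, the Lewis-style recipe assumed by the suppressed premise can be put schematically as follows. Let T(t1, …, tn) be the conjunction of FP’s laws, with t1, …, tn its theoretical terms (“belief”, “desire”, and so on). Then:

\[
\text{Ramsey sentence of FP:}\quad \exists x_1 \cdots \exists x_n\, T(x_1, \ldots, x_n)
\]
\[
\text{Definition of the } i\text{-th term:}\quad t_i =_{\mathrm{df}} \text{the } x_i \text{ of the unique } \langle x_1, \ldots, x_n \rangle \text{ that satisfies } T(x_1, \ldots, x_n)
\]

If nothing realizes T (so that the Ramsey sentence is false), or if nothing uniquely does, the definite descriptions are empty and FP’s theoretical terms fail to refer; there are, then, no beliefs or desires. This is the step that the causal-historical theory, discussed next, is meant to block.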
According to the causal-historical theory of reference (owing principally to the work of Saul Kripke and Hilary Putnam), the referent of a term is fixed by an original baptism (typically involving a causal connection to the referent), with later uses of the term depending for their referential success on a chain of uses tracing back to that original baptism. Such a view allows for the possibility that we can succeed in referring to things even when we have very mistaken views about them, as indeed seems possible to many. For example, it seems right that the ancients succeeded in referring to the stars, despite having very mistaken views about what stars are. Similarly, then, the idea goes, if the causal-historical theory is correct, we may succeed in referring to propositional attitudes even if we have very mistaken views about their nature, and so even if FP is defective.
But in fact, Stich’s (1996) skepticism about the eliminativist argument goes even deeper than this, extending to the very method of truth or reading off method in metaphysics (what Stich calls “the strategy of semantic ascent”). It is not clear, he argues, what is required of a theory of reference, or whether there might be such a thing as the correct theory of reference. After all, descriptivism might seem to better accord with cases where we do reject the posits of rejected theories—the luminiferous aether, phlogiston, and so on.
One idea might be to have a close look at historical cases in which theoretical posits were retained despite theory change, and at cases in which theory change led to a change in ontology, to see if we can uncover implicit general principles for deciding between (a) “we were mistaken about Xs” and (b) “Xs do not exist”. But Stich (1996) despairs of the prospects:
It is entirely possible that there simply are no normative principles of ontological reasoning to be found, or at least none that are strong enough and comprehensive enough to specify what we should conclude if the [premises] of the eliminativist’s arguments are true. (p. 66-7)
Moreover, he continues:
In some cases it might turn out that the outcome was heavily influenced by the personalities of the people involved or by social and political factors in the relevant scientific community or in the wider society in which the scientific community is embedded. (p. 67)
If this is correct, then it is indeterminate whether there are propositional attitudes, and it will remain indeterminate “until the political negotiations that are central to the decision have been resolved” (ibid., p. 72).
This gives us a very different view of the stakes at hand, as the reality of the attitudes no longer appears to be a question of what is the case independently of human interests and purposes. The question is, as Stich puts it, political. But this view, which is a sort of social constructivism, is controversial—even if some of the main lines of thought leading to this view have less controversial roots in pragmatism.
c. Alternatives to Theory-Theory
As noted in Section 4b, not every theorist of folk psychology accepts TT. One of the main alternatives to TT, which was developed against the backdrop of the realism/eliminativism debate, is the simulation theory (ST). According to this view, we do not deploy a theory in explaining, predicting, and rationalizing one another (that is, in brief, in practicing folk psychology). Instead, what we do is to simulate the mental states of others, put ourselves in their mental shoes, or take on their perspective (Heal 1986, Gordon 1986, Goldman 1989, 1993, 2006). Different theorists spell out the details differently, but there is a common core, helpfully summarized by Weiskopf and Adams (2015):
In trying to decide what someone thinks, we imaginatively generate perceptual inputs corresponding to what we think their own perceptual situation is like. That is, we try to imagine how the world looks from their perspective. Then we run our attitude-generating mechanisms offline, quarantining the results in a mental workspace the contents of which are treated as if they belonged to our target. These attitudes are used to generate further intentions, which can then be treated as predictions of what the target will do in these circumstances. Finally, explaining observed actions can be treated as a sort of analysis-by-synthesis process in which we seek to imagine the right sorts of input conditions that would lead to attitudes which, in turn, produce the behavior in question. These are then hypothesized to be the explanation of the target’s action. (p. 227)
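Schematically, the routine described in this passage might be rendered as follows. The decision mechanism, its inputs, and its output are placeholders invented purely for illustration; this is a sketch of the shape of simulation, not a model any ST theorist has proposed.

```python
# A schematic sketch of the offline-simulation routine described above.

def my_decision_mechanism(percepts, background):
    """The predictor's own attitude-generating mechanism. Run online,
    its output would drive the predictor's own action; run offline,
    the output is quarantined and attributed to the target."""
    if "rain clouds overhead" in percepts and "dislikes getting wet" in background:
        return "take an umbrella"
    return "go out as is"

def simulate(target_percepts, target_background):
    # Feed pretend inputs (the target's imagined situation) into one's
    # own mechanism, run offline.
    pretend_output = my_decision_mechanism(target_percepts, target_background)
    # Quarantine the result: treat it as the target's, not one's own.
    return {"predicted action of target": pretend_output}

print(simulate({"rain clouds overhead"}, {"dislikes getting wet"}))
# {'predicted action of target': 'take an umbrella'}
```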
One perceived advantage of this view, insofar as folk psychology is thought not to involve a theory, is that the attitudes (and other mental constructs) appear to be immune to elimination. Of course, just what folk psychology is or involves is itself an empirical question, and in particular a question for psychology. For a sustained empirical defense of ST, pointing to paired deficits and neuroimaging, among other lines of evidence, see Goldman 2006.
Above, TT and ST were described as alternatives. Indeed, ST was initially developed as an alternative to TT. However, subsequent theorists developed a number of hybrid views, suggesting that we both simulate and theorize, depending on how similar or dissimilar the targets of explanation are (Nichols and Stich 2003). In fact, it might be wondered why simulation could not just be seen as our way of applying a theory. After all, it is not claimed by the proponents of TT that the theory is consciously entertained; it is instead tacitly known. It could be, the suggestion goes, that simulation is what application of the theory looks like from the conscious, personal level (see Crane 2003). Whether this view is correct is, however, an empirical question. For more on the empirical literature, see the article on the theory of mind.
Perhaps unsurprisingly, there are many other views of folk psychology besides TT, ST, and the hybrid theories. Another widely discussed class of views goes under the label interpretationism, with the key theorists here being, among others, Davidson and Dennett. We focus on Dennett’s articulation.
Whereas Fodor and Churchland agree that if propositional attitude reports are true, they are made true by the presence of certain causally efficacious and semantically evaluable internal states, Dennett demurs. There might well not be such states as Fodor believes there are, as Churchland argues; but pace Churchland, Dennett thinks that this would not impugn our folk psychological practices or call into question the truth of our propositional attitude reports. In providing a folk psychological explanation, we adopt what Dennett (1981 [1998], 1987 [1998]) calls the “intentional strategy” or “intentional stance”, in which we treat objects or systems of interest as if they are rational agents with beliefs and desires and other mental states. According to Dennett (1981 [1998]):
Any object—or…any system—whose behavior is well-predicted by this strategy is in the fullest sense of the word a believer. What it is to be a true believer is to be an intentional system, a system whose behavior is reliably and voluminously predictable via the intentional strategy. (p. 15)
This view emphasizes that:
all there is to being a true believer is being a system whose behavior is reliably predictable via the intentional strategy, and hence all there is to really and truly believing that p (for any proposition p) is being an intentional system for which p occurs as a belief in the best (most predictive) interpretation. (p. 29)
This gives us what Dennett characterizes as a “milder sort of realism”.
One obvious objection to this view is that it is not just the systems we intuitively take to be intentional that are reliably predictable from the intentional stance. Thermostats, rocks, stars, and so forth are also reliably predictable from the intentional stance. Dennett’s response to this is to observe that taking the intentional stance toward the latter objects is gratuitous; the “physical” or “design” stances suffice. In the case of the systems we intuitively take to be intentional, by contrast, the intentional stance is practically indispensable. This response has the curious result that whether a system is intentional is relative to who is trying to predict the behavior of the system. Presumably, Laplace’s Demon, capable of predicting any future state from any prior state under purely physical descriptions, would have no need for the intentional stance. For this reason, many have interpreted the view as a form of anti-realism, albeit not an eliminativist form of anti-realism: there are no attitudes, but it is useful to speak as if there were. For a view similar to Dennett’s in a number of respects but with a more decidedly realist slant, see Lynne Rudder Baker’s (1995) exposition of a view she calls “practical realism”.
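A toy rendering of the strategy may help fix ideas, with a thermostat standing in for the predicted system. The attributed states, the set point, and the rationality rule below are invented for the example; Dennett offers no such formalization.

```python
# A toy rendering of the intentional strategy: attribute the beliefs and
# desires the system "ought to have" and predict that it acts rationally.

def intentional_stance_predict(room_temp_c, set_point_c=20.0):
    beliefs = {"the room is colder than the set point": room_temp_c < set_point_c}
    desires = {"the room is at the set point"}
    # Rationality assumption: the system acts so as to satisfy its
    # desires, given its beliefs.
    if beliefs["the room is colder than the set point"] and \
       "the room is at the set point" in desires:
        return "turn the heating on"
    return "do nothing"

print(intentional_stance_predict(17.5))  # 'turn the heating on'
print(intentional_stance_predict(22.0))  # 'do nothing'
```

As Dennett’s response to the objection suggests, the same predictions could here be had more cheaply from the design stance; on his view, the intentional stance earns its keep only where such shortcuts give out.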
Common to all the views of folk psychology mentioned above is the idea that folk psychology is for explaining and predicting. Such is the dominant view in the early 21st century. But it should be mentioned that on other views, folk psychology is not so much for reading minds as it is for shaping minds. It is, in other words, a normative-regulative system. For developments of this line, see, for example, Hutto 2008, McGeer 2007, Zawidzki 2013.
5. More on Particular Propositional Attitudes, or on Related Phenomena
Much of the foregoing has concerned belief and desire, as they are generally regarded to be the paradigmatic propositional attitudes, and especially belief, as is customary in much of the philosophy of mind and language. However, it must be noted that not every attitude fits easily under the general characterization of the attitudes discussed in Section 1, which divides the attitudes into the belief-like and desire-like. For example, imagining (make-believing) and entertaining, as well as delusions of various kinds, pose difficulties for this general classification, as they do not quite match either direction of fit associated with belief and desire. Other controversial candidate propositional attitudes, even if clear in their direction of fit, include judging, knowing, perceiving, and intending. What follows is a brief discussion of each.
a. Imagining
Typically, one does not imagine what is (or, what one takes to be) the case. One imagines what is not (or, what one takes not to be) the case. One imagines, for example, what the world will be like in 2100 if climate change is not meaningfully addressed. One imagines what it was like (or might have been like) to be a prehistoric hunter-gatherer. One imagines what the universe would have been like if one or another of the fundamental physical constants had been different. One imagines (or tries to imagine) what it would be like to be a bat, or to see a ripe tomato for the first time, or to taste Vegemite. One imagines that a particular banana is a telephone. One imagines flying by flapping one’s arms. One imagines that one’s best friend is a fairy. One imagines a weight and feather dropped from the Leaning Tower of Pisa. One imagines a replica of the Leaning Tower of Pisa made of Legos. One imagines listening to Charlie Parker live.
These examples seem to illustrate different kinds of imaginings, or imaginings with different kinds of contents: imagining an event or state of affairs, or that such and such is the case; imagining objects and experiences; counterfactual imaginings; and imaginings of the past and future. Whatever more we can say of them, these are, plausibly, attitudes of a kind. Moreover, they seem relevant to everything from philosophical and scientific thought experiments and the enjoyment of fiction to pretending and childhood make-believe. As such, imagining is of interest to a range of areas of research, including meta-philosophy, philosophy of science, aesthetics, and philosophy of mind.
The question that interests us here is whether imagining is a propositional attitude. Imagining that such and such is the case (that the banana is a telephone, that one can fly by flapping one’s arms, that one’s best friend is a fairy) seems very plausibly to be a kind of propositional attitude. However, it does not quite seem to be either a belief-like or a desire-like propositional attitude. Beliefs, remember (see Section 1), have a mind-to-world direction of fit; they aim at the truth, or at what is the case—that is, they aim to bring the content of the belief in accord with the world. Desires, by contrast, have a world-to-mind direction of fit; they aim to bring the world in accord with their contents. While imaginings are, like desires, typically about what is not the case, in imagining one does not aim to bring the world in accord with the content of the imagining.
If anything, imagining’s direction of fit is that of belief, though again its aim is not to get the world right. (The use of the expression “make-believe” is suggestive.) Perhaps imagining might be said to have mind-to-counterfactual-world direction of fit; though in this case it still might not be clear what ‘getting it right’ amounts to. Relatedly, although beliefs tend to cohere, there seems to be no requirement that imaginings cohere with one another and with what one believes (except, perhaps, to the extent that they presuppose some beliefs; for example, if you are to imagine an elephant, you must presumably have beliefs about what they look like). Another difference between imagination and belief is seen when we consider its relation to the will. It is widely agreed that we cannot directly decide what to believe; in other words, we do not have direct voluntary control of our beliefs, though we may have indirect control. The same, in fact, may be true of desire. By contrast, we seem to have direct voluntary control of at least some of our imaginings. (There do also seem to be involuntary imaginings.)
It could be that what might be called propositional imagining and believing lie on a continuum (Schellenberg 2013). Perhaps, in fact, there are some mental states which are something of a combination of the two (Egan 2009). It seems that much the same might be said of a number of other attitudes or other attitude-like phenomena, including entertaining, supposing, conceiving, and the like.
Imaginings of objects and experiences, by contrast, are perhaps not as obviously propositional attitudes. Nonetheless, such imaginings play an important role in our mental lives. Traditionally, they have been thought to involve images, taken to be rather like pictures (in the case of visual imagery), only in the mind. But the nature of such images has been the subject of much debate. For more on this, see the article on imagery and imagination.
b. Judging
Judging—though it often served as the focus of theorizing in Frege and Russell—might be better construed as an action or act (and so an event), rather than a state. Accordingly, some theorists distinguish between propositional attitudes and acts, where acts include not only judging, but also saying, asserting, entertaining, and hypothesizing. However, most theorists, it should be noted, do not draw this distinction. In fact, some identify occurrent beliefs and judgments, as Russell seems to do (see also, for example, Lycan 1988). The distinction may, however, be important to some views. For example, Hanks (2011, 2015) and Soames (2010, 2014) argue that propositional attitudes, like believing, are partly constituted by dispositions to perform certain propositional acts, like entertaining and judging; and that the contents of propositional attitudes can be identified with types of such acts. A primary motivation for this view is that it can constitute a solution to the problem of the cognitive accessibility of propositions, which the classical view (discussed in Section 2) gives rise to.
c. Knowing
Epistemologists distinguish between knowledge-that, knowledge-how, and knowledge-wh (which includes knowledge-who, -what, -when, -whether, and -why). Examples include knowing that Goldbach’s conjecture has not been proven, knowing how to construct a truth-table, and knowing who Glenn Gould is. (Knowing your uncle might be an example of yet another kind of knowledge. Compare the contrast between kennen and wissen in German, or connaître and savoir in French. Related to this is the distinction drawn by Russell between knowledge by acquaintance and knowledge by description.) Most theorists view knowledge-wh as a species of knowledge-that. On a commonly held view, knowledge-that involves propositions as objects: for example, the proposition that Goldbach’s conjecture has not been proven. Famously, Ryle (1946, 1949) argued that knowledge-how is an ability, construed as a complex of dispositions, and is not reducible to knowledge-that (characterized in the above manner). Others argue that the one is reducible to the other. For example, Stanley and Williamson (2001) argue that knowledge-how reduces to knowledge-that. While some argue that knowing-that is a sui generis propositional attitude (Williamson 2000), a more traditional view is that such states of knowledge are just belief states which meet certain additional extra-mental conditions (most obviously, being true and justified). Similar remarks go for other factive attitudes, including recognizing and discovering.
d. Perceiving
The question of whether perceiving, and seeing in particular, is a propositional attitude has inspired a voluminous literature in the epistemology of perception. According to certain classical views, for example sense-data theory, propositional content does not belong to perceptual states, which cannot therefore be propositional attitudes, but instead to the judgments or beliefs they occasion. On this view, perceptual states themselves, insofar as they are conscious, are composed of just raw feels or qualia; for example, the redness of a visual experience of a ripe tomato. Other theories hold that perceptual states have a kind of non-conceptual content (Evans 1982, Peacocke 2001, Crane 2009), while still others maintain that they must have conceptual or propositional content if they are to justify perception-based beliefs (McDowell 1994). These positions touch on many vexing issues at the intersection of epistemology, philosophy of mind, and cognitive science. For further discussion, see the article on Cognitive Penetrability of Perception and Epistemic Justification, in addition to those already linked to.
e. Intending
When you trip, sneeze, or blink, these are in some weak sense things you do. However, they are not, in a philosophically significant sense, actions you perform; they are not manifestations of your agency. It is less misleading to say that they are things that happen to you, or events of which you are subject, not agent. To distinguish between the events of which you are subject and those of which you are agent, most philosophers point to the involvement of intention. There are, however, many difficulties here.
To begin with, the way in which intention is a unified phenomenon, if it is, is not straightforward. Clearly, we can intend to thread the needle and pin the tail on the donkey. We can intentionally thumb our noses, swim the channel, and cross our eyes. We can move our fingers in a certain way, with the intention to tie our shoes. At first blush, it might seem that we have three distinct phenomena here: intending to act (prospective intention), acting intentionally, and acting with an intention (intention with which). We also have the suspicion that these phenomena are ultimately unified. After all, we use the term “intention” and its cognates to describe them all. Attempting to explain this unity has animated much of the philosophy of action in the late 20th and early 21st century.
In her very influential work Intention, Elizabeth Anscombe—partly influenced by the later Wittgenstein—identified intending to Φ (where “Φ” stands in place of some action, for example, threading the needle) with Φ-ing intentionally and denied that intention is a mental state. This view has as a consequence that if, for example, you intend to visit Cairo in four years, you have already embarked. This consequence has struck many as implausible, though a view along these lines has been revived and given extensive defense by Michael Thompson (2008). Others, for example Davidson (1978), have claimed that one can intend to Φ even if one is not Φ-ing, intentionally or otherwise, and even if one is not currently performing any actions with the intention to Φ. In other words, there are cases of “pure intending” (for example, my intending to visit Cairo) and these are instances of prospective intention or intention for the future. Davidson also observed that the same intention may be present when the relevant actions are being performed, which indicates that intention, whether prospective or present in the course of action, is the basic phenomenon. To the question of what intention or intending is, Davidson’s answer is that intention is a mental state, and an attitude more specifically. (The Anscombe-type view is usually paired with an anti-causal view of folk psychological explanation, while the Davidsonian view is usually paired with a causal view. See Section 1.) However, if intention is an attitude, it is still to be determined what kind of attitude it is and what (if anything) its object is; and answers vary widely.
Some hold that intention is a species of desire (for example, to intend to Φ is to desire to Φ), others that it is a species of belief (for example, to intend to Φ is to believe that one ought to Φ, or will Φ, or is Φ-ing), and still others that it is some combination of desire and belief (for example, the desire to Φ and a belief about how to go about Φ-ing). Finally, some argue that intention is a sui generis state—for example, something like a plan which controls behavior (Bratman 1987). In any case, if intentions are attitudes of some sort, there remains the question of whether they are more specifically propositional attitudes.
Above, it was noted that some theorists have held that attitudes generally are propositional attitudes and so in some way involve propositions—usually, with propositions as objects of the attitudes, where these latter are construed as relations of some kind. Unlike with, say, belief reports, which are naturally worded with that-clauses like “that the earth moves”, as featured in reports like “Galileo believes that the earth moves”, reports of intentions are typically more naturally worded without that-clauses. (On that-clauses, see Section 3.) For example, we typically say things like “I intend to pay off my debts this year”, not “I intend that I will have paid off my debts this year” or “I intend that I should pay off my debts this year”. Likewise, we say “I intend to sleep”, not “I intend that I will be sleeping” or “I intend that I should be sleeping”. This might suggest that the objects of intentions (if there are any) are not propositions but acts, activities, processes, events, or the like—in which case, intentions would not, strictly speaking, be propositional attitudes.
Detailed discussions of intention are to be found primarily in metaethics and the philosophy of action, the latter of which rests at the intersection of metaethics and the philosophy of mind, though an adequate account of intention would be foundational to research in other areas as well, including, for example, the philosophy of language, and in particular intention-based accounts of meaning and communication.
f. Non-Propositional Attitudes
Above, it was noted that reports of intentions are often most naturally worded without that-clauses. As it happens, the same is true of at least some reports of desires—desires being one of our paradigmatic propositional attitudes. Consider, for example, a report like “The baby desires to be held”. Here, instead of a that-clause we have a non-finite complement, “to be held”, which seems to designate not a proposition but an action. Of course, we do not see this just with desire reports. Consider: “You want to visit Cairo”, “Everyone wishes to live happily ever after”, “When he does that, he means to insult you”. Moreover, we sometimes have general terms and proper names instead of clauses following the attitude verb. For example: “Molly desires cake”, “Maxine desires Molly”. Similarly, “Jack fears dogs”, “Ibram likes chocolate ice cream”, “Jill loves Xi”, and so forth. In fact, we even see this with some belief reports. For example: “Sara believes Galileo”. If reports with these forms suggest that at least some attitudes are not propositional attitudes, that is because we are using the form of reports to guide our views about the nature of the things reported on (Ben-Yami 1997)—that is, in other words, we are using the reading off method (see Section 3).
Montague (2007), Buchanan (2012), and Grzankowski (2012) think that the reading off method reveals that there are, in addition to propositional attitudes, non-propositional (or objectual) attitudes, as ascribed, for example, by reports like “Desdemona loves Cassio”. Of course, there are competing views about the form of these reports, with some arguing that, despite surface appearances, these reports have the same form as reports like “Galileo believes that the earth moves” (Larson 2002). Also, the legitimacy of this method of reading off the nature of what is reported from the form and interpretation of reports has been challenged (again, see Section 3). Moreover, some arguments for the thesis that at least certain putative propositional attitudes, desires for example, are in fact not propositional attitudes are based not on linguistic observations but on comparative psychological and neuroscientific evidence (Thagard 2006). It could be that we need a finer-grained taxonomy.
g. Delusions
According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), a delusion is “a false belief based on incorrect inference about external reality that is firmly sustained despite what almost everyone else believes and despite what constitutes incontrovertible and obvious proof or evidence to the contrary” (p. 819). In other words, according to the DSM-5, delusions are pathological beliefs. These include, for example, Cotard delusion, the delusion that one is dead, and Capgras delusion, the delusion that one’s significant other has been replaced by an imposter.
At first blush, it is plausible that these delusions are indeed beliefs, however pathological. After all, people who have these delusions do sincerely assert that they are dead or that their significant other has been replaced by an imposter, and sincere assertions are typically expressions of belief. Moreover, these delusions have beliefs’ direction of fit, although they are, by definition, false. In other respects, however, these and other delusions are unlike typical beliefs.
For one, they often do not cohere with the delusional subject’s other beliefs and behavior. For example, a person with Capgras delusion may continue to live with the “imposter” and make no effort to find the “real” significant other the imposter has supposedly replaced. Additionally, delusions are often not sensitive to evidence in the way beliefs typically are, with regard both to their formation and to their maintenance. For example, someone with Cotard delusion may have no perceptual evidence for the delusion and may maintain it despite overwhelming evidence to the contrary. In these respects, delusions might be closer to imaginings than to beliefs (Currie 2000). It might be best to say, instead, that delusions are a sui generis kind of propositional attitude, somewhat like belief and somewhat like imagining (Egan 2009). After all, it is often not the case that the delusion is completely severed from all non-verbal behavior and affect. Subjects with Cotard delusion, for example, may be very distressed by their delusion and may stop bathing or caring for themselves (Young and Leafhead 1996).
Delusions are not easy to place. On the one hand, reflection on delusions highlights respects in which many ordinary beliefs might be considered delusional in some respects. Many sincerely professed religious, political, and philosophical beliefs, for example, might fail to cohere with one’s other beliefs and behavior or to be sensitive to evidence in the way other beliefs are. It is also very common to falsely believe, for example, that one is better than average at driving, that one’s children are smarter than average, and so on. It could be that delusions are not qualitatively different from many typical beliefs. Perhaps, what marks off delusions from typical beliefs is the degree to which they depart from societal norms or norms of rationality.
On the other hand, the underlying mechanisms at work may be different. As we learn more about the brain, we are beginning to get plausible explanations of some delusions (for example Capgras) in terms of neuropathology. Perhaps not all delusions can be explained in this way. Some—for example, erotomania, the delusion that one is loved by someone, usually in a position of power or prestige—might be motivated in a way amenable to psychological explanations at the personal level, much as the typical beliefs noted above may be. This in turn may point up the fact that delusion and self-deception, a vexing and philosophically fraught topic in its own right, may overlap.
As their inclusion in the DSM would suggest, delusions are typically thought to be maladaptive—and certainly they often are. However, recent work by the philosopher Lisa Bortolotti (2015, 2020) and others points up the fact that delusions as defined above—and irrational beliefs more generally, including optimistic biases, distorted memory-based beliefs, and confabulated explanations—can be not only psychologically adaptive but even (in Bortolotti’s terminology) “epistemically innocent”: they can help one to secure epistemic goods one would not otherwise enjoy (as may be seen, Bortolotti argues, in cases of anosognosia, in which the subject is unable to understand or perceive their own illness or disorder). Epistemically innocent beliefs (including overestimations of our capacities) appear to be widespread in the non-clinical population—a fact which, if more widely appreciated, could modify potentially stigmatizing attitudes toward clinical cases. Work in this area may also have significance for treatment.
h. Implicit Bias
Recent psychological research in implicit social cognition suggests that people often make judgments and act in ways contrary to their professed beliefs—ways that evidence implicit bias, including gender, racial, and prestige bias. For example, changing only the name at the top of a CV from a stereotypically White name to a stereotypically Black name, or from a stereotypically male name to a stereotypically female name, results on average in lower evaluations of the CV (Bertrand and Mullainathan 2004, Moss-Racusin et al. 2012). In a striking 1982 study, Peters and Ceci took papers already published in top peer-reviewed journals, altered the names and institutional affiliations associated with the papers, and resubmitted them to the same journals: 89% were rejected (with only 8% being detected as resubmissions). Perhaps the most well-known and widely discussed measure of implicit bias is the Implicit Association Test (IAT), a reaction-time measure that requires subjects to sort images or words into categories as quickly and accurately as possible. Among the findings with this test are that subjects find it easier to sort stereotypically White names with positive words (like “morality” and “safe”) and stereotypically Black names with negative words (like “bad” and “murder”) than the other way round. Subjects are also more likely to mistake a harmless tool for a weapon when a Black face is flashed on the screen than when a White face is flashed (Payne 2001). Members of the stigmatized groups, it has been found, are not immune to these biases. Such findings raise many questions regarding the metaphysics of implicit bias and its epistemological ramifications (regarding self-knowledge, for example), as well as questions of ethics.
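In barest outline, an IAT-style measure compares response latencies across pairing conditions. The sketch below uses fabricated reaction times and a raw difference of means; the actual IAT employs a more involved scoring algorithm, so this is an illustration of the underlying idea only.

```python
# A highly simplified illustration of an IAT-style measure: compare mean
# reaction times on "congruent" vs. "incongruent" sorting blocks.
# The data are fabricated; real IAT scoring (the D-score) also involves
# error penalties and standardization, omitted here.
from statistics import mean

congruent_rts_ms = [640, 710, 655, 690, 675]    # pairing matching the stereotype
incongruent_rts_ms = [820, 790, 845, 805, 830]  # pairing against the stereotype

iat_effect_ms = mean(incongruent_rts_ms) - mean(congruent_rts_ms)
print(f"Mean slowdown on incongruent block: {iat_effect_ms:.0f} ms")  # 144 ms
```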
Given that we often read off people’s beliefs from their behavior, it is natural to count implicit biases as beliefs. On the other hand, as noted in the preceding subsection, it is often thought that beliefs generally cohere with one another and are responsive to reasons. They are also often accessible to consciousness. In these respects, implicit biases appear to be unlike beliefs. One may be utterly oblivious to one’s implicit biases. One may find these biases disturbing, abhorrent, or irrational. However, it appears that sincerely believing that the biases do not track facts about the world is not enough to get rid of them.
On the basis of considerations like these, Tamar Gendler (2008b) has proposed that implicit biases fall into a category distinct from beliefs, namely what she calls “aliefs”, which are “associative, automatic, and arational” and typically “affect-laden and action-generating” mental states (p. 557). An example of a less ethically charged alief would be alieving that the Grand Canyon Skywalk is unsafe (as manifest, for example, in trembling, clutching the handrails, shrieking), while firmly believing that it is safe (Gendler 2008a).
Some theorists have cast doubt on the idea that aliefs form a separate psychological kind (Nagel 2012, Mandelbaum 2013). Some argue that implicit measures like IAT and explicit measures (for example, surveys) alike measure beliefs; it is just that one’s implicit and explicit beliefs, which may be contradictory, are activated (that is, come online) in different contexts. In other words, we have compartmentalized or fragmented systems of belief (Lewis 1982, Egan 2008). Proponents of this kind of view generally adopt a computational-representational view of belief. On a dispositionalist view, beliefs are rather like traits; and just as one can be kind of a jerk and kind of not, in different respects and in different contexts, perhaps one can sort of believe that the Skywalk is safe and sort of not: one imperfectly realizes or exemplifies a certain dispositional profile or stereotype associated with the relevant beliefs. Perhaps the phenomenon of belief, in brief, allows for in-between cases (Schwitzgebel 2010).
There are still other proposals besides, and criticisms of each. The literature here is vast and growing, and the issues are many and complex (see, for example, the collections by Brownstein and Saul 2016a, 2016b). This subsection provides just a gesture at that literature.
The foregoing should give some indication of the significance of the propositional attitudes, and so theories thereof, to a great many intellectual projects, across various domains of philosophy and allied disciplines. It should also give an idea of how contested the positions are. It is only somewhat recently that much work has been devoted to discussion of the complexities of particular attitudes, or related phenomena, as opposed to the attitudes generally. Indeed, much of the general discussion of the attitudes has drawn on consideration of belief alone. The literature on particular attitudes, or related phenomena, such as the above, is growing rapidly. It is an exciting, interdisciplinary area of research. To be sure, there is here much more work to be done.
6. References and Further Reading
American Psychiatric Association. 2013. Diagnostic and Statistical Manual of Mental Disorders (5th ed.). Arlington, VA.
Anscombe, G.E.M. 1957 [1963]. Intention. Harvard University Press.
Baker, L.R. 1987. Saving Belief: A Critique of Physicalism. Princeton University Press.
Baker, L.R. 1995. Explaining Attitudes: A Practical Approach to the Mind. Cambridge University Press.
Benacerraf, P. 1973. Mathematical Truth. Journal of Philosophy 70 (19): 661-79.
Ben-Yami, H. 1997. Against Characterizing Mental States as Propositional Attitudes. Philosophical Quarterly 47 (186): 84-9.
Bertrand, M., and Mullainathan, S. 2004. Are Emily and Greg More Employable than Lakisha and Jamal? American Economic Review 94 (4): 991–1013.
Bortolotti, L. 2015. The Epistemic Innocence of Motivated Delusions. Consciousness and Cognition 33: 490-99.
Bortolotti, L. 2020. The Epistemic Innocence of Irrational Beliefs. Oxford University Press.
Braddon-Mitchell, D. and Jackson, F. 1996. The Philosophy of Mind: An Introduction. Blackwell.
Bratman, M. 1987. Intention, Plans, and Practical Reason. Harvard University Press.
Brownstein, M. and Saul, J. (eds.). 2016a. Implicit Bias and Philosophy: Volume 1, Metaphysics and Epistemology. Oxford University Press.
Brownstein, M. and Saul, J. (eds.). 2016b. Implicit Bias and Philosophy: Volume 2, Moral Responsibility, Structural Injustice, and Ethics. Oxford University Press.
Buchanan, R. 2012. Is Belief a Propositional Attitude? Philosophers’ Imprint 12 (1): 1-20.
Burge, T. 2010. Origins of Objectivity. Oxford University Press.
Camp, E. 2007. Thinking with Maps. Philosophical Perspectives 21 (1): 145-82.
Carnap, R. 1947 [1956]. Meaning and Necessity: A Study in Semantics and Modal Logic. The University of Chicago Press.
Carnap, R. 1959. Psychology in Physical Language. In A.J. Ayer (ed.). Logical Positivism. Free Press.
Carruthers, P. 1996. Simulation and Self-Knowledge: A Defence of Theory-Theory. In P. Carruthers and P. Smith (eds.), Theories of Theories of Mind. Cambridge University Press: 22-38.
Chisholm, R. 1957. Perceiving: A Philosophical Study. Cornell University Press.
Chomsky, N. 1980. Rules and Representations. Columbia University Press.
Chomsky, N. 1981. Lectures on Government and Binding. Mouton.
Chomsky, N. 1992. Explaining Language Use. Philosophical Topics 20 (1): 205-31.
Churchland, P.M. 1981. Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy 78: 67–90.
Churchland, P.S. 1986. Neurophilosophy: Toward a Unified Science of the Mind/Brain. MIT Press.
Crane, T. 2001. Elements of Mind: An Introduction to the Philosophy of Mind. Oxford University Press.
Crane, T. 2003. The Mechanical Mind: A Philosophical Introduction to Minds, Machines and Mental Representation. Routledge.
Crane, T. 2009. Is Perception a Propositional Attitude? Philosophical Quarterly 59 (236): 452-469.
Currie, G. 2000. Imagination, Delusion and Hallucination. Mind & Language 15 (1): 168-83.
Davidson, D. 1977. The Method of Truth in Metaphysics. In H. Wettstein et al. (eds.), Midwest Studies in Philosophy, II: Studies in Metaphysics. University of Minnesota Press.
Davidson, D. 1978. Intending. In Y. Yovel (ed.), Philosophy of History and Action. D. Reidel: 41-60.
Dennett, D. 1981. True believers: The Intentional Strategy and Why It Works. In A.F. Heath (Ed.), Scientific Explanation: Papers Based on Herbert Spencer Lectures given in the University of Oxford. Clarendon Press. [Reprinted in Dennett, D. 1987 [1998]. The Intentional Stance. MIT Press: 13-35]
Dennett, D. 1987 [1998]. The Intentional Stance. MIT Press.
Dennett, D. 1991. Real Patterns. The Journal of Philosophy 88 (1): 27-51.
Dretske, F. 1988. Explaining Behavior: Reasons in a World of Causes. MIT Press.
Dretske, F. 1989. Reasons and Causes. Philosophical Perspectives 3: 1-15.
Dretske, F. 1993. Mental Events as Structuring Causes of Behavior. In Mental Causation, J. Heil and A. Mele (eds.). Oxford University Press: 121-136.
Egan, A. 2008. Seeing and Believing: Perception, Belief Formation and the Divided Mind. Philosophical Studies 140 (1): 47–63.
Egan, A. 2009. Imagination, Delusion, and Self-Deception. In T. Bayne and J. Fernandez (eds.), Delusion and Self-Deception: Motivational and Affective Influences on Belief-Formation. Psychology Press: 263-80.
Evans, G. 1982. The Varieties of Reference. Oxford University Press.
Fodor, J. 1975. The Language of Thought. Harvard University Press.
Fodor, J. 1978. Propositional Attitudes. The Monist 61 (4): 501-23.
Fodor, J. 1987. Psychosemantics: The Problem of Meaning in The Philosophy of Mind. MIT Press.
Fodor, J. 1990. A Theory of Content and Other Essays. MIT Press.
Frege, G. 1892 [1997]. On Sinn and Bedeutung (trans. M. Black). In M. Beaney (ed.), The Frege Reader. Blackwell: 151-71.
Frege, G. 1918 [1997]. Thought (trans. P. Geach and R.H. Stoothoff). In M. Beaney (ed.), The Frege Reader. Blackwell: 325-45.
Geach, P. 1957. Mental Acts: Their Content and Their Objects. Routledge and Kegan Paul.
Gendler, T. 2008a. Alief and Belief. The Journal of Philosophy 105 (10): 634–63.
Gendler, T. 2008b. Alief in Action (and Reaction). Mind & Language 23 (5): 552–85.
Goldman, A. 1989. Interpretation Psychologized. Mind & Language 4 (3): 161-85.
Goldman, A. 1993. The Psychology of Folk Psychology. Behavioral and Brain Sciences 16 (1): 15-28.
Goldman, A. 2006. Simulating Minds. Oxford University Press.
Gordon, R. 1986. Folk Psychology as Simulation. Mind & Language 1 (2): 158-71.
Grzankowski, A. 2012. Not All Attitudes Are Propositional. European Journal of Philosophy 3: 374-91.
Hanks, P. 2011. Structured Propositions as Types. Mind 120 (477): 11-52.
Hanks, P. 2015. Propositional Content. Oxford University Press.
Harman, G. 1973. Thought. Princeton University Press.
Heal, J. 1986. Replication and Functionalism. In J. Butterfield (Ed.), Language, Mind, and Logic. Cambridge University Press: 135-50.
Humberstone, I.L. 1992. Direction of Fit. Mind 101 (401): 59-83.
Hutto, D. 2008. Folk Psychological Narratives. MIT Press.
King, J. 2002. Designating Propositions. Philosophical Review 111 (3): 341-71.
Kripke, S. 1972. Naming and Necessity. Harvard University Press.
Larson, R. 2002. The Grammar of Intensionality. In G. Preyer and G. Peter (eds.), Logical Form and Language. Oxford University Press: 228-62.
Larson, R. and Ludlow, P. 1993. Interpreted Logical Forms. Synthese 95 (3): 305-55.
Lewis, D. 1970. How to Define Theoretical Terms. Journal of Philosophy 67 (13): 427-46.
Lewis, D. 1972. Psychophysical and Theoretical Identifications. Australasian Journal of Philosophy 50 (3): 249-58.
Lewis, D. 1982. Logic for Equivocators. Noûs 16 (3): 431–441.
Lycan, W. 1988. Judgment and Justification. Cambridge University Press.
Mandelbaum, E. 2013. Against Alief. Philosophical Studies 165 (1): 197-211.
Marr, D. 1982. Vision. W.H. Freeman.
Matthews, R. 2007. The Measure of Mind: Propositional Attitudes and Their Attribution. Oxford University Press.
McDowell, J. 1994. Mind and World. Harvard University Press.
McGeer, V. 2007. The Regulative Dimension of Folk-Psychology. In D. Hutto and M. Ratcliff (eds.), Folk-Psychology Reassessed. Springer.
Millikan, R. 1993. White Queen Psychology and Other Essays for Alice. MIT Press.
Moltmann, F. 2017. Cognitive Products and the Semantics of Attitude Verbs and Deontic Modals. In F. Moltmann and M. Textor (eds.), Act-Based Conceptions of Propositional Content. Oxford University Press.
Montague, M. 2007. Against Propositionalism. Noûs 41 (3): 503-18.
Morton, A. 1980. Frames of Mind: Constraints on the Common-Sense Conception of the Mental. Oxford University Press.
Moss-Racusin, C. et al. 2021. Science Faculty’s Subtle Gender Biases Favor Male Students. Proceedings of the National Academy of Sciences of the United States of America 109 (41): 16474-9.
Nagel, J. 2012. Gendler on Alief. Analysis 72 (4): 774–88.
Nagel, T. 1974. What Is It Like to Be a Bat? Philosophical Review 83 (10): 435-50.
Nichols, S. and Stich, S. 2003. Mindreading. Oxford University Press.
Payne, B. 2001. Prejudice and Perception: The Role of Automatic and Controlled Processes in Misperceiving a Weapon. Journal of Personality and Social Psychology 81 (2): 181–92.
Peacocke, C. 2001. Does Perception Have Nonconceptual Content? The Journal of Philosophy 98 (5): 239-64.
Peters, D. and Stephen Ceci, J. 1982. Peer-Review Practices of Psychological Journals: The Fate of Published Articles, Submitted Again. Behavioral and Brain Sciences 5 (2):187–255.
Porot, N. and Mandelbaum, E. 2021. The Science of Belief: A Progress Report. WIREs Cognitive Science 12 (2): e1539.
Prior, A.N. 1971. Objects of Thought. Clarendon Press.
Putnam, H. 1963. Brains and Behavior. In R.J. Butler (ed), Analytical Philosophy: Second Series. Blackwell: 1-19.
Putnam, H. 1975. The Meaning of “Meaning”. Minnesota Studies in the Philosophy of Science 7: 131-93.
Quilty-Dunn, J. and Mandelbaum, E. 2018. Against Dispositionalism: Belief in Cognitive Science. Philosophical Studies 175 (9): 2353-72.
Quine, W.V. 1948. On What There Is. Review of Metaphysics 2 (5): 12-38.
Quine, W.V. 1960. Word and Object. MIT Press.
Ramsey, W., Stich, S. and Garon, J. 1990. Connectionism, Eliminativism and The Future of Folk
Richard, M. 2006. Propositional Attitude Ascription. In M. De Witt and R. Hanley (eds.), The Blackwell Guide to The Philosophy of Language. Blackwell.
Russell, B. 1903. The Principles of Mathematics. Allen and Unwin.
Russell, B. 1905. On Denoting. Mind 14 (56): 479-93.
Russell, B. 1918. The Philosophy of Logical Atomism, Lectures 1-2. The Monist 28 (4): 495-527.
Ryle, G. 1946. Knowing How and Knowing That: The Presidential Address. Proceedings of the Aristotelian Society 46 (1): 1-16.
Ryle, G. 1949. The Concept of Mind. Chicago University Press.
Salmon, N. 1986. Frege’s Puzzle. Ridgeview.
Schiffer, S. 2003. The Things We Mean. Oxford University Press.
Schellenberg, S. 2013. Belief and Desire in Imagination and Immersion. Journal of Philosophy 110 (9): 497-517.
Schwitzgebel, E. 2002. A Phenomenal, Dispositional Account of Belief. Noûs 36 (2): 249-75.
Schwitzgebel, E. 2010. Acting Contrary to Our Professed Beliefs, or The Gulf between Occurrent Judgment and Dispositional Belief. Pacific Philosophical Quarterly 91 (4): 531-53.
Schwitzgebel, E. 2013. A Dispositional Approach to Attitudes: Thinking Outside the Belief Box. In N. Nottelmann (ed.), New Essays on Belief. Palgrave Macmillan: 75–99.
Searle, J. 1983. Intentionality. Cambridge University Press.
Searle, J. 1992. The Rediscovery of Mind. MIT Press.
Searle, J. 2001. Rationality in Action. MIT Press.
Sehon, S. 1997. Natural Kind Terms and the Status of Folk Psychology. American Philosophical Quarterly 34 (3): 333-44.
Sellars, W. 1956. Empiricism and the Philosophy of Mind. Minnesota Studies in the Philosophy of Science 1: 53-329.
Shah, N. and Velleman, D. 2005. Doxastic Deliberation. Philosophical Review 114 (4): 497-534.
Soames, S. 2010. What Is Meaning? Princeton University Press.
Soames, S. 2014. Cognitive Propositions. In J. King, S. Soames and J. Speaks (eds.), New Thinking About Propositions. Oxford University Press:
Stalnaker, R. 1984. Inquiry. MIT Press.
Stanley, J. and Williamson, T. 2001. Knowing How. The Journal of Philosophy 98 (8): 411-4.
Sterelny, K. 1990. The Representational Theory of Mind. Blackwell.
Stich, S. 1983. From Folk Psychology to Cognitive Science. MIT Press.
Stich, S. 1996. Deconstructing the Mind. Oxford University Press.
Thagard, P. 2006. Desires Are Not Propositional Attitudes. Dialogue 45 (1): 151-6.
Thompson, M. 2008. Life and Action. Harvard University Press.
Wedgwood, R. 2002. The Aim of Belief. Philosophical Perspectives 16: 267-97.
Weiskopf, D. and Adams, F. 2015. An Introduction to the Philosophy of Psychology. Cambridge University Press.
Williamson, T. 2000. Knowledge and Its Limits. Oxford University Press.
Wittgenstein, L. 1953. Philosophical Investigations. Wiley-Blackwell.
Young, A.W. and Leafhead, K. 1996. Betwixt Life and Death: Case Studies of the Cotard Delusion. In P. Halligan and J. Marshall (eds.) Method in Madness: Case Studies in Cognitive Neuropsychiatry. Psychology Press: 147-71.
Zawidzki, W. 2013. Mindshaping: A New Framework for Understanding Human Social Cognition. MIT Press.
Time is what clocks are used to measure. Also, information about time tells the durations of events and when they occur and which events happen before which others, so time plays a very significant role in the universe’s structure. But the attempt to carefully describe time’s properties has led to many unresolved issues, both philosophical and scientific.
Consider this issue upon which philosophers are deeply divided: What sort of ontological differences are there among the present, the past and the future? There are three competing philosophical theories. Presentism implies that necessarily only present objects and present events are real, and we conscious beings can recognize this in the special vividness of our present experiences compared to our relatively dim memories of past experiences and dim expectations of future experiences. So, the dinosaurs have slipped out of reality even though our current ideas of them have not. However, the growing-past theory implies the past and present are both real, but the future is not, because the future is indeterminate or merely potential. Dinosaurs are real, but our future death is not. The third theory, eternalism, is that there are no objective ontological differences among present, past, and future because the differences are merely subjective, depending upon whose present we are referring to.
In no particular order, here is a list of other issues about time that are discussed in this article:
•Whether there was a moment without an earlier one.
•Whether time exists when nothing is changing.
•What kinds of time travel are possible.
•Whether time has an arrow.
•How time is represented in the mind.
•Whether time itself passes or flows.
•How to distinguish an accurate clock from an inaccurate one.
•Whether what happens in the present is the same for everyone.
•Which features of our ordinary sense of the word time are, or should be, captured by the concept of time in physics.
•Whether contingent sentences about the future have truth-values now.
•Whether tensed facts or tenseless facts are ontologically fundamental.
•The proper formalism or logic for capturing the special role that time plays in reasoning.
•Whether an instant can have a zero duration and also a very next instant.
•What happens to time near or inside a black hole.
•What neural mechanisms account for our experience of time.
•Whether time is objective or subjective.
•Whether there is a timeless substratum from which time emerges.
•Which specific aspects of time are conventional.
•How to settle the disputes between proponents of McTaggart’s A-theory and B-theory of time.
This article does not explore how time is treated within different cultures and languages, nor how persons can more efficiently manage their time, nor what entities are timeless.
Philosophers of time want to build a robust and defensible philosophical theory of time, one that resolves as many as possible of the philosophical issues listed in the opening summary, or at least provides a mutually consistent set of proposed answers to them that is supported by the majority of experts on these issues. That list of issues is very long. Here is a shorter list of the most important issues in the philosophy of time:
Are only present objects and present events real?
Is time fundamental, or does it emerge from something more fundamental?
Does time itself have an intrinsic arrow?
Is time independent of space and of physical objects and of what they are doing?
Is time smooth or does it have a smallest allowable duration?
Does the extent of time include an infinite past and infinite future?
What aspects of time are conventional?
Is time best understood with McTaggart’s A-theory or his B-theory?
This last question is bound up with the complicated relationship between the commonsense image of time and the scientific image of time. This is the relationship between beliefs about time held by ordinary speakers of our language and beliefs about time as understood through the lens of contemporary science, particularly physics. Our fundamental theories of physics are the general theory of relativity and quantum mechanics. They are precise and quantitative. They are fundamental because they cannot be derived from other theories.
When describing time, the commonsense image is expressed with non-technical terms such as now, flow, and past, and not with technical physics terms such as continuum, reference frame, and quantum entanglement. Also, the scientific image appeals to underlying mechanisms such as atoms, fields, and other structures that are not detectable by us without scientific instruments. The Greek philosopher Anaxagoras showed foresight when he said, “Phenomena are a sight of the unseen.” Today he might put the point as Frank Wilczek does: “There is so much more to the world than we were evolved to see.”
The manifest image or folk image is the understanding of the world as it appears to us using common sense untutored by advances in contemporary science. It does not qualify as a theory in the technical sense of that term but is more an assortment of tacit beliefs. The concept is vague, and there is no good reason to believe that there is a single shared folk concept. Maybe different cultures have different concepts of time. Despite the variability here, a reasonable way to make the concept a little more precise is to say it contains all the following beliefs about time [some of which are declared to be false according to the scientific image]: (1) The universe has existed for longer than five minutes. (2) We all experience time via experiencing change. (3) The future must be different from the past. (4) Time exists in all places. (5) You can change the direction you are going in space but not in time. (6) Every event has a duration, and, like the length of an object and the distance between places, duration is never negative. (7) Every event occurs at some time or other. (8) The past is fixed, but the future is open. (9) A nearby present event cannot directly and immediately influence a distant present event. (10) Time has an intrinsic arrow. (11) Time has nothing to do with space. (12) Given any two events, they have some objective order such as one happening before the other, or else their being simultaneous. (13) Time passes; it flows like a river, and we directly experience this flow. (14) There is a present that is objective, that every living person shares, and that divides everyone’s past from their future. (15) Time is independent of the presence or absence of physical objects and what they are doing.
Only items 1 through 7 of the 15 have clearly survived the impact of modern science. Item 9 fails because of quantum entanglement. Item 12 fails because of the relativity of simultaneity in the theory of relativity. Item 15 fails because of relativistic time dilation. Also, the scientific image has taken some of the everyday terms of the manifest image and given them more precise definitions.
The scientific image and the manifest image are not images of different worlds. They are images of the same reality. Both images have changed over the years. The changes have sometimes been abrupt. Regarding time, the most significant impact on its scientific image was the acceptance of the theory of relativity. See (Callender 2017) for a more detailed description and discussion of the controversies between the manifest image and the scientific image.
A popular methodology used by some metaphysicians is to start with a feature of the manifest image and then change it only if there are good reasons to do so. Unfortunately, there is no consensus among philosophers of time about what counts as a good reason, although there is much more consensus among physicists. Does conflict with relativity theory count as a good reason? Yes, say physicists, but Husserl’s classic 1936 work on phenomenology, The Crisis of European Sciences and Transcendental Phenomenology, criticized the scientific image because of its acceptance of so many of the implications of relativity theory, and in this spirit A. N. Prior said that the theory of relativity is for this reason not about real time.
Ever since the downfall of the Logical Positivists’ program of requiring all meaningful, non-tautological statements to be reducible to commonsense statements about what is given in our sense experiences (via seeing, hearing, feeling, and so forth), few philosophers of science would advocate any explicit reduction or direct translation of statements expressed in the manifest image to statements expressed in the scientific image, or vice versa, but the proper relationship between the two images is an open question.
With the rise of the importance of scientific realism in both metaphysics and the philosophy of science in the latter part of the twentieth century, many philosophers of science would summarize the relationship between the two images by saying our direct experience of reality is real but overrated. They suggest that defenders of the manifest image have been creative, but ultimately they have wasted their time in trying to revise and improve the manifest image to lessen its conflict with the scientific image. Regarding these attempts in support of the manifest image, the philosopher of physics Craig Callender made this sharp criticism:
These models of time are typically sophisticated products and shouldn’t be confused with manifest time. Instead they are models that adorn the time of physics with all manner of fancy temporal dress: primitive flows, tensed presents, transient presents, ersatz presents, Meinongian times, existent presents, priority presents, thick and skipping presents, moving spotlights, becoming, and at least half a dozen different types of branching! What unites this otherwise motley class is that each model has features that allegedly vindicate core aspects of manifest time. However, these tricked out times have not met with much success (Callender 2017, p. 29).
In some very loose and coarse-grained sense, manifest time might be called an illusion without any harm done. However, for many of its aspects, it’s a bit like calling our impression of a shape an illusion, and that seems wrong (Callender 2017, p. 310).
Some issues listed in the opening summary are intimately related to others, so it is reasonable to expect a resolution of one to have deep implications for another. For example, there is an important subset of related philosophical issues about time that cause many philosophers of time to divide into two broad camps, the A-camp and the B-camp, because the camps are on the opposite sides of so many controversial issues about time.
The next two paragraphs summarize the claims of the two camps. Later parts of this article provide more introduction to the philosophical controversy between the A and B camps, and they explain the technical terms that are about to be used. Briefly, the two camps can be distinguished by saying the members of the A-camp believe McTaggart’s A-theory is the fundamental way to understand time; and they accept a majority of the following claims: past events are always changing as they move farther into the past; this change is the only genuine, fundamental kind of change; the present or “now” is objectively real; so is time’s passage or flow; ontologically we should accept either presentism or the growing-past theory because the present is somehow metaphysically privileged compared to the future; predictions are not true or false at the time they are uttered; tensed facts are ontologically fundamental, not untensed facts; the ontologically fundamental objects are 3-dimensional, not 4-dimensional; and at least some A-predicates are not semantically reducible to B-predicates without significant loss of meaning. The word “fundamental” in these discussions is used either in the sense of “not derivable” or “not reducible.” It does not mean “most important.”
Members of the B-camp reject all or at least most of the claims of the A-camp. They believe McTaggart’s B-theory is the fundamental way to understand time; and they accept a majority of the following claims: events never undergo what A-theorists call genuine change; the present or now is not objectively real and neither is time’s flow; ontologically we should accept eternalism and the block-universe theory; predictions are true or false at the time they are uttered; untensed facts are more fundamental than tensed facts; the fundamental objects are 4-dimensional, not 3-dimensional; and A-predicates are reducible to B-predicates, or at least the truth conditions of sentences using A-predicates can be adequately explained in terms of the truth conditions of sentences using only B-predicates. Many B-theorists claim that they do not deny the reality of the human experiences that A-theorists are appealing to, but rather they believe those experiences can be best explained from the perspective of the B-theory.
To what extent is time understood? This is a difficult question, not simply because the word understood is notoriously vague. There have been a great many advances in understanding time over the last two thousand years, especially over the last 125 years, as this article explains, so we can definitively say time is better understood than it once was, which is clear evidence that philosophy makes progress. Nevertheless, too many questions remain whose answers the experts do not agree upon for us to say time is understood. Can we at least say only the relatively less important questions are left unanswered? No, not even that. So, this is the state of our understanding of time. It is certainly less than a reader might wish to have. Still, it is remarkable how much we do know about time that we once did not; and it is remarkable that we can be so clear about what it is that we do not know; and there is no good argument for why this still sought-after knowledge is beyond the reach of the human mind.
2. Physical Time, Biological Time, and Psychological Time
Physical time is public time, the time that clocks are designed to measure. Biological time is indicated by regular, periodic biological processes, and by signs of aging. The ticks of a human being’s biological clock are produced by heartbeats, the rhythm of breathing, cycles of sleeping and waking, and periodic menstruation, although there is no conscious counting of the cycles as in an ordinary clock. Biological time is not another kind of time, but rather is best understood as the body’s recording of physical time, in the sense that biological time is physical time measured with a biological process.
Psychological time is private time; it is also called subjective time and phenomenological time. Our psychological time can change its rate, compared to physical time, depending on whether we are bored or instead intensely involved. The position advocated by most philosophers is that psychological time is best understood not as a kind of time but rather as awareness of physical time, although there is no general agreement on this within the philosophical community. Psychological time is what people usually are thinking of when they ask whether time is just a construct of the mind.
There is no experimental evidence that the behavior of a clock that measures physical time is affected in any way by the presence or absence of mental awareness, or by the presence or absence of any biological phenomenon. For that reason, physical time is often called objective time and scientific time. The scientific image of time is the product of science’s attempt to understand physical time.
When a physicist defines speed to be distance traveled divided by the duration of the travel, the term time in that definition refers to physical time. Physical time is more helpful than psychological time for helping us understand our shared experiences in the world; but psychological time is vitally important for understanding many mental experiences, as is biological time for understanding biological phenomena.
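As a minimal illustration of how physical time enters such definitions, here is a sketch in Python; the sprint figures are invented purely for illustration:

```python
# Average speed is distance traveled divided by the clock-measured duration
# of the travel; "time" here means physical time, not psychological time.
def average_speed(distance_m: float, duration_s: float) -> float:
    return distance_m / duration_s

# A 100-meter sprint timed at 9.58 seconds:
print(average_speed(100.0, 9.58))  # roughly 10.44 m/s
```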
Psychological time and biological time are explored in more detail in Section 17. Otherwise, this article focuses on physical time.
3. What is Time?
Time may not be what it seems.
Clocks can tell you what time it is, but they cannot tell you what time is. “Time is succession,” Henri Bergson declared, but that remark is frustratingly vague. So is the remark that time is the quality of the world that allows change to exist.
There is disagreement among philosophers of time as to what metaphysical structure is essential to time, that is, to physical time. Two historically important, competing recommendations are that time is (1) a one-dimensional structure of ordered instants satisfying McTaggart’s A-series, or (2) the same structure but satisfying his B-series. Think of an instant as a snapshot of the universe at a time, but relative to a single reference frame. More is said about the A and B series later in this article.
Maybe we can decide what time is by considering what our world would be like if it did not contain time. How do we proceed from here, though? We cannot turn off time and look at the result. Unfortunately, our imagining the world without time is not likely to be a reliable guide.
Information about time tells the durations of events, and when they occur, and which events happen before which others, so any definition of time or theory of time should allow for this. Both relativity theory and quantum mechanics provide a linear ordering of the temporal points of spacetime by the asymmetric relation of happens-before. But this is just the tip of the iceberg when it comes to describing and explaining time.
Here are some considerations. Is it helpful to distinguish what time is from what it does? Should we be aiming to say time is what meets certain necessary and sufficient criteria, or should we aim for a more detailed and sophisticated philosophical theory about time, or should we say time is whatever plays this or that functional role such as accounting for our temporal phenomenology? Baron and Miller have argued that, if a demon plays the functional role of providing us with our temporal phenomenology, then we would not agree that time is a demon, so more constraints need to be placed on any functionalist account of time.
Many physicists have said time is whatever satisfies the requirements on the time variable “t” in the fundamental equations of physics. In reaction to this last claim, its opponents among philosophers of physics usually complain of scientism. Other researchers say time is what best satisfies our many intuitions about time in our manifest image. Their opponents usually complain here of overemphasis on subjective features of time and of insensitivity to scientific advances.
Sometimes, when we ask what time is, we are asking for the meaning of the noun “time.” It is the most frequently used noun in the English language. A first step in that direction might be to clarify the difference between its meaning and its reference. The term time has several meanings. It can mean the duration between events, as when we say the trip from home to the supermarket took too much time because of all the traffic. It can mean, instead, the temporal location of an event, as when we say he arrived at the time they specified. It also can mean the temporal structure of the universe, as when we speak of investigating time rather than space. This article uses the word in all these senses.
Ordinary Language philosophers have carefully studied talk about time. This talk is what Ludwig Wittgenstein called the language game of discourse about time. Wittgenstein said in 1953, “For a large class of cases—though not for all—in which we employ the word ‘meaning’ it can be defined this way: the meaning of a word is its use in the language.” Perhaps an examination of all the uses of the word time would lead us to the meaning of the word. Someone, such as John Austin, following the lead of Wittgenstein, might also say more careful attention to how we use words would then enable us to dissolve rather than answer most of our philosophical problems about time. They would be shown to be pseudo-problems, and the concept of time would no longer be so mysterious.
That methodology of dissolving a problem was promoted by Austin in response to many philosophical questions. However, most philosophers of time in the twenty-first century are not interested in dissolving their problems about time nor in precisely defining the word time. They are interested in what time’s important characteristics are and in resolving philosophical disputes about time that do not seem to turn on what the word means. When Isaac Newton discovered that both the falling of an apple and the circular orbit of the Moon were caused by gravity, this was not primarily a discovery about the meaning of the word gravity, but rather about what gravity is. Do we not want some advances like this for time?
To emphasize this idea, notice that a metaphysician who asks, “What is a ghost?” already knows the meaning in ordinary language of the word ghost, and does not usually want a precise definition of ghost but rather wants to know what ghosts are and where to find them and how to find them; and they want a more-detailed theory of ghosts. This theory ideally would provide the following things: a consistent characterization of the most important features of ghosts, a claim regarding whether they do or do not exist and how they might be reliably detected if they do exist, what principles or laws describe their behavior, how they typically act, and what they are composed of. This article takes a similar approach to the question, “What is time?” The goal is to discover the best concept of time to use in understanding the world and to develop a philosophical theory of time that addresses what science has discovered about time plus what should be said about the many philosophical issues that practicing scientists usually do not concern themselves with.
There is much to learn about time from scientific theories, the fundamental scientific theories. The exploration in sections ahead adopts a realist perspective on these scientific theories. That is, it usually interprets them to mean what they say, even in their highly theoretical aspects, while appreciating that there are such things as mathematical artifacts. This perspective is not fictionalist about scientific theories, nor does it treat them as merely useful instruments or treat them operationally. It assumes that, in building a scientific theory, the goal is to achieve truth even though most theories achieve this goal only approximately; but what makes them approximately true is not their corresponding to some mysterious entity called approximate truth. This approach to understanding has occasionally been challenged in the philosophical literature, and if one of the challenges is correct, then some of what is said below will require reinterpretation or rephrasing.
Everyone agrees that time has something to do with change. Presumably we can learn about the structure of time by studying changes and the structure of change, and presumably we can use clocks in order to measure time. This article’s supplement of “Frequently Asked Questions” discusses what a clock is, and what it is for a clock to be accurate as opposed to precise, and why we trust some clocks more than others. Saying physical time is what clocks measure, which is how this article began, is a remark about clocks and not a definition of time. But the remark is not as trivial as it might seem since it is a deep truth about our physical universe that it is capable of having clocks. We are lucky to live in a universe with so many different kinds of regular, periodic processes that tick in only one temporal direction, never tick backwards, and can be used by humans for measuring time. However, some philosophers of physics claim that there is nothing more to time than whatever numbers are displayed on our clocks. In this anti-realist spirit, the distinguished philosopher of science Henri Poincaré said in 1912, “The properties of time are…merely those of our clocks just as the properties of space are merely those of the measuring instruments.” The vast majority of philosophers of physics disagree with that claim. They say time is more than those numbers; it is what we intend to measure with those numbers.
Suppose we answer the question “What is time?” by saying time is the set of instants in their natural order, their happens-before order. If we add that the instants would not really constitute time unless their happens-before relation were asymmetric (so that, if instant a happens before instant b, then instant b cannot happen before instant a), this would rule out closed time loops a priori. To prevent this and make the question of whether time can circle back on itself be an empirical question, physicists will not require asymmetry for all of time’s instants, but only for smaller segments of time, “neighborhoods” of instants.
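To make the asymmetry requirement concrete, here is a small sketch (the instants and the relation are invented for illustration) that models a happens-before relation as a set of ordered pairs and checks whether the relation ever runs both ways between two instants:

```python
# A happens-before relation modeled as a set of ordered pairs of instants.
# Asymmetry: if a happens before b, then b must not happen before a.
def is_asymmetric(happens_before: set) -> bool:
    return all((b, a) not in happens_before for (a, b) in happens_before)

linear_order = {("a", "b"), ("b", "c"), ("a", "c")}
time_loop    = {("a", "b"), ("b", "a")}  # a closed time loop

print(is_asymmetric(linear_order))  # True
print(is_asymmetric(time_loop))     # False: asymmetry rules the loop out
```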
What then is time really? This is still an open question. Let’s consider how this question has been answered in different ways throughout the centuries. Here we are interested in very short answers that give what the proponent considers to be the key idea about what time is.
Aristotle proposed what has come to be called the relational theory of time when he remarked, “there is no time apart from change….” (Physics, chapter 11). He clarified his remark by saying, “time is not change [itself]” because a change “may be faster or slower, but not time…” (Physics, chapter 10). For example, a leaf can fall faster or slower, but time itself cannot be faster or slower. Aristotle claimed that “time is the measure of change” (Physics, chapter 12) of things, but he never said space is the measure of anything. Elsewhere he remarked that time is the steps between before and after.
René Descartes, known for doubting many things, never doubted the existence of time. He answered the question, “What is time?” by claiming that a material body has the property of spatial extension but no inherent capacity for temporal endurance and that God by his continual action sustains (or re-creates) the body at each successive instant. Time is a kind of sustenance or re-creation (“Third Meditation” in Meditations on First Philosophy, published in 1641). Descartes’ worry is analogous to that of Buddhist logicians who say, “Something must explain how the separate elements of the process of becoming are holding together to produce the illusion of a stable material world.” The Buddhist answer was causality. Descartes’ answer was that it is God’s actions.
In the late 17th century, Gottfried Leibniz, who, like Aristotle, was a relationist, said time is a series of moments, and each moment is a set of co-existing events in a network of relations of earlier-than and simultaneous-with. Isaac Newton, a contemporary of Leibniz, argued instead that time is independent of events. He claimed time is absolute in the sense that “true…time, in and of itself and of its own nature, without reference to anything external, flows uniformly…” (1687). This difference about time is also reflected in their disagreement about space. Newton thought of space as a thing, while Leibniz disagreed and said it is not a thing but only a relationship among the other things.
Both Newton and Leibniz assumed that time is the same for all of us in the sense that how long an event lasts is the same for everyone, no matter what they are doing. Their assumption would eventually be refuted by Albert Einstein in the early 20th century.
In the 18th century, Immanuel Kant made some very influential remarks that suggested he believed time and space themselves are forms that the mind projects upon the things-in-themselves that are external to the mind. In the twenty-first century, this is believed to be a misinterpretation of Kant’s intentions, even though he did say things that would lead to this false interpretation. What he actually believed was that our representations of space and time have this character. So, Kant’s remarks that time is “the form of inner sense” and that time “is an a priori condition of all appearance whatsoever” are probably best understood as suggesting that we have no direct perception of time but only have the ability to experience individual things and events within time. The “we” here is human beings; Kant left open the possibility that the minds of non-humans perceive differently than we humans do. Also, he left open the possibility that the world-in-itself, that is, the world as it is independently of being perceived, may or may not be temporal. The much more popular theory of mind in the 21st century implies conscious beings have unmediated access to the world; we can experience the external world and not merely experience internal representations of that world.
Ever since Newton’s theory of mechanics in the 17th century, time has been taken to be a theoretical entity, a theory-laden entity, in the sense that we can tell much about time’s key features by looking at the role it plays in our confirmed, fundamental theories. One of those is the theory of relativity that was created in the early 20th century. According to relativity theory, time is not fundamental, but is a necessary feature of spacetime, which itself is fundamental. Spacetime is all the actual events in the past, present, and future. In 1908, Hermann Minkowski argued that the proper way to understand relativity theory is to say time is really a non-spatial dimension of spacetime, and time has no existence independent of space. Einstein agreed. The most philosophically interesting feature of the relationship between time and space, according to relativity theory, is that which part of spacetime is space and which part is time are relative to a chosen frame of reference. We humans do not notice the unification of space and time into spacetime in our daily lives because we move slowly compared to the speed of light, and we never experience large differences in gravitational force.
In the early 20th century, the philosophers Alfred North Whitehead and Martin Heidegger said time is essentially the form of becoming. This is an idea that excited a great many philosophers, but not many scientists, because the remark seems to give ontological priority to the manifest image of time over the scientific image.
In the 21st century, the physicist Stephen Wolfram speculated that perhaps nature is a cosmic computer. He said every physical process is a natural computation, and time is the inexorable progress of this computation, and this progress is what other scientists have been calling “evolving according to the laws of nature.” Physicists have not warmed to Wolfram’s idea. No single updating has been observed experimentally. Specialists in relativity caution that Wolfram’s use of finite lengths violates Lorentz invariance and thus relativity theory. One of Wolfram’s critics, the philosopher of physics Tim Maudlin, remarked that, “The physics determines the computational structure, not the other way around.”
Whatever time is, one should consider whether time has causal powers. The musician Hector Berlioz said, “Time is a great teacher, but unfortunately it kills all its pupils.” Everyone knows not to take this joke literally because, when you are asleep and then your alarm clock rings at 7:00, it is not the time itself that wakes you. Nevertheless, there are more serious reasons to believe that time has causal powers. Drawing a conclusion from relativity theory, Princeton physicist John Wheeler said, “Spacetime tells matter how to move and matter tells spacetime how to curve.” There is a scientific consensus on this point that the general theory of relativity implies space and time are dynamic actors, not a passive stage where events occur, as Newton mistakenly believed.
Since Newton, nearly all physicists have believed that time is like a mathematical line. Relativity theory and quantum mechanics both imply time is smooth and continuous so that, for any duration, there is a shorter duration. In 1916, Albert Einstein wrote privately that, even though he was assuming space is a continuum in his new theory of relativity, in the end it would turn out that space is actually discrete. Presumably he believed the same for time, but there is no more information in the historical record. Werner Heisenberg and Niels Bohr, founders of quantum mechanics, also did not personally believe in a temporal and spatial continuum; but they did not promote their doubts because they could not reconcile their discrete ideas and models with quantum mechanics, the physical theory they were in the process of creating. So, for the rest of the twentieth century it was accepted by nearly all experts that spacetime is a continuum, and textbooks promoted it to subsequent generations.
Yet during the first quarter of the 21st century, many experts began to suspect that both space and time are not continuous at the fundamental level. Their belief is that time is real, but it is real only because it emerges as the scale increases, in analogy to how the reality of temperature emerges at higher scales without any molecule having a temperature. There is a rising suspicion among 21st century physicists that, for shorter and shorter durations below the Planck time scale of 10⁻⁴⁴ seconds, the notions of time and spacetime become progressively less applicable to reality. That is why some experts say that the whole idea of time is just an approximation. Even if this were to be so, it would still be appropriate to say time is real because human beings need to use the concept of time as the scale increases. Laplace’s Demon does not. The Demon has no limits on its computational capabilities and needs no simple models or coarse graining or approximations. Counter to this “rising suspicion” among their fellow physicists about time being less applicable to reality at the smallest scales, David Gross and Lee Smolin argue that time will remain fundamental at all scales in our future fundamental theories.
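For readers who want to see where the Planck-time figure comes from, it can be computed from three fundamental constants; this is a standard textbook formula, not something peculiar to any one quantum-gravity program:

```python
import math

# Planck time: t_P = sqrt(hbar * G / c**5), using CODATA constant values.
hbar = 1.054571817e-34  # reduced Planck constant, in J*s
G    = 6.67430e-11      # Newtonian gravitational constant, in m^3 kg^-1 s^-2
c    = 2.99792458e8     # speed of light, in m/s

t_planck = math.sqrt(hbar * G / c**5)
print(t_planck)  # about 5.39e-44 seconds
```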
Later sections of this article, including the supplement “What Else Science Requires of Time,” introduce other conjectures about how to answer the question, “What is time?”
a. Kinds of Physical Time
Are there different kinds of time? There are many ways to measure time—by sundials, pendulums, crystal watches, atomic clocks, and so forth—but our question is whether they are all measuring the same thing. Could a pendulum measure gravitational time while an atomic clock measures electromagnetic time, and the difference between the two times is small enough that their difference has not yet been noticed?
In the 1930s, the physicists Arthur Milne and Paul Dirac worried about this question. Independently of each other, they suggested their fellow physicists should investigate whether there may be many correct, but differing, time scales. For example, Milne and Dirac worried that there could be the time of atomic processes and the time of nuclear processes and perhaps yet another time of gravitational processes. Perfectly-working clocks for any pair of these processes might drift out of synchrony after being initially synchronized without there being a reasonable explanation for why they do not stay synchronized. It would be a time mystery.
In 1967, physicists rejected the gravitational standard of time for the atomic standard of time because the observed deviation between periodic atomic processes in atomic clocks and periodic gravitational processes such as the Earth’s revolutions could be explained better by assuming that the atomic processes were more regular. Physicists still have no reason to believe a gravitational periodic process that is not affected by friction or impacts or other forces would ever mysteriously drift out of synchrony with a regular atomic process, yet this is the possibility that worried Milne and Dirac.
However, we may in the future be able to check on the possibility that electromagnetic time differs from nuclear time. So far, there has been no deep investigation of this possibility. The best atomic clocks are based on the stable resonances of light emitted as electrons transition from one energy level to another in the electron clouds of the atoms of the elements ytterbium and strontium. This electron activity is vulnerable to influence from all sorts of stray electric and magnetic fields, so significant shielding is required. Future nuclear clocks will not have this vulnerability. They will be based on transitions of energy levels of neutrons inside the atom’s nucleus, for example a thorium nucleus. Neutrons are better than protons because neutrons are electrically neutral and so are impervious to those stray fields that can easily affect atomic clocks. Neutrons are affected, though, in another way; they are affected by the strong nuclear force. This force is much stronger than the electromagnetic force, but only over extremely short distances on the order of less than a small atom’s width; and the likelihood of stray, strong nuclear forces that might affect a nuclear clock is expected to be minimal compared to the problems of ensuring the regularity or stability of an atomic clock.
What if the best atomic clocks were discovered to drift in comparison with the best nuclear clocks? Would that show that there are two distinct kinds of physical time? Maybe, but only if we have first eliminated the more mundane hypothesis that the drift is due to interference affecting the atomic clock (or perhaps the nuclear clock). If we could not account for the drift and were presented with a mystery, only then would we consider the more exotic hypothesis that worried Milne and Dirac.
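A toy simulation can convey the shape of the worry: two ideally regular clocks whose rates differ by some tiny constant factor, once synchronized, accumulate a linearly growing offset. The one-part-in-10^15 rate difference below is invented purely for illustration:

```python
# Two idealized clocks, synchronized at t = 0, whose tick rates differ by
# a constant factor. A persistent, unexplained drift of this sort is what
# Milne and Dirac thought might signal distinct kinds of physical time.
rate_atomic  = 1.0
rate_nuclear = 1.0 + 1e-15  # invented rate difference: one part in 10**15

for years in (1, 10, 100):
    elapsed_s = years * 365.25 * 24 * 3600
    drift_s = elapsed_s * (rate_nuclear - rate_atomic)
    print(f"after {years:>3} years, the clocks disagree by {drift_s:.1e} s")
```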
4. Why There Is Time Instead of No Time
According to the scientific realist, the fundamental theories of physical science have ontological implications, one of which is that time exists. The theories imply it exists at least relative to a chosen reference frame (that is, a formal viewpoint or coordinate system). The reference frame is an abstract tool that explains what part of spacetime is its time part and what part is its space part. However, the fundamental theories have nothing to say about why spacetime exists and thus why time exists.
Among physicists and philosophers of physics, there is no agreed-upon answer to why our universe contains time instead of no time, why it contains dynamical physical laws describing change over time, whether today’s physical laws will hold tomorrow, why the universe contains the fundamental laws that it does contain, and why there is a universe instead of no universe, although there have been interesting conjectures on all these issues. For instance, Parmenides first said nothing can come from nothing, but current philosophers and physicists speculate that perhaps there is something rather than nothing because throughout all time there has always been something and there is no good reason to believe it could transition to nothing since doing so would violate the law of conservation of energy.
There is little support for the claim that any of these unsolved problems are intractable, something too difficult for the human mind, in analogy to how even the most clever tuna fish will never learn the composition of the water it swims in.
Here is one not-too-serious linguistic explanation for why time exists: Without time there would be no verbs. A serious and interesting theological explanation for why time exists is that God wanted the world to be that way. Here is an anthropic explanation. If time were not to exist, we would not now be taking the time to ask why it does exist. Here is an intriguing non-theological and non-anthropic explanation. When steam cools, eventually it suddenly reaches a tipping point and undergoes a phase transition into liquid water. Many cosmologists agree with James Hartle’s speculation that the universe should contain laws implying that, as the universe cools, a phase transition occurs during which four-dimensional space is eventually produced from infinite-dimensional space; then, after more cooling, another phase transition occurs during which one of the four dimensions of primeval space collapses to become a time dimension. The previous sentence is a bit misleading because of its grammar which might suggest that something was happening before time began, but that is a problem with the English language, not with Hartle’s suggestion about the origin of time.
There is a multiverse answer to our question, “Why does time exist?” The reason why our universe exists with time instead of no time is that nearly every kind of universe exists throughout the inflationary multiverse; there are universes with time and universes without time. Like all universes in the multiverse, our particular universe with time came into existence by means of a random selection process without a conscious selector, a process in which every physically possible universe is overwhelmingly likely to arise as an actual universe, in analogy to how continual re-shuffling a deck of cards makes it overwhelmingly likely that any specific ordering of the cards will eventually appear. Opponents complain that this multiverse explanation is shallow. To again use the metaphor of a card game, they wish to know why their poker opponent had four aces in that last hand, and they are not satisfied with the shallow explanation that four aces are inevitable with enough deals or that it is just a random result. Nevertheless, perhaps there is no better explanation.
5. The Scientific Image of Time
Time has been studied for 2,500 years, but only in the early twentieth century did time become one of the principal topics in professional journals of physics, and soon after in the journals of philosophy of science. The primary reason for this was the creation of the theory of relativity.
Any scientific theory can have its own implications about the nature of time, and time has been treated differently in different scientific theories over the centuries. When this article speaks of the scientific image of time or what science requires of time it means time of the latest, accepted theories that are fundamental in physics and so do not depend upon other theories. For example, Einstein’s theory of relativity is fundamental, but Newton’s theory of mechanics is not, nor is Newton’s theory of gravitation or Maxwell’s theory of electromagnetism. Newton’s concept of time is useful only for applications where the speed is slow, where there are no extreme changes of gravitational forces, and where durations are very large compared to the Planck time because, under these conditions, Newton’s theory agrees with Einstein’s. For example, Newton’s two theories are all that were needed to specify the trajectory of the first spaceship that landed safely on the Moon.
When scientists use the concept of time in their theories, they adopt positions that metaphysicians call metaphysical. They suppose there is a mind-independent universe in which we all live and to which their fundamental theories apply. Physical scientists tend to be what metaphysicians call empiricists. They also usually are physicalists, and they would agree with the spirit of W.V.O. Quine’s remark that, “Nothing happens in the world … without some redistribution of microphysical states.” This physicalist position can be re-expressed as the thesis that all the facts about any subject matter such as geophysics or farming are fixed by the totality of microphysical facts about the universe. Philosophers sometimes express this claim by saying all facts supervene on microphysical facts. Philosophers and some scientists are especially interested in whether the human mind might be a special counterexample to this physicalist claim. So far, however, no scientific experiments or observations have shown clearly that the answer to the metaphysical question, “Does mind supervene upon matter?” is negative. Nor do scientific observations ever seem to need us to control for what the observer is thinking.
In the manifest image, the universe is fundamentally made of objects rather than events. In the scientific image, the universe is fundamentally made of events rather than objects. Physicists use the term “event” in two ways, and usually only the context suggests which sense is intended. In sense 1, something happens at a place for a certain amount of time. In sense 2, an event is simply a location in space and time. Sense 2 is what Albert Einstein had in mind when he said the world of events forms a four-dimensional continuum in which time and space are not completely separate entities. In either of these two senses, it is assumed in fundamental scientific theories that longer events are composed of shorter sub-events and that events are composed of instantaneous events, called point-events. The presumption of there being instantaneous events is controversial. That presupposition upset Alfred North Whitehead who said: “There is no nature apart from transition, and there is no transition apart from temporal duration. This is why an instant of time, conceived as a primary simple fact, is nonsense” (Whitehead 1938, p. 207).
Frames of reference are perspectives on the space or the spacetime we are interested in. A coordinate system is what the analyst places on a reference frame to help specify locations quantitatively. A coordinate system placed on a reference frame of spacetime normally assigns numbers as names of temporal point-locations (called point-times) and spatial locations (point-places). The best numbers to assign are real numbers (a.k.a. decimals), in order to allow for the applicability of calculus. A duration of only a billionth of a second still contains a great many point-times, a nondenumerable infinity of them. Relativity theory implies there are an infinite number of legitimate, different reference frames and coordinate systems. No one of them is distinguished or absolute in Isaac Newton’s sense of specifying what time it “really” is, and where you “really” are, independently of all other objects and events. Coordinate systems are not objective features of the world. They vary in human choices made about the location of their origins, their scales, the orientation of their coordinate axes, and whether the coordinate system specifies locations by things other than axes, such as the angle between two axes. In relativity theory, reference frames are often called “observers,” but there is no requirement that conscious beings be involved.
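The conventionality of coordinate systems can be put in miniature as follows; the two coordinate functions below are invented, and they differ only in the human choices of origin and unit:

```python
# Two coordinate systems for the same one-dimensional time continuum.
# Each assigns a real number to every point-time; neither is privileged.
def seconds_coordinate(t: float) -> float:
    return t                      # origin at t = 0, unit of one second

def hours_coordinate(t: float) -> float:
    return (t - 3600.0) / 3600.0  # origin shifted an hour, unit of one hour

same_point_time = 7200.0
print(seconds_coordinate(same_point_time))  # 7200.0
print(hours_coordinate(same_point_time))    # 1.0 -- two names, one point-time
```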
The fundamental theories of physics that have ontological implications are the general theory of relativity and quantum mechanics, where quantum mechanics includes the standard model of particle physics; the pair does not include the big bang theory or statistical physics with its second law of thermodynamics. The two fundamental theories are often called collectively the Core Theory. The Core Theory is discussed in more detail in a companion article. For scientists, it provides our civilization’s best idea of what is fundamentally real physically. It implies that almost everything that is fundamentally real and physical is made of quantum fields. The exception is gravity. Note, though, that having a successful theory about what is physically real does not necessarily resolve the ontological issues about the nature of other things such as numbers, songs, and hopes. They may not be “made of” quantum fields.
The theory of relativity is well understood philosophically, but quantum mechanics is not, although the mathematical implications of both theories are well understood by mathematicians and physicists. These theories are not merely informed guesses. Each is a confirmed set of precise, teleology-free laws with recognized ways to apply the laws to physical reality. The theories have survived a great many experimental tests and observations, so the scientific community trusts their implications in cases in which they do not conflict with each other.
Here is the scientific image of time presented as a numbered list of its most significant implications about time, with emphasis upon relativity theory. The reason to avoid quantum mechanics is that there is considerable agreement among the experts that quantum mechanics might have deep implications about the nature of time, but there is little agreement on what those implications are. For example, scientists do not agree on whether quantum mechanics implies that time splits or branches into parallel universes, each having its own time. The impact of quantum mechanics on our understanding of time is discussed in this Supplement.
(1) When you look at a distant object, you see it as it was some time ago, not as it is.
We see Saturn now as it was an hour and a half ago. We see our hand as it was more recently. Because seeing an object requires light to travel from the object to our eyes, and because the speed of light is not infinite, and because it takes time for the brain to process information that it receives from the eyes, the information you obtain by looking at an object is information about how it was, not how it is. The more distant the object, the more outdated is the information.
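The delay is simple arithmetic: distance divided by the speed of light. Here is a quick check of the Saturn figure, using a representative Earth-Saturn distance (the true distance varies as the two planets orbit):

```python
c = 2.99792458e8              # speed of light, in m/s

# A representative Earth-Saturn distance; the real value ranges from
# roughly 1.2e12 to 1.65e12 meters depending on orbital positions.
distance_to_saturn_m = 1.4e12

delay_minutes = distance_to_saturn_m / c / 60
print(delay_minutes)  # about 78 minutes: we see Saturn as it was an hour+ ago
```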
(2) The duration of the past is at least 13.8 billion years.
The big bang theory of our cosmological origins is well confirmed (though not as well as relativity theory), and it requires the past of the observable universe to extend back at least 13.8 billion years to when an explosion of space occurred, the so-called “big bang.” This number is found primarily from imagining the current expansion of the observable universe to be reversed in time, and noting that the galaxies or whatever they evolved from had to have been very close together about 13.8 billion years ago. It is assumed that gravity is the only significant phenomenon affecting this calculation. Because it is unknown whether anything happened before the big bang, it is better to think of the big bang, not as the beginning of time, but as the beginning of what we understand about our distant past. A large majority of cosmologists believe the big bang’s expansion is an expansion of space but not of spacetime and thus not of time. By the way, when cosmologists speak of space expanding, this remark is about increasing distances among clusters of galaxies. The distance from New York City to Washington, D.C. is not increasing.
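The order of magnitude of the 13.8-billion-year figure can be recovered from the Hubble time, the reciprocal of the current expansion rate. The sketch below ignores how the expansion rate has changed over cosmic history, which the real calculation must model, so it is only a back-of-the-envelope check:

```python
# Hubble time: run today's expansion backward at a constant rate.
# H0 is roughly 70 km/s per megaparsec; 1 megaparsec = 3.0857e22 meters.
H0_per_second = 70e3 / 3.0857e22      # expansion rate in 1/s

hubble_time_s  = 1.0 / H0_per_second
hubble_time_yr = hubble_time_s / (365.25 * 24 * 3600)
print(hubble_time_yr / 1e9)  # about 14 billion years, near the measured 13.8
```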
(3) Time is one-dimensional, like a line.
The scientist Joseph Priestley in 1765 first suggested time is like a one-dimensional line. The idea quickly caught on, and now time is represented as one-dimensional in all the fundamental theories of physics. Two-dimensional time has been studied by mathematical physicists, but no theories implying that time has more than one dimension in our actual universe have acquired a significant number of supporters. Such theories are difficult to make consistent with what else we know, and there is no motivation for doing so. Because of this one-dimensionality, time is represented in a coordinate system with a time line rather than a time area, and so its geometry is simpler than that of space. However, neither the geometry of real, physical space nor the geometry of time is something that can be known a priori, as Euclid and Kant mistakenly supposed.
(4) Time connects all events.
Given any two events that ever have existed or ever will, one event happens before the other or else they happen simultaneously. No exceptions.
(5) Time travel is possible, but you cannot change the past.
You can travel to the future—to meet your great, great grandchildren. This sort of travel, to the future of someone else who once lived at a time you lived, has been experimentally well-confirmed many times. Travelling to your own future, though, does not make sense because you are always in your own present. There is no consensus among scientists regarding whether you might someday be able to travel into your own past, but the majority of scientists are doubtful. You are presently in the past of people who will be born fifty years from now, but this has no implications about time travel. If you were able to travel to the past, you could not change it; that would make a sentence be both true and not true, which logic does not allow.
(6) Time is relative.
According to relativity theory, the amount of time an event lasts (the event’s duration) is relative to someone’s choice of a reference frame or coordinate system or vantage point. How long you slept last night is very different depending on whether it is measured by a clock next to you or by a clock in a spaceship speeding by at close to the speed of light. If no reference frame has been pre-selected, then it is a violation of relativity theory to say one of those two durations for your sleeping is correct and the other is incorrect. Newton would have said both durations cannot be correct, but regarding this assumption of Newton’s classical physics, Einstein and Infeld said, “In classical physics it was always assumed that clocks in motion and at rest have the same rhythm…[but] if the relativity theory is valid, then we must sacrifice this assumption. It is difficult to get rid of deep-rooted prejudices, but there is no other way.” Duration’s being relative to a reference frame is ultimately a consequence of Einstein’s insight that the speed of light is not relative to a reference frame.
Because duration is relative, the conclusion is drawn that:
(7) Time is not an objectively real feature of the universe.
According to relativity theory, space-time is objectively real, but time or duration is not real, and neither is space or distance. The main reason for believing time is not objectively real is that it is relative to a reference frame and so is not independent of space. Scientists assume that what is objectively real must not be relative, that is, dependent upon someone’s choice of reference frame. A state of the universe at a single time is also frame-relative, so the state of the universe at a time is not objectively real either. Scientists generally adopt the metaphysical stance that, if many reference frames can be chosen, then no reference frame is ontologically privileged. To some philosophers, these claims cast doubt upon either the theory of relativity itself or the importance that scientists ascribe to frame-independence.
(8) Simultaneity is relative.
According to relativity theory, if two observers move toward or away from each other or experience different gravitational forces, then many pairs of events will be simultaneous for one observer and not simultaneous for the other. Relativity theory implies there is no uniquely correct answer to the question, for some distant place, “What is happening now at that place?” The answer depends on which observer is answering the question, that is, on which reference frame is being assumed. The relation “Event A is simultaneous with event B” is transitive only for a single reference frame, not across all reference frames.
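A minimal sketch of how this follows from the standard Lorentz transformation t′ = γ(t − vx/c²): two events with the same t but different x in one frame receive different time coordinates in a frame moving at speed v.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def lorentz_t(t, x, v):
    """Time coordinate of the event (t, x) in a frame moving at speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2)

v = 0.8 * C                    # relative speed of the second frame (an example value)
event_a = (0.0, 0.0)           # (t, x): at the origin
event_b = (0.0, 1_000_000.0)   # same t, 1000 km away: simultaneous in this frame

print(f"t_A = {lorentz_t(*event_a, v):.6f} s")   # 0.000000
print(f"t_B = {lorentz_t(*event_b, v):.6f} s")   # about -0.004447
# In the moving frame the two events are no longer simultaneous.
```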
(9) Within a single reference frame, coordinate time “fixes” (i) when each event occurs, (ii) what any event’s duration is, (iii) what other events occur simultaneously with it, and (iv) the time-order of any two events.
Coordinate time is time measured along the time dimension in a chosen coordinate system.
(10) Speeding clocks run slower.
According to relativity theory, a speeding clock always runs slower compared to a stationary clock. The speeding clock’s ticking is said to be “dilated” (that is, stretched or extended) compared to that of the stationary clock. The dilation works for all physical processes, not just clocks. For everyone, their own clock’s time is not dilated; it is always the other person’s clock that dilates. There is no objectively universal clock against which processes run slower or faster, but there is a conventionally-chosen standard clock in Paris whose time is recognized by nearly all countries to be the “correct time.”
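The size of the effect is governed by the Lorentz factor γ = 1/√(1 − v²/c²): the moving clock is measured to tick once for every γ ticks of the stationary clock. A minimal sketch of the arithmetic:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def lorentz_factor(v):
    """Stationary-clock seconds that pass per tick of a clock moving at speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

for fraction in (0.1, 0.5, 0.9, 0.99, 0.9999):
    gamma = lorentz_factor(fraction * C)
    print(f"at {fraction}c the moving clock runs slow by a factor of {gamma:.2f}")
```

At one-tenth the speed of light the factor is only about 1.005, which is why the dilation goes unnoticed in everyday life; at 0.9999c it is about 71.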
(11) Time slows when the gravitational force increases.
This somewhat misleading remark (because time has no rate of flow) is meant to imply that initially synchronized clocks will get out of synch if they are affected differently by gravity. The greater the gravitational force, the slower the ticking. This holds for all processes, not just the ticking of clocks. You will live longer on the first floor than on the tenth floor of your apartment building. On the first floor the gravitational force on you is greater. This dilation due to gravity is a second kind of time dilation called “gravitational time dilation.” The clock of an astronaut on the moon ticks faster than it does back on Earth. After about 50 years, the astronaut would be one second older than if he or she had stayed on Earth. The clock in a satellite orbiting Earth disagrees with the standard clock back on Earth by slowing down due to its speed while speeding up due to its being less affected by Earth’s gravity. These two time dilation effects cancel out when the satellite is about 2,000 miles above Earth.
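The cancellation altitude can be estimated with first-order formulas. For a clock in a circular orbit of radius r, the rate relative to a distant clock is approximately √(1 − 2GM/(rc²) − v²/c²) with orbital speed given by v² = GM/r, while a ground clock runs at approximately √(1 − 2GM/(Rc²)); setting the two rates equal gives r = 3R/2, an altitude of about half an Earth radius. A sketch, ignoring Earth's rotation and using rounded constants:

```python
import math

# Rounded constants; the result is approximate.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24       # mass of Earth, kg
R = 6.371e6        # radius of Earth, m
C = 299_792_458.0  # speed of light, m/s

def orbit_rate(r):
    """Approximate rate of an orbiting clock relative to a distant clock."""
    v_squared = G * M / r  # circular-orbit speed squared
    return math.sqrt(1 - 2 * G * M / (r * C**2) - v_squared / C**2)

ground_rate = math.sqrt(1 - 2 * G * M / (R * C**2))

r_cancel = 1.5 * R  # where the two dilation effects balance: r = 3R/2
print(f"cancellation altitude: about {(r_cancel - R) / 1609.34:.0f} miles")  # ~1980
print(f"rate difference there: {orbit_rate(r_cancel) - ground_rate:.1e}")    # ~0
```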
(12) Time can warp or curve.
When time warps, clocks do not bend in space as if in a Salvador Dali painting. Instead, they undergo time dilation. According to general relativity, gravity is the curvature of four-dimensional spacetime, even though there is no known fifth dimension for it to curve into; the curvature is intrinsic, though spacetime behaves as if it curved into a fifth. Changing the distribution of matter-energy will change the warp of time. In this sense, time is malleable. This 4D curvature of our space and time is observed by detecting time dilation and space contraction. By analogy, a two-dimensional being confined to the surface of a globe would call a longitude line a straight line because it is the shortest distance between two points, the North Pole and the South Pole, but we in a higher third dimension can see that the longitude line curves. Choosing to say “three-dimensional space curves” expands the ordinary meaning of the word “curve,” and saying “four-dimensional spacetime curves” expands it even more because the word “curve” normally indicates a change of 3D spatial direction, as when the hiking path curves to the right, or the shape of an apple’s surface is curved and not flat. Newtonian mechanics and special relativity allow curvature only in this ordinary sense of curvature, and they do not allow curvature of either time or space or spacetime. According to general relativity though, they all can curve. Independently of each other, Gauss, Lobachevsky and Bolyai first suggested that physical space could curve, and Einstein first suggested that spacetime could curve. Einstein’s general theory of relativity implies space-time can curve by bending, stretching, shaking and rippling (or doing all of these together) just as a gelatin dessert can.
(13) Black holes slow time and can end it.
The place in our universe where time is the strangest is at and in a black hole. The most severe gravitational time dilation is near the surface of black holes. According to relativity theory, if you were in a spaceship approaching a black hole near its event horizon (its outer boundary or point of no return), then your time warp (the slowing of your clock relative to clocks back on Earth) would be more severe the longer you stayed in the vicinity and also the closer you got to the horizon. Even if your spaceship accelerated rapidly toward the horizon, and you quickly plunged inside (according to time as measured by you), viewers far away from the black hole would see your spaceship progressively slow its speed during its approach to the horizon. Reports sent by radio back toward Earth of the readings of your spaceship’s clock would become lower in frequency (due to gravitational red shift), and these reports would contain evidence that your clock’s ticking was slowing down (dilating) compared to Earth clocks. An outside viewer watching your spaceship as it plunges toward the horizon might never live long enough to see your spaceship actually reach the event horizon, although by your own clock on the ship you reached the hole in less than a minute. When you reach the singularity near the center of the black hole, not only is your clock crushed to a tiny size but your proper time itself stops.
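The slowdown near the horizon can be quantified: a clock hovering (not falling) at radius r outside a black hole of Schwarzschild radius r_s ticks at the fraction √(1 − r_s/r) of the far-away rate, and that fraction approaches zero at the horizon. A sketch for an illustrative ten-solar-mass black hole:

```python
import math

G = 6.674e-11      # gravitational constant
C = 299_792_458.0  # speed of light
M_SUN = 1.989e30   # solar mass, kg

M = 10 * M_SUN             # an illustrative ten-solar-mass black hole
r_s = 2 * G * M / C**2     # Schwarzschild radius, about 30 km

def hovering_rate(r):
    """Ticking rate of a clock hovering at radius r, relative to a far-away clock."""
    return math.sqrt(1.0 - r_s / r)

for multiple in (10.0, 2.0, 1.1, 1.001):
    print(f"at r = {multiple} r_s the clock runs at "
          f"{hovering_rate(multiple * r_s):.4f} of the far-away rate")
# The rate tends to zero as r approaches the horizon at r_s.
```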
(14) There is no such thing as right now when you are far away.
The only reason that there is such a thing as THE correct time for a distant event is that we accept the convention of trusting reports from just one clock, our standard clock or master clock. By convention, our standard clock reports what time it is at the Greenwich Observatory in London, England. But relativity theory allows other conventions, and this allows a single distant event to occur at a range of times depending upon the convention adopted.
(15) You have enough time left in your life to visit the far side of the galaxy and return.
One philosophically interesting implication of time dilation in relativity theory is that in your lifetime, without using cryogenics, you have enough time to visit the far side of our Milky Way galaxy 100,000 light years away from Earth and then return to report on your adventure to your descendants many generations from now. As your spaceship approaches the speed of light, you can cross the galaxy in hardly any proper time at all, even though someone using the coordinate time of the standard Earth-based clock must judge that it took you over 100,000 years to cross the galaxy one-way. Both time judgments would be correct. The faster you move, the more time you have to visit new places, because the distance of travel shrinks, too. Because of space contraction due to your high speed, the far side of our Milky Way is no longer 100,000 light years away as judged by your distance measurements. You can approach but never actually reach the cosmic speed limit of traveling at light speed, but the closer you get to that speed the closer you get to experiencing no time at all (as measured by your clock).
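A sketch of the arithmetic behind the trip: at Lorentz factor γ, the 100,000-light-year span contracts to 100,000/γ light years in your frame, and the crossing takes about 100,000/γ years of your proper time, while Earth-frame clocks record slightly more than 100,000 years.

```python
import math

DISTANCE_LY = 100_000.0  # Earth-frame distance across the galaxy, light years

def trip(gamma):
    """Proper time and contracted distance for a crossing at Lorentz factor gamma."""
    beta = math.sqrt(1.0 - 1.0 / gamma**2)  # speed as a fraction of c
    earth_years = DISTANCE_LY / beta        # coordinate time in the Earth frame
    proper_years = earth_years / gamma      # time elapsed on the traveler's clock
    contracted_ly = DISTANCE_LY / gamma     # distance in the traveler's frame
    return proper_years, contracted_ly

for gamma in (10.0, 1_000.0, 10_000.0):
    proper, contracted = trip(gamma)
    print(f"gamma = {gamma:>7,.0f}: {proper:>10,.1f} years of proper time; "
          f"the distance shrinks to {contracted:>8,.1f} light years")
# At gamma = 10,000 the one-way crossing takes about 10 years by your clock.
```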
(16) All the fundamental laws are invariant under time-translation.
These laws being time-translation invariant means that the fundamental laws of nature do not depend on what time it is, and they do not change as time goes by. Your health might change as time goes by, but the basic laws underlying your health do not. This translation symmetry property expresses the equivalence of all instants. This feature of time can be expressed using the language of coordinate systems by saying that replacing the time variable t everywhere in a fundamental law by t + 4 does not change what processes are allowed by the law. The choice of 4 is arbitrary; any real number would do. Requiring the laws of physics to be time-translation symmetric was proposed by Isaac Newton. In the early 20th century, the mathematician Emmy Noether discovered that time-translation symmetry implies the law of the conservation of energy.
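Here is a toy numerical check of this symmetry, a minimal sketch assuming a simple free-fall law ẍ = −g rather than any particular fundamental law:

```python
# Toy check of time-translation symmetry: a solution of the free-fall law
# x''(t) = -g remains a solution after the relabeling t -> t + 4.

g = 9.8

def x(t):
    """One solution of x'' = -g, with arbitrary initial position and velocity."""
    return 5.0 + 2.0 * t - 0.5 * g * t**2

def second_derivative(f, t, h=1e-4):
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

shifted = lambda t: x(t + 4.0)  # the same motion with its time labels shifted

for t in (0.0, 1.0, 2.5):
    print(second_derivative(x, t), second_derivative(shifted, t))
# Both columns print approximately -9.8: the shifted motion obeys the same law.
```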
(17) Almost all the fundamental physical laws are invariant under time-reversal.
This point about time-reversal symmetry can be expressed informally by saying that if you make a documentary film and show it in reverse, what you see may look very surprising or even impossible, but actually nothing shown violates a fundamental physical law. It may violate the second law of thermodynamics, but that law is not fundamental. Another way time-reversal symmetry shows itself is in the fact that the fundamental laws look the same if you change the time variable “t” to its negation “-t”.
The reason for using the hedge word “almost” is that some rarely seen decays of certain mesons do violate time-reversal symmetry, but all the common and important processes in the universe could possibly go the other way. Another way to make the point is to say the fundamental laws of physics are very nearly time-symmetric.
If almost all the fundamental laws are time-reversal symmetric, this raises the interesting question of why all the common physical processes are seen by us to go spontaneously in only one direction in time, as if time has an arrow. The arrow is shown clearly in all the common processes. Bullets explode but never un-explode. Light leaves a lit candle and never converges from all directions into it. Heat flows spontaneously from hot to cold, never the other way. This issue is examined further in the later section on the arrow of time.
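In the same spirit as the sketch under item (16), here is a toy check, again assuming a free-fall law, that a time-reversed solution still satisfies a law containing only a second time derivative:

```python
# Toy check of time-reversal symmetry: if x(t) solves x''(t) = -g,
# then the reversed motion x(-t) solves it too.

g = 9.8

def x(t):
    return 5.0 + 2.0 * t - 0.5 * g * t**2  # a solution of x'' = -g

reversed_x = lambda t: x(-t)  # the motion "played backward"

def second_derivative(f, t, h=1e-4):
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

for t in (0.0, 1.0, 2.5):
    print(second_derivative(reversed_x, t))  # approximately -9.8 each time
# A law with a first time derivative, such as damped motion x'' = -k*x',
# would fail this test, because the velocity term flips sign under t -> -t.
```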
(18) Science does not require atoms of time.
Neither the theory of relativity nor quantum mechanics requires there to be atoms of time, or any lack of temporal continuity. There is much 21st century research into this topic, and many physicists do suspect that there are atoms of time. More realistically, there probably would be atoms of spacetime, not just of time, because relativity theory implies that spacetime, not time, is fundamental.
6. Time and Change (Relationism vs. Substantivalism)
Does physical time necessarily depend on change existing, or vice versa? Philosophers have been sharply divided on these issues, and any careful treatment of them requires clarifying the relevant terms being used. Even the apparent truism that change takes time is false if the terms are used improperly.
Let’s focus on whether time necessarily involves change. If it does, then what sort of change is required? For example, would time exist in a universe that changes, yet does not change in a regular enough manner to have a clock? Those who answer “yes” are quick to point out that there is a difference between not being able to measure some entity and that entity not existing. Those who answer “no” have sometimes said that if an entity cannot be measured, then the very concept of it is meaningless—not that it must be meaningless, as a Logical Positivist would declare, but that it is meaningless as a matter of fact. The latter position is defended by Max Tegmark in (Tegmark 2017).
Classical relationists claim that time necessarily involves change, and classical substantivalists say it does not. Substantivalism (also called substantialism) implies that both space and time exist always and everywhere regardless of what else exists or changes. They say space and time provide a large, invisible, inert container within which matter exists and moves independently of the container. The container provides an absolute rest frame, and motion relative to that frame is real motion, not merely relative motion. Relationism (also called relationalism) implies space and time are not like this. It implies there is no container, so, if you take away matter’s motions, you take away time, and if you also take away the matter itself, you take away space.
Substantivalism is the thesis that space and time exist always and everywhere independently of physical material and its events.
Relationism is the thesis that space is only a set of relationships among existing physical material, and time is a set of relationships among the events of that physical material.
Relationism is inconsistent with substantivalism. Substantivalism implies there can be empty time, time without the existence of physical events. Relationism does not allow empty time. It is committed to the claim that time requires material change. That is, necessarily, if time exists, then change exists.
Everyone agrees that clocks do not function without change and that time cannot be measured without there being changes, but the present issue is whether time exists without changes. Can we solve this issue by testing? Could we, for instance, turn off all changes and then look to see whether time still exists? No, the issue has to be approached indirectly.
Relationists and substantivalists agree that, perhaps as a matter of fact, change is pervasive and so is time. What is contentious is whether time exists even if, perhaps contrary to fact, nothing is changing. This question of whether time requires change is not the question of whether change requires time, nor is it the question of whether time is fundamental.
To make progress, more clarity is needed regarding the word change. The meaning of the word is philosophically controversial. It is used here in the sense of ordinary change—an object changing its ordinary properties over time. For example, a leaf changes its location if it falls from a branch and lands on the ground. This ordinary change of location is very different from the following three extraordinary kinds of change. (1) The leaf changes by being no longer admired by Donald. (2) The leaf changes by moving farther into the past. (3) The leaf changes across space from being green at its base to brown at its tip, all at one time. So, a reader needs always to be alert about whether the word change means ordinary change or one of the extraordinary kinds of change.
There is a fourth kind of change that also is extraordinary. Consider what the word properties means when we say an object changes its properties over time. When referring to ordinary change of properties, the word properties is intended to exclude what Nelson Goodman called grue-like properties. Let us define an object to be grue if and only if, during the time that it exists, it is green before the beginning of the year 1888 but is blue thereafter. With this definition, we can conclude that the world’s chlorophyll underwent a change from grue to non-grue in 1888. We naturally would react to drawing this conclusion by saying that this change in chlorophyll is very odd, not an ordinary change in the chlorophyll, surely nothing that would be helpful to the science of biology.
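One way to make the time-indexed reading of the definition explicit is as a small predicate; a sketch, where the color inputs are stipulated toy values:

```python
from datetime import date

CHANGEOVER = date(1888, 1, 1)

def is_grue_at(color, when):
    """Time-indexed grue: being green counts as grue before 1888,
    and being blue counts as grue thereafter."""
    return color == ("green" if when < CHANGEOVER else "blue")

# Chlorophyll is green at all times, so:
print(is_grue_at("green", date(1850, 6, 1)))  # True:  grue before 1888
print(is_grue_at("green", date(1900, 6, 1)))  # False: non-grue after 1888
# Without any ordinary change in the chlorophyll, the grue vocabulary
# records a "change" from grue to non-grue in 1888.
```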
Classical substantival theories are also called absolute theories. The term absolute here implies existing without dependence on anything except perhaps God. The relationist, on the other hand, believes time’s existence depends upon material events.
Many centuries ago, the manifest image of time was relationist, but due to the influence of Isaac Newton upon the teaching of science in subsequent centuries and then this impact upon the average person who is not a scientist, the manifest image has become substantivalist.
a. History of the Debate from Aristotle to Kant
Aristotle had said, “neither does time exist without change” (Physics, Book IV, chapter 11, 218b). This claim about time is often called Aristotle’s Principle. In this sense he was Leibniz’s predecessor, although Leibniz’s relationism contains not only Aristotle’s negative element that there is no changeless time but also a positive element that describes what time is. In opposition to Aristotle on this topic, Democritus spoke of there being an existing space within which matter’s atoms move, implying space is substance-like rather than relational. So, the ancient Greek atomists were predecessors of Newton on this topic.
The battle lines between substantivalism and relationism were drawn more clearly in the early 18th century when Leibniz argued for relationism and Newton argued against it. Leibniz claimed that space is a network of objects. It is nothing but the “order of co-existing things,” so without objects there is no space. “I hold space to be something merely relative, as time is; …I hold it to be an order of coexistences, as time is an order of successions.” Time is a relational order of successions of events, with events causing other events. The typical succession-relationships Leibniz is talking about here are that this event caused that event to occur two minutes later. If asked what a specific time is, a modern Leibnizian would be apt to say a single time is a set of simultaneous events.
Opposing Leibniz, Isaac Barrow and his student Isaac Newton returned to a Democritus-like view of space as existing independently of material things; and they similarly accepted a substantival theory of time, with time existing independently of all motions and other kinds of events. Newton’s actual equations of motion and his law of gravity are consistent with both relationism and substantivalism, although this point was not clear at the time to either Leibniz or Newton.
In 1670 in his Lectiones Geometricae, the English physicist Isaac Barrow rejected any necessary linkage between time and change. He said, “Whether things run or stand still, whether we sleep or wake, time flows in its even tenor.” Barrow also said time existed even before God created the matter in the universe. Newton agreed. In Newton’s unpublished manuscript De gravitatione, written while he was composing his Principia, he said, “we cannot think that space does not exist just as we cannot think there is no duration” (Newton 1962, p. 26). This suggests that he believed time exists necessarily, and this idea may have influenced Kant’s position that time is an a priori condition of all appearance whatsoever.
Newton believed time is not a primary substance, but is like a primary substance in not being dependent on anything except God. For Newton, God chose some instant of pre-existing time at which to create the physical world. From these initial conditions, including the forces acting on the material objects, the timeless scientific laws took over and guided the material objects, with God intervening only occasionally to perform miracles. If it were not for God’s intervention, the future would be a logical consequence of the present.
Leibniz objected. He was suspicious of Newton’s substantival time because it is undetectable, which, he supposed, made the concept incoherent. Leibniz argued that time should not be understood as an entity existing independently of actual, detectable events. He complained that Newton had under-emphasized the fact that time necessarily involves an ordering of events, the “successive order of things,” such as one event happening two minutes after another. This is why time needs events, so to speak. Leibniz added that this overall order is time.
It is clear that Leibniz and Newton had very different answers to the question, “Given some event, what does it mean to say it occurs at a specific time?” Newton would say events occur at some absolute time that is independent of what other events occur, but Leibniz would say we can properly speak only about events occurring before or after or simultaneous with other events. Leibniz and Newton had a similar disagreement about space. Newton believed objects had absolute locations that need no reference to other objects’ locations, but Leibniz believed objects can be located only via spatial relations to other material objects—by an object being located above or below or three feet from another object.
One of Leibniz’s criticisms of Newton’s theory is that it violates Leibniz’s Law of the Identity of Indiscernibles: If two things or situations cannot be discerned by their different properties, then they are really identical; they are just one and not two. Newton’s absolute theory violates this law, Leibniz said, because it implies that if God had shifted the entire world some distance east and its history some minutes earlier, yet changed no properties of the objects nor relationships among the objects, then this would have been a different world—what metaphysicians call an ontologically distinct state of affairs. Leibniz claimed there would be no difference because there would be no discernible difference in the two, so there would be just one world here, not two, and so Newton’s theory of absolute space and time is faulty. This argument is called “Leibniz’s shift argument.”
Regarding the shift argument, Newton suggested that, although Leibniz’s a priori Principle of the Identity of Indiscernibles is correct, God is able to discern differences in absolute time or space that mere mortals cannot.
Leibniz offered another criticism. Newton’s theory violates Leibniz’s a priori Principle of Sufficient Reason: that there is a sufficient reason why any aspect of the universe is the way it is and not some other way. Leibniz complained that, since everything happens for a reason, if God shifted the world in time or space but made no other changes, then He surely would have no reason to do so.
Newton responded that Leibniz is correct to accept the Principle of Sufficient Reason but is incorrect to suppose there is a sufficient reason knowable to humans. God might have had His own reason for creating the universe at a given absolute place and time even though mere mortals cannot comprehend His reason.
Newton later admitted to friends that his two-part theological response to Leibniz was weak. Historians of philosophy generally agree that if Newton had said no more, he would have lost the debate.
Newton, through correspondence from his friend Clarke to Leibniz, did criticize Leibniz by saying, “the order of things succeeding each other in time is not time itself, for they may succeed each other faster or slower in the same order of succession but not in the same time.” Leibniz probably should have paid more attention to just what this remark might imply. However, Newton soon found another clever and clearer argument, one that had a much greater impact at the time. He suggested a thought experiment in which a bucket’s handle is tied to a rope hanging down from a tree branch. Partially fill the bucket with water, grasp the bucket, and, without spilling any water, rotate it many times until the rope is twisted. Do not let go of the bucket. When everything quiets down, the water surface is flat and there is no relative motion between the bucket and its water. That is situation 1. Now let go of the bucket, and let it spin until there is once again no relative motion between the bucket and its water. At this time, the bucket is spinning, and there is a concave curvature of the water surface. That is situation 2.
How can a relational theory explain the difference in the shape of the water’s surface in the two situations? It cannot, said Newton. Here is his argument. If we ignore our hands, the rope, the tree, and the rest of the universe, says Newton, each situation is simply a bucket with still water; the situations appear to differ only in the shape of their water surface. A relationist such as Leibniz cannot account for the change in shape. Newton said that even though Leibniz’s theory could not be used to explain the difference in shape, his own theory could. He said that when the bucket is not spinning, there is no motion relative to space itself, that is, to absolute space; but, when it is spinning, there is motion relative to space itself, and so space itself must be exerting a force on the water to make the concave shape. This force pushing away from the center of the bucket is called centrifugal force, and its presence is a way to detect absolute space.
Because Leibniz and his supporters had no counter to this thought experiment, for over two centuries Newton’s absolute theory of space and time was generally accepted by European scientists and philosophers, with the notable exceptions of Locke in England and d’Alembert in France.
One hundred years later, Kant entered the arena on the side of Newton. Consider two nearly identical gloves except that one is right-handed and the other is left-handed. In a world containing only a right-hand glove, said Kant, Leibniz’s theory could not account for its handedness because all the internal relationships among parts of the glove would be the same as in a world containing only a left-hand glove. However, intuitively we all know that there is a real difference between a right and a left glove, so this difference can only be due to the glove’s relationship to space itself. But if there is a space itself, then the absolute or substantival theory of space is better than the relational theory. This indirectly suggests that the absolute theory of time is better, too.
Newton’s theory of time was dominant in the 18th and 19th centuries, even though Christiaan Huygens (in the 17th century) and George Berkeley (in the 18th century) had argued in favor of Leibniz. See (Huggett 1999) and (Arthur 2014) for a clear, detailed discussion of the opposing positions of Leibniz and Newton on this issue.
b. History of the Debate after Kant
Leibniz’s criticisms of Newton’s substantivalism are clear enough, but the positive element of Leibniz’s relationism is vague. It lacks specifics: Leibniz assumed uncritically that his method for abstracting duration from change is unique, but he never defended this uniqueness assumption. That is, what exactly is it about the relationship of objects and their events that produces time and not something else? Nor did Leibniz address the issue of how to define the duration between two arbitrarily chosen events. In the twentieth century, Einstein argued successfully that duration is not unique, but is relative. Appreciating Einstein’s argument has affected the debates about substantivalism and relationism.
Newton and subsequent substantivalists hoped to find a new substance for defining absolute motion without having to appeal to the existence and location of ordinary material objects. In the late 19th century, the substantivalists discovered a candidate. It was James Clerk Maxwell’s luminiferous aether, the medium that waves when there is a light wave. Maxwell had discovered that light is an electromagnetic wave. Since all then-known waves required a medium to wave, all physicists and philosophers of science at the time believed Maxwell when he said the aether was needed as a medium for the propagation of electromagnetic waves and also when he said that it definitely did exist even if it had never been directly detected. Yet this was Maxwell’s intuition speaking; his own equations did not require a medium for the propagation.
The idea was that velocity relative to the aether was the “real velocity” or “absolute velocity” of an object, as opposed to a velocity relative to some ordinary object like the Earth or a train. The aether would provide a basis for measuring “real” time, that is, absolute time.
Late in the 19th century, the physicist A. A. Michelson and his chemist colleague Edward Morley set out to experimentally detect the aether. Their assumption was that at different times of the year along the Earth’s path around the Sun, the Earth would move at different angles relative to the aether, and thus the speed of light measured on Earth would differ at different times of the year. Their interferometer experiment was very sensitive, but somehow it failed to detect an aether or any difference in the speed of light, even though the experiment was at the time the most sensitive experiment in the history of physics and apparently should have detected it. Some physicists, including Michelson himself, believed the failure was due to the fact that he needed a better experimental apparatus. Other physicists believed that the aether was somehow corrupting the apparatus. Most others, however, believed the physicist A. J. Fresnel, who said the Earth is dragging a layer of the aether with it, so the Earth’s nearby aether is moving in concert with the Earth itself. If so, this would make the aether undetectable by the Michelson-Morley experimental apparatus, as long as the apparatus was used on Earth and not in outer space. No significant physicist said there was no aether to be detected.
However, these ad hoc rescues of the aether hypothesis did not last long. In 1893, the physicist-philosopher Ernst Mach, who had a powerful influence on Albert Einstein, offered an original argument that attacked Newton’s bucket argument, promoted relationism, and did not assume the existence of absolute space (the aether) or absolute time. Absolute time, said Mach, “is an idle metaphysical conception.” Mach claimed Newton’s error was in not considering the presence or absence of stars or, more specifically, not considering the combined gravitational influence of all the matter in the universe beyond the bucket. That is what was curving the water surface in the bucket when the water was spinning.
To explore Mach’s argument, consider a female ballet dancer who pirouettes in otherwise empty space. Her arms always splay out from her body as she spins, but would her arms have to do so in this thought experiment? Leibniz would answer “no.” Newton would answer “yes.” Similarly, if we were to spin Newton’s bucket of water in otherwise empty space, would the presence of absolute space eventually cause the surface of the water to become concave? Leibniz would answer “no.” Newton would answer “yes.” Mach would say the question makes no sense because the very notion of spin must be spin relative to some object, such as the surrounding stars. Mach would add that, if the distant stars were retained in the thought experiment, then there would be spin relative to the stars, and he would change his answers to “yes.” Newton believed the presence or absence of the distant stars is irrelevant to the situations with a spinning ballet dancer and a spinning bucket of water. Unfortunately, Mach did not provide any detailed specification of how the distant stars exerted their influence on Newton’s bucket or on a ballet dancer, and he had no suggestion for an experiment to test his answer; and so nearly all physicists and philosophers of physics were not convinced by Mach’s reasoning. Thus, the prevailing orthodoxy was that, because of Maxwell’s aether, Newton’s substantivalism is correct.
It is surprising that so little was said at the time about the asymmetry in the two bucket scenarios. In the second one, the water is rotating along with the bucket, and that implies change of velocity and thus acceleration, an acceleration that does not occur in the first scenario, and that might be the key to explaining the puzzle without relying upon the distant stars or upon an underlying spatial substance such as an aether.
A young physicist named Albert Einstein was very intrigued by Mach’s remarks. He at first thought Mach was correct, and even wrote him a letter saying so, but he eventually rejected Mach’s position and took an original, relationist position on the issue.
In 1905, he proposed his special theory of relativity that does not require the existence of either Newton’s absolute space or Maxwell’s aether. Ten years later he added a description of gravity and produced his general theory of relativity, which had the same implication. The theory was immediately understood by the leading physicists, and, when experimentally confirmed, it caused the physics and philosophy communities to abandon classical substantivalism. The tide quickly turned against what Newton had said in his Principia, namely that “Time exists in and of itself and flows equably without reference to anything external.” Influenced by relativity theory, the philosopher Bertrand Russell became an articulate promoter of relationism in the early twentieth century.
Waxing philosophical in The New York Times newspaper in 1919, Einstein declared his general relativity theory to be a victory for relationism:
Till now it was believed that time and space existed by themselves, even if there was nothing—no Sun, no Earth, no stars—while now we know that time and space are not the vessel for the Universe, but could not exist at all if there were no contents, namely, no Sun, no Earth, and other celestial bodies.
Those remarks show Einstein believed in relationism at this time. However, in his Nobel Prize acceptance speech three years later in 1922, Einstein backtracked on this and took a more substantivalist position by saying time and space could continue to exist without the Sun, Earth, and other celestial bodies. He claimed that, although relativity theory does rule out Maxwell’s aether and Newton’s absolute space, it does not rule out some other underlying substance that is pervasive. All that is required is that, if such a substance exists, then it must obey the principles of the theory of relativity. Soon he was saying this substance is space-time itself—a field whose intrinsic curvature is what we call gravitational force. With this position, he is a non-Newtonian, non-Maxwellian substantivalist. Rejecting classical substantivalism, Einstein said that spacetime, “does not claim an existence of its own, but only as a structural quality of the [gravitational] field.”
This pro-substantivalism position has been subsequently strengthened by the 1998 experimental discovery of dark energy which eventually was interpreted as indicating that space itself has inertia and is expanding faster and faster. Because spacetime itself can curve and can have ripples (from gravitational waves) and can expand in volume, the pro-substantivalist position became the most popular position in the 21st century. Nevertheless, there are interesting challenges, and the issue is open.
In the 21st century it is widely accepted that spacetime can curve (near large or small masses), expand (when the universe’s volume increases), and ripple (when gravitational waves pass by). Those are properties one commonly associates with an underlying medium.
Quantum field theory provides another reason to accept substantivalism. This theory is the result of applying quantum mechanics to fields. The assumption of Leibniz and Newton that fundamentally there are particles in space and time buffeted about by forces was rejected due to the rise of quantum field theory in the late twentieth century. It became clear that fields are better candidates than particles for the fundamental entities of the universe. Physicists influenced by logical positivism once worried that perhaps Einstein’s gravitation field, and all other fields, are merely computational devices without independent reality. However, ever since the demise of logical positivism and the development and confirmation of quantum electrodynamics in the late twentieth century, fields have been considered to be real by both physicists and philosophers. What once were called “fundamental particles” still exist, but only as weakly emergent entities from fundamental fields. Because quantum field theory implies that a field does not go away even if the field’s values reach a minimum everywhere, the gravitational field is considered to be substance-like, but it is a substance that changes with the distribution of matter-energy throughout the universe, so it is very unlike Newton’s absolute space or Maxwell’s aether. The philosophers John Earman and John Norton have called this position (of promoting the substance-like character of the gravitational field) manifold substantivalism. In response, the philosopher of physics Tim Maudlin said: “The question is: Why should any serious substantivalist settle on manifold substantivalism? What would recommend that view? Prima facie it seems like a peculiar position to hold” because the manifold has no spatiotemporal structure (Maudlin 1988, p. 87).
Since the late twentieth century, philosophers have continued to create new arguments for and against substantivalism, so the issue is still open. Nevertheless, many other scientists and philosophers have suggested that the rise of quantum field theory has so changed the concepts in the Newton-Leibniz debate that the old issue cannot be settled either way. The cosmologist Lawrence Krauss remarked that “Quantum mechanics blurs the distinction between something and nothing” because the vacuum according to quantum mechanics always contains fields and particles even at the lowest possible energy level.
For additional discussion of substantivalism and relationism, see (Dainton 2010, chapter 21).
7. Is There a Beginning or End to Time?
This section surveys some of the principal, well-informed speculations about the beginning and end of time. The emphasis should be on “speculations” because there are hundreds of competing ideas about the beginning and end of the universe and of time, and none of the ideas are necessary to explain any actual observations. There is no consensus about whether time is infinite in the future, and there is none about whether time is infinite in the past. For all we know, we may never know the answer to these questions, despite our being better informed on the issue than were our predecessors. One cautionary note is that researchers sometimes speak of time existing before the beginning of the universe, so perhaps what they mean by the word “universe” is not as comprehensive as what others mean. Also, researchers sometimes speak of the creation of a universe from the physicists’ quantum vacuum and call this creation ex nihilo, but a quantum vacuum is not nothing, at least not nothing in the sense used by many philosophers, so the label can be misleading to philosophers.
a. The Beginning
Many persons have argued that the way to show there must have been a first event is to show that time has a finite past. But this is a mistake. The universe can have a finite past but no first event. This point is illustrated with the positive real numbers. All positive real numbers less than five and greater than zero have predecessors, but there is no first number in this series even though it has a finite measure of 5. For any positive real number in the series, there is a smaller one without there being a smallest one.
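The illustration can be put in a few lines of code: starting anywhere in the open interval (0, 5), halving always yields a smaller member, so the interval has no least element even though its total length is finite.

```python
# The open interval (0, 5) has finite length 5, yet no smallest member:
# for any x in it, x/2 is also in it and is smaller. By analogy, a past of
# finite duration need not contain a first event.

x = 5.0
for _ in range(6):
    x = x / 2.0
    print(f"{x} lies in (0, 5), and {x / 2.0} is smaller still")
```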
Many theologians are confident that there was a beginning to time, but there is no agreement among cosmologists that there ever was a beginning.
Immanuel Kant argued that contradictions could be deduced from the claim that the universe had a beginning. But he also believed contradictions followed from the claim that it did not have a beginning.
Relativity theory and quantum mechanics both allow time to be infinite in the future and the past. Thus any restrictions on time’s extent must come from other sources. Regarding the beginning of time, some cosmologists believe the universe began with a big bang 13.8 billion years ago. This is the t = 0 of cosmic time used by professional cosmologists. The main controversy is whether t = 0 is really the beginning. Your not being able to imagine there not being a time before the big bang does not imply there is such an earlier time, although this style of argument might have been acceptable to the ancient Greek philosophers. The cosmologist Stephen Hawking once famously quipped that asking for what happened before the big bang is like asking what is north of the north pole. He later retracted that remark and said it is an open question whether there was a time before the big bang.
If the universe began at the big bang, there is the problem that it appears as if something begins from nothing, that is, the universe emerges from no universe. This transition violates the apparently strongly-confirmed law of the conservation of energy. Those who wish to retain the law as having no exceptions say we are required to conclude that the total energy of the universe is zero or else to conclude that there always has been a universe with energy and there always will be such a universe.
There are a great many detailed physical theories that are conjectures about our origins. One is that the big bang was the beginning. Another is that the universe had an infinite past in which nothing of interest happened, then abruptly the big bang began.
Even if there were a time before the big bang began, the question would remain as to whether the extent of this prior time is finite or infinite, and there is no consensus on that question either.
The big bounce theory of cosmology says the small, expanding volume of the universe 13.8 billion years ago was the effect of a prior multi-billion-year compression that, when the universe became small enough, stopped its compression and began a rapid expansion that we have been calling the big bang. Perhaps there have been repetitions of compression followed by expansion, and perhaps these cycles have been occurring forever and will continue occurring forever. This is the theory of a cyclic universe.
The Hawking-Hartle No Boundary Proposal suggests that the universe had no time, then suddenly one dimension of space converted to a dimension of time.
Cosmologist J. Richard Gott speculated that time had no first instant but rather began in an unusual process in which the universe came from itself by a process of backward causation. He suggests that backward causation is consistent with general relativity. At the beginning of the universe, he says, there was a closed time-like loop that lasted for 10^-44 seconds during which the universe caused its own existence. Past time is finite, and the loop was a beginning of time without a first event. See (Gott 2002) for an informal presentation of the idea, which has not been promoted by many other cosmologists.
b. The End
The cosmologists’ favorite scenario for the universe’s destiny implies that all stars burn out in about 100 trillion years, matter falls into black holes, but then, after 10^100 more years, all black holes eventually evaporate, and the remaining particles of radiation get ever farther from each other, with no end to the dilution and cooling while all the ripples of space-time become weaker. This resulting scenario with the entire universe reaching a common temperature is called the heat death, the big chill, and also the big freeze. This is our most likely fate if the large-scale geometry of the universe is flat (that is, 3D Euclidean), which is suspected, but not known very confidently. The scenario also assumes the total energy of the universe is not zero, which is a controversial assumption, too, because there is no direct evidence for it or against it.
The heat death may occur, but it is likely not to be the end of absolutely all structure because the laws of quantum mechanics imply there is always a small probability of a fluctuation of some structure from no-structure. Fluctuations usually last an extremely short amount of time, but they do not have to. So, given enough time, an extremely, extremely long time, there is a chance that anything physically possible will happen.
Here is a summary of some serious, competing suggestions by twenty-first-century cosmologists about our universe’s future. The list begins with the most popular one:
Heat Death—Big Chill (Expansion of space at an ever-increasing rate as entropy increases toward a common equilibrium temperature for the entire universe. Assumes the large-scale shape of the universe is perfectly flat.) An infinite future.
Big Crunch (The universe is expanding; eventually the expansion stops somehow; and the universe begins contracting to a final compressed state as if the big bang is running in reverse.) A finite future.
Big Bounce (Eternal pattern of cycles of expansion, then compression, then expansion, then compression, and so forth. One version implies there are repeated returns to a microscopic volume, with each being followed by a new big bang.) An infinite future.
Cycles without Crunches (While the universe expands forever and never contracts, the observable part of the universe can oscillate between expansions and contractions with a big bounce separating a contraction from the next expansion.) An infinite future.
Big Rip (Dark energy runs wild. The expansion rate of dark energy is not a Cosmological Constant but instead increases exponentially toward infinity. As this happens, every complex system that interacts gravitationally is eventually pushed apart—first groups of galaxies, then galaxies, later the planets, then all the molecules, and even the fabric of space itself.) A finite future.
Big Snap (The fabric of space suddenly reveals a lethal granular nature when stretched too much, and it “snaps” as an overly stretched rubber band does.) A finite future.
Death Bubble (Due to some high energy event such as the creation of a tiny black hole with a size never created before, our metastable Higgs field suddenly changes its value from the current false vacuum value to some more stable true vacuum value. This is analogous to supercooled distilled water being disturbed and rapidly turning to ice. The energy of the vacuum decay that this collapse creates appears as a 3D bubble with no inside that expands at nearly the speed of light while destroying the structure of everything in its path. Meanwhile our space rapidly contracts and crushes together all the remaining debris.) A finite future.
Mirror Universe (Before the big bang, time runs in reverse. Both the big bang’s before-region and after-region evolve from a tiny situation at cosmic time t = 0 in which the apexes of their two light cones meet. The two regions are almost mirror images of each other. On some versions, the light cones meet at a point; on other versions, they might meet in an infinitesimal region.) There are versions with a finite future and with an infinite future.
These theories have been described in detail with mathematical physics, but they are merely hypotheses in the sense that none are tied to any decisive experimental results, at least so far. The Big Crunch was the most popular theory among cosmologists until the 1960s. In that theory, the universe continues its present expansion for about three billion more years until the inward pull due to the mutual gravitation among all the universe’s matter-energy overcomes that expansion, thereby causing a subsequent seven billion years of contraction until everything is compressed together into a black hole.
See (Mack 2020) and (Hossenfelder 2022, chapter two) for a presentation by two cosmologists of many of the competing theories about the beginning and the end of time.
c. Historical Answers
There has been much speculation over the centuries about the extent of the past and the future, although almost all remarks have contained serious ambiguities. For example, regarding the end of time, is this meant in the sense of (a) the end of humanity, or (b) the end of life, or (c) the end of the universe that was created by God, but not counting God, or (d) the end of all natural and supernatural change? Intimately related to these questions are two others: (1) Is it being assumed that time exists without change, and (2) what is meant by the term change? With these cautions in mind, below there is a brief summary of conjectures throughout the centuries about whether time has a beginning or an end.
Regarding the beginning of time, the Roman atomist Lucretius in about 50 B.C.E. said in his poem De Rerum Natura:
For surely the atoms did not hold council, assigning order to each, flexing their keen minds with questions of place and motion and who goes where.
But shuffled and jumbled in many ways, in the course of endless time they are buffeted, driven along chancing upon all motions, combinations.
At last they fall into such an arrangement as would create this universe.
The implication is that time has always existed, but that an organized universe began a finite time ago with a random fluctuation.
Plato and Aristotle, both of whom were opponents of the atomists, agreed with them that the past is infinite or eternal. Aristotle offered two reasons. Time had no beginning because, for any time, we always can imagine an earlier time. In addition, time had no beginning because everything in the world has a prior, efficient cause. In the fifth century, Augustine disagreed with Aristotle and said time itself came into existence by an act of God a finite time ago, but God, himself, does not exist in time. This is a cryptic answer because it is not based on a well-justified and detailed theory of who God is, how He caused the big bang, and how He can exist but not be in time. It is also difficult to understand St. Augustine’s remark that “time itself was made by God.” On the other hand, for a person of faith, belief in their God usually outweighs belief in any scientific hypothesis, any desire for scientific justification of their remark about God, and the importance of satisfying any philosopher’s demand for clarification.
Agreeing with Augustine against Aristotle, Martin Luther estimated the universe to have begun in 4,000 B.C.E. Then Johannes Kepler estimated that it began in 4,004 B.C.E. In the early seventeenth century, the Calvinist James Ussher calculated from the Bible that the world began in 4,004 B.C.E. on Friday, October 28.
In about 1700, Isaac Newton claimed future time is infinite and that, although God created the material world some finite time ago, there was an infinite period of past time before that, as Lucretius had also claimed.
Advances in geology eventually refuted the low estimates that the universe was created in 4,000 B.C.E.
Twentieth and twenty-first century astronomers say the universe is at least as old as the big bang which began about 13.8 billion years ago.
For more discussion of the issue of the extent of time, see the companion section Infinite Time.
8. Emergence of Time
To ask whether time emerges ontologically is to ask what it depends on, not how it changes over time. Is physical time emergent, or is it instead a fundamental feature of nature? That is, is it basic, elementary, not derivative, or does it emerge at a much higher level of description from more basic timeless features? Experts are not sure of the answer, although they agree that time does not emerge from spacetime; rather, time is a special feature or designated dimension of spacetime. The most favored candidate for what spacetime emerges from is the quantum wave function, and in particular quantum entanglement. Entanglement is a matter of degree. These quantum features of reality are explained in the companion article “What Else Science Requires of Time (That Philosophers Should Know).”
The word emerge has been used in different ways in the philosophical literature. Some persons define emergence as a whole emerging from but being greater than the sum of its parts. There are better, less vague definitions. The word “emerge” in this article is intended to indicate the appearance of an objective or mind-independent feature of nature in the philosopher Mark Bedau’s sense of “weak emergence.” If time were to obey laws that are not entailed by the lower and more fundamental laws, then the emergence is called “strong emergence.” The emphasis in this section is not on epistemological emergence but rather on ontological emergence.
A paradigm example of weak emergence is temperature emerging from the kinetic energy of molecules. A system’s temperature is entailed by the kinetic energy of its molecules, even though no molecule alone has a temperature. No human being could use all the micro-information about molecular positions and momenta in their cup of coffee to determine the temperature of the coffee. It would be an overwhelming amount of information. Weak emergence is about new features supervening upon more basic features but not existing at that more basic level. A supervenes on B if changes in A require there to be changes in B. Temperature supervenes on molecular motion because the temperature of an object cannot change without there being changes in the object’s molecular motions. Even though the low-scale laws entail the high-scale behavior and high-scale laws, it is rare in practice for a higher level concept to be explicitly defined in terms of a lower level concept, even if it can be in principle. Strong emergence denies the supervenience and emphasizes the independence of the emergent concept from a lower level. Many philosophers have claimed that consciousness strongly emerges from the human body and that there can be a change in consciousness without any change in the configuration of the molecules in the body. The British Emergentists of the 19th century believed this. Physicists favor weak emergence over strong emergence.
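A sketch of that paradigm example: for a single degree of freedom, the mean molecular kinetic energy fixes the temperature via ⟨KE⟩ = ½k_BT. The molecular mass and speed distribution below are illustrative assumptions, not measured values.

```python
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K
MASS = 4.65e-26     # roughly the mass of one nitrogen molecule, kg

# Toy one-dimensional velocities for a large ensemble of molecules.
speeds = [random.gauss(0.0, 300.0) for _ in range(100_000)]

mean_ke = sum(0.5 * MASS * v**2 for v in speeds) / len(speeds)
temperature = 2.0 * mean_ke / K_B  # <KE> = (1/2) k_B T for one degree of freedom
print(f"emergent temperature: about {temperature:.0f} K")  # near 300 K
# No single molecule has a temperature; the ensemble does.
```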
When we ask whether time emerges, the notion of being emergent does not imply being inexplicable, and it does not imply that there is a process occurring over time in which something appears that was not there before the process began, in analogy to an oak tree emerging from an acorn. So, we are not saying that (at one time) there was no time, and then (at a later time) time emerged. The philosopher Daniel Dennett helpfully recommended treating an emergent entity as a real pattern that has an explanatory and predictive role in the theory positing the entity, but it is a pattern at a higher or less-detailed level. Information is lost as one moves to higher levels, but the move to a higher level can reveal real patterns and promote understanding of nature that would never be noticed by focusing only on the fundamental level. Typically the information at the higher level that is not thrown away involves what philosophers of science have called “observables” and “macro variables.”
To say that something is emergent is to say that it’s part of an approximate description of reality that is valid at a certain (usually macroscopic) level, and is to be contrasted with “fundamental” things, which are part of an exact description at the microscopic level….Fundamental versus emergent is one distinction, and real versus not-real is a completely separate one (Carroll 2019, p. 235).
Believing that time will be considered coarse-grained or weakly emergent in any future, successful theory of quantum gravity, the theoretical astrophysicist Sean Carroll says, “Time is just an approximation….” Carlo Rovelli agrees:
Space-time is…an approximation. In the elementary grammar of the world, there is neither space nor time—only processes that transform physical quantities from one to another…. At the most fundamental level that we currently know of,…there is little that resembles time as we experience it. There is no special variable “time,” there is no difference between past and future, there is no spacetime (Rovelli 2018a, p. 195).
We properly and usefully speak of persons being made of atoms. In that sense, a person is just an approximation; so is a planet. The point of saying a new concept emerges at a higher level is not merely that much lower-level information is lost in using higher-level concepts. The main point is that the higher-level concept gives access to real higher-level patterns, patterns needed for explanations that could not easily be appreciated by humans using only lower-level concepts. The point is to find especially useful patterns at the higher level in order to improve our describing, explaining, and understanding of nature. The concept of time is needed for this.
Eliminativism is the theory in ontology that says emergent entities are unreal. So, if time is emergent, it is not real. Similarly, if pain is emergent, it is not real, and therefore no person has ever really felt a pain. This theory is also called strong emergentism. The more popular position in ontology is weak emergentism, or anti-eliminativism. It implies that emergent entities are real patterns of fundamental entities despite being emergent and despite our not knowing how to reduce the higher-level concept to the lower-level one, even though Laplace’s Demon would know how to perform the reduction. Carroll claims emergence is an objective relationship between two theories, the micro one and the macro one, and it holds regardless of human inability to understand the relationship.
An important philosophical issue is to decide which level is the fundamental one. Being fundamental is relative to the speaker’s purpose. Biologists and physicists have different purposes. To a biologist, the hunger causing you to visit the supermarket emerges from the fundamental level of cellular activity. But to a physicist, the level of cellular activity is not fundamental but rather emerges from the more fundamental level of elementary particle activity which in turn emerges from the even more fundamental level of fluctuations in elementary quantum fields.
In another sense of emergence, the one in which we say a large oak tree emerged later from a small acorn, some physicists speculate that time emerged from space. Early in the big bang period, there were an infinite number of dimensions of space and none of time. As the universe expanded and cooled, these eventually collapsed into four dimensions of space and still none of time. Then one of those space dimensions disappeared as the time dimension emerged, leaving our current four-dimensional spacetime. (This description, especially its use of the word “then,” seems to imply that there was time before time began, but that is a problem with the English language and not with what is intended by the description.)
Some physicists believe that time is fundamental and not ontologically emergent. In 2004, after winning the Nobel Prize in physics, David Gross expressed that viewpoint. In speaking about string theory, which is his favored theory for somehow reconciling the inconsistency between quantum mechanics and the general theory of relativity, he said:
Everyone in string theory is convinced…that spacetime is doomed. But we don’t know what it’s replaced by. We have an enormous amount of evidence that space is doomed. We even have examples, mathematically well-defined examples, where space is an emergent concept…. But in my opinion the tough problem that has not yet been faced up to at all is, “How do we imagine a dynamical theory of physics in which time is emergent?” …All the examples we have do not have an emergent time. They have emergent space but not time. It is very hard for me to imagine a formulation of physics without time as a primary concept because physics is typically thought of as predicting the future given the past. We have unitary time evolution. How could we have a theory of physics where we start with something in which time is never mentioned?
By “doomed,” Gross means not fundamental but rather ontologically emergent.
The physicist Carlo Rovelli, a proponent of loop quantum gravity rather than string theory, has a suggestion for what the fundamental level is from which time emerges. It is a configuration of loops. He conjectured: “At the fundamental level, the world is a collection of events not ordered in time” (Rovelli 2018a, p. 155). Rovelli is re-imagining the relationship between time and change. Spacetime emerges from a configuration of loops, analogous to the way a vest of chainmail emerges from a properly connected set of tiny circular chain links. Nevertheless, at the macroscopic level (above the Planck level), he would say time does exist even though it is not a fundamental feature of reality.
The physicist Stephen Wolfram believes the atoms of time have a duration of only 10⁻¹⁰⁰ seconds. This is the time the universe needs to update itself to the next state, in analogy to a computer updating itself to the next state according to its internal clock. Wolfram asserts that the universe is a cosmic computer, and time is the progression of the universe’s computations. All physical change is a computation. He envisions the fundamental entities in the universe to be represented as a finite collection of 10⁴⁰⁰ space atoms, with time atoms lasting for 10⁻¹⁰⁰ seconds. So, there is quite a bit of parallel processing going on throughout the universe. One of Wolfram’s critics, the philosopher of physics Tim Maudlin, reacted by remarking, “The physics determines the computational structure, not the other way around.”
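The spirit of Wolfram’s picture, though not his actual model, can be conveyed by a toy computation in which a one-dimensional “universe” of cells updates itself in discrete ticks, each tick playing the role of one atom of time (the update rule and the sizes here are illustrative inventions):

# Elementary cellular automaton rule 110: each cell's next state
# depends only on itself and its two neighbors.
RULE = {(1,1,1): 0, (1,1,0): 1, (1,0,1): 1, (1,0,0): 0,
        (0,1,1): 1, (0,1,0): 1, (0,0,1): 1, (0,0,0): 0}

def step(cells):
    n = len(cells)
    return [RULE[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
            for i in range(n)]

universe = [0] * 30 + [1] + [0] * 30  # an arbitrary initial state
for tick in range(5):                 # each pass is one "atom of time"
    universe = step(universe)
    print("".join("#" if c else "." for c in universe))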
The English physicist Julian Barbour is an eliminativist and strong emergentist about time. He said the “universe is static. Nothing happens; there is being but no becoming. The flow of time and motion are illusions” (Barbour 2009, p. 1). He argued that, although there does exist objectively an infinity of instantaneous moments, nevertheless there is no objective happens-before ordering of them, no objective time order. There is just a vast, jumbled heap of moments. Each moment is an instantaneous configuration (relative to one reference frame) of all the objects in space. Like a photograph, a moment or configuration contains information about change, but it, itself, does not change. If the universe is as Barbour describes, then space (the relative spatial relationships within a configuration) is ontologically fundamental and a continuum, but time is neither. Time is unreal, and at best emerges as some general measure of the differences among the existing spatial configurations. For more on Barbour’s position, see (Smolin 2013, pp. 84-88).
Sean Carroll has a different idea about time. He is not an eliminativist, but is a weak emergentist who claims in (Carroll 2019) that time and everything else in the universe emerges from the universe’s wave function in a “gravitized quantum theory.” The only fundamental entity in the universe is the wave function. Everything else that is real emerges from the wave function that obeys Schrödinger’s equation. This gives a physical interpretation of the wave function. Carroll says neither time, space, nor even spacetime is fundamental. These features emerge from the quantum wave function. So, spacetime is merely an approximation to reality and does not exist at the most fundamental level.
Carroll points to a result by Juan Maldacena regarding two different versions of the very same theory of cosmology, but the two versions differ on the number of their spacetime dimensions (Carroll 2010, 282). This suggests to Carroll that our own four-dimensional spacetime probably emerges and is not fundamental.
Another proposal is that whether time is emergent may not have a unique answer. Perhaps time is relative to a characterization of nature. That is, perhaps there are alternative, but empirically adequate theoretical characterizations of nature, yet time is fundamental in one characterization but emergent in another. This idea is influenced by Quine’s ontological relativity.
For more description of the different, detailed speculations on whether time is among the fundamental constituents of reality, see (Merali 2013) and (Rovelli 2018b).
9. Convention
One philosophical issue is to identify which features of time are conventional and which are not. A convention is a widely agreed upon assumption, and it is not a hypothesis. The clearest way to specify the conventional elements in a theory would be by axiomatizing it, but there is no such precise theory of time.
The issue about convention is conventional vs. factual, not conventional vs. foolish nor conventional vs. impractical. Although the term “convention” is somewhat vague, conventions as used here are up to our civilization to freely adopt and are not objective features of the external world that we are forced to accept if we seek the truth. Conventions are inventions, as opposed to being natural or mandatory or factual. It is a convention that the English word “green” means green, but it is not a convention that the color of normal, healthy leaves is green.
Conventions need not be arbitrary; they can be useful or have other pragmatic virtues. Nevertheless, if a feature is conventional, then there must in some sense be reasonable alternative conventions that could have been adopted. Also, conventions can be explicit or implicit. For one last caution, conventions can become recognized as having been facts all along. The assumption that matter is composed of atoms was a useful convention in late nineteenth century physics; but, after Einstein’s explanation of Brownian motion in terms of atoms, the convention was generally recognized by physicists as having been a fact all along.
When Westerners talk about past centuries, they agree to use both C.E. and B.C.E. A clock measuring B.C.E. periods would count toward lower numbers. The clock on today’s wall always counts up, but that is merely because it is agreed we intend to use it only in the C.E. era, so there is no need for the clock to count in B.C.E. time. The choice of the origin of the time coordinate is an uncontroversial convention, too. The choice might have been an event in Muhammad’s life or a Jesus event or a Temple event or the big bang event.
The duration of the second is universally recognized to be a conventional feature. Our society could have chosen it to be longer or shorter. It is a convention that there are sixty seconds in a minute rather than sixty-six, and that no week fails to contain a Tuesday. It is a convention that we choose time coordinates so that time goes forward as the coordinate numbers get larger rather than smaller.
The following convention is not free of controversy: It is a convention about which event here now is simultaneous with which events there then. The controversy is discussed later in this article and also at The Relativity of Simultaneity.
In a single reference frame, if event 1 happens before event 2, and event 2 happens before event 3, must event 1 also happen before event 3 as a matter of fact, or is this a matter of convention? This transitivity of the happens-before relation in any single reference frame is a general feature of time, not a convention. It is implied by relativity theory; it is helpful to believe; no one has ever seen evidence that this transitivity is violated; and there are no reputable theories implying that there should be such evidence.
Time in physics is measured with real numbers (decimal numbers) rather than imaginary numbers (such as the square root of negative one). Does this reveal a deep feature of time? No, it is simply an uncontroversial convention.
How do we know the speed of light is the same in all directions? Is this a fact, or is it a convention? This is a controversial issue in the philosophy of physics. Einstein claimed it was a convention and untestable, but the philosophers B. Ellis and P. Bowman in 1967, and D. Malament in 1977, gave different reasons why Einstein is mistaken. For an introduction to this dispute, see The Conventionality of Simultaneity.
It is a useful convention that clocks are re-set by one hour as one moves across a time-zone on the Earth’s surface, in order to keep future midnights from occurring during the daylight; for similar calendrical reasons, leap days and leap seconds are used. The minor adjustments with leap seconds are required because the Earth’s rotations are not exactly regular, mostly due to friction from ocean tides. Back in the time of dinosaurs, the rotation took only 23.5 hours. And the mass of the Earth increases continually as space dust lands. So, without conventions about re-setting clocks, one of these days the sun would be shining overhead at midnight.
Consider the ordinary way a clock is used to measure how long a nearby event lasts. We adopt the following metric, or method: Take the time at which the event ends, say 5:00, and subtract the time at which it starts, say the previous 3:00. The metric procedure says to take the absolute value of the difference between the two numbers; this method yields the answer of two hours. Is the use of this method merely a convention, or in some objective sense is it the only way that a clock could and should be used? That is, is there an objective metric, or is time metrically amorphous? Philosophers of physics do not agree on this. The philosopher of physics Adolf Grünbaum has argued that the method is conventional. Perhaps the duration between instants x and y could be:
|log(y/x)|
instead of the ordinary:
|y – x|.
A virtue of both metrics is that duration cannot be negative. The trouble with the log metric is not a failure of additivity; for any three point events x, y, and z with t(x) < t(y) < t(z), the duration from x to y plus the duration from y to z does equal the duration from x to z, wherever the logarithm is defined. The trouble is rather that the log metric is sensitive to the choice of the origin of the time coordinate: shifting every time coordinate by the same constant changes the durations the log metric assigns, whereas the standard metric is unaffected by such a shift. The philosophical issue is whether a metric must have this insensitivity to the choice of origin for any reason other than convenience.
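Here is a worked comparison, added for illustration. Consider the interval from instant 1 to instant 2 and the interval from instant 11 to instant 12:

|2 – 1| = |12 – 11| = 1, but |log(2/1)| ≈ 0.693 while |log(12/11)| ≈ 0.087.

The standard metric assigns the two intervals the same duration; the log metric does not, because the duration it assigns depends on where the zero of the time coordinate happens to be placed.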
It is an interesting fact and not a convention that our universe is even capable of having a standard clock that measures both electromagnetic events and gravitational events and that electromagnetic time stays in synchrony with gravitational time.
It is a fact and not a convention that our universe contains a wide variety of phenomena that are sufficiently regular in their ticking to serve as clocks. They are sufficiently regular because they tick in adequate synchrony with the standard clock. The word “adequate” here means successful for the purposes we have for using a clock.
Physicists regularly assume they may use the concept of a point of continuous time; they might say some event happened the square root of three seconds after another event. Physicists usually accept uncritically that a point of time is real-valued, but philosophers of physics disagree with each other about whether this is merely a useful convention. Whitehead argued that it is not a convention; it is a false hypothesis.
Our society’s standard clock tells everyone what time it really is. Can our standard clock be inaccurate? “Yes,” say the objectivists about the standard clock. “No,” say the conventionalists who claim the standard clock is accurate only by convention; if it acts strangely, then all other clocks must act equally strangely in order to stay in synchrony with the standard clock. For an example of strangeness, suppose our standard clock used the periodic rotations of the Earth relative to the background stars. In that case, if a comet struck Earth and affected the rotational speed of the Earth (as judged by, say, a pendulum clock), then we would be forced to say the rotation speed of the Earth did not really change but rather the other periodic clock-like phenomena such as swinging pendulums and quartz crystal oscillations all changed in unison because of the comet strike. The comet “broke” those clocks. That would be a strange conclusion to draw, and in fact, for just this reason, 21st century physicists have rejected any standard clock that is based on Earth rotations and have chosen a newer standard clock that is based on atomic phenomena. Atomic phenomena are unaffected by comet strikes.
A good choice of a standard clock makes the application of physics much simpler. A closely related philosophical question about the choice of the standard clock is whether, when we change our standard clock, we are merely adopting constitutive conventions for our convenience, or in some objective sense we can be making a choice that is closer to being correct. For more on this point, see this article’s Frequently Asked Questions.
The special theory of relativity is believed by most physicists to imply that the notion of now or the present is conventional because it depends on which person’s present is being referred to. Many philosophers, but not a majority, disagree and believe in an objective present. Imagine a two-dimensional Minkowski diagram of space and time in which a gray area displays where a range of possible conventions is allowed according to relativity theory.
The light cone of your future is the region above the gray area; the past light cone is the region below the gray area. The diagonal straight lines are the worldlines of light rays reaching and leaving you here now. The gray areas of this block universe represent all the events (in sense 1 of the term “event”) that could be classified either way, as your future events or as your past events; this classification depends upon someone’s choice of what line within the gray area will be the line of your present. Events within the gray areas are all the events that could neither cause, nor be caused by, your being here now. The more technical ways of saying this are that the gray area is all events that are space-like separated from your here and now, or that are in your here-and-now’s absolute elsewhere, or that constitute your extended present. Two events are time-like separated from each other if they could possibly have affected each other. If a pair of events is time-like separated, then they cannot also be space-like separated. Light cones are not frame relative; they are absolute and objective. Also, this structure of space-time holds not just for you; every point-event has its own unique pair of light cones.
The gray region of space-like events is called the extended present because, if you were defining an x-axis of this diagram in order to represent your present events, then you would have great latitude of choice. You could place the line that is to be the frame’s spatial axis anywhere in the gray area; but, in order to avoid ambiguity, once it is chosen it stays there for all uses of the coordinate system; it cannot change its angle. For example, suppose two point-events a and b both occur in the Andromeda Galaxy. That galaxy is 2,000,000 light-years away from you, assuming you are now on Earth. Even if event b were to occur a million years after a, you (or whoever is in charge of setting up the axes of the coordinate system you are using) are free to choose either event as happening now in that galaxy, and you also are free to choose any intermediate event there. But you are not free to choose an event in a white area because that would violate relativity theory’s requirements about causality. One implication of this argument is that relativity theory implies there is no fact of the matter as to what is happening at present in the Andromeda Galaxy. What is happening there now is frame-relative.
The above discussion about time-order is often expressed more succinctly by physicists by saying the time-order of space-like events is conventional and not absolute. For more on this controversial issue, see the discussion of the relativity of simultaneity.
Well, perhaps this point should be made more cautiously by saying that special relativity implies the relativity of simultaneity for non-local events. Some philosophers believe there is a fact of the matter, a unique present, even if special relativity does not recognize the fact.
10. The Reality of Time
We can see a clock, but we cannot see time, so how do we know whether time is real, that is, whether it exists? Someone might think that time is real because it is what clocks are designed to measure, and because there certainly are clocks. The trouble with this reasoning is that it is analogous to saying that unicorns are real because unicorn hunters intend to find unicorns, and because there certainly are unicorn hunters.
A principal argument that time is real is that, as the metaphysician David Lewis would say, the hypothesis is serviceable, and that is a reason to think it is true. If it can be shown that the concept provides theoretical unity and economy across multiple theories, especially our fundamental theories, and if it can be shown that its acceptance neither violates Occam’s Razor nor has hidden, unacceptable implications, then is not the default position one of time’s reality? For a similar reason, poems and extinct languages are real.
But if, as most physicists say, to be real is to be frame-independent, then time is not real and only spacetime is real. This insight into the nature of time was first promoted by Hermann Minkowski soon after his student Albert Einstein created the special theory of relativity. Similarly, because energy, distance, and mass are also different in different reference frames, they, too, would not be real by this standard. The requirement that to be real is to be frame-independent is not a logical truth, nor a result of observation. It is a plausible metaphysical assumption that so far has the support of almost every physicist and, to a lesser extent, the philosophers of physics. Physicists presume the reality of time, energy, distance, and mass because they implicitly assume that there is prior agreement on which reference frame is accepted, and this assumption is also made in the discussion below.
Let’s consider some other arguments against the reality of time that have appeared in the philosophical literature. The logical positivist Rudolf Carnap said, “The external questions of the reality of physical space and physical time are pseudo-questions” (“Empiricism, Semantics, and Ontology,” 1950). He meant these two questions are meaningless because there is no way to empirically verify their answers one way or the other. Subsequent philosophers have generally disagreed with Carnap and have taken these metaphysical questions seriously.
Here are other reasons for the unreality of time. Time is unreal because (i) it is emergent, or (ii) it is subjective, or (iii) it is merely conventional (such as being only a mathematical construct that doesn’t correspond to something that exists in the real world), or (iv) it is defined inconsistently, or (v) its scientific image deviates too much from its commonsense image. The five are explored below, in order.
i. Because Time is Emergent
Time does not emerge from spacetime, but suppose it does emerge from the quantum gravitational field, or something else. Does this imply time is not real? Most scientists and philosophers of time will answer “no” for the following reasons. Scientists were once surprised to learn that heat emerges from the motion of molecules. A molecule itself has no heat. Would it not have been a mistake to conclude from this that heat is unreal and nothing is warm? And when it became clear that a baseball is basically a collection of molecules, and so baseballs can be said to emerge from arrangements of molecules, would it not have been a mistake to say this implies baseballs no longer exist? It would be a mistake because baseballs and heat are real patterns of fundamental objects and events. Also, the concept of time has proven itself to be extremely useful from the ultramicroscopic scale of quarks to the large scale of the entire cosmos, so most experts argue that time is real at least at all those scales. There is some serious and popular speculation in the physics community that, as one investigates nature at smaller and smaller scales below the Planck scale, the concept of time becomes less applicable to reality, but few physicists or philosophers draw the conclusion from this that time is not real at any scale. The compatibility of time’s not existing somewhere below, say, the Planck scale with its existing above that scale is somewhat analogous to free will’s not existing at the scale of an individual human cell while existing at the macroscopic scale of human activity.
ii. Because Time is Subjective
Psychological time is clearly subjective, but the focus now is on physical time. Any organism’s sense of time is subjective, but is the time that is sensed also subjective? Well, first, what does “subjective” mean? This is a notoriously controversial term in philosophy. Here it means that a phenomenon is subjective if it is a mind-dependent phenomenon, something that depends upon being represented by a mind. A secondary quality such as being red is a subjective quality, but being capable of reflecting light of a certain wavelength is not subjective. The same point can be made by asking whether time comes just from us or instead is wholly out there in the external world independent of us. Throughout history, philosophers of time have disagreed on the answer. Without minds, nothing in the world would be surprising or beautiful or interesting. Can we add that nothing would be in time? If so, time is not objective, and so is not objectively real.
Aristotle envisioned time to be a counting of motions (Physics, IV.ch11.219b2), but he also asked whether the existence of time requires the existence of mind. He does not answer his own question; he says the answer depends on whether time is the conscious numbering of movement or instead is just the capability of movements to be numbered were consciousness to exist.
St. Augustine clearly adopted a subjectivist position regarding time, and said time is nothing in reality but exists only in the mind’s apprehension of that reality.
Several variants of idealism have implied that time is not real. Kant’s idealism implies that objective time (the time of things-in-themselves, if there even are such things) is unknowable, and so is in that sense unreal. The post-Kantian German idealists (Fichte, Schelling, Hegel) argued that the problem is not that time is unknowable but that all reality is based wholly upon minds, so objective time is unreal. It cannot be a feature of, or part of, reality.
Here are some comments against the above arguments and for the reality of objective time. First, notice that a clock can tick in synchrony with other clocks even when no one is paying attention to the clocks. Second, notice how useful the concept of time is in making such good sense of our evidence involving change, persistence, and succession of events. Consider succession, the order of events in time. If judgments of time order were subjective in the way judgments of being interesting vs. not-interesting are subjective, then it would be too miraculous that everyone can so easily agree on the temporal ordering of so many pairs of events: birth before death, the acorn sprouts before the oak tree appears, houses are built before they are painted. W. V. O. Quine might add that the character of the objective world with all its patterns is a theoretical entity in a grand inference to the best explanation of the data of our experiences, and the result of this inference tells us that the world is an entity containing an objective time, a time that gets detected by us mentally as psychological time and gets detected by our clocks as physical time.
iii. Because Time is Merely Conventional or Only a Mathematical Construct
One might argue that time is not real because the concept of time is just a mathematical artifact in our fundamental theories of mathematical physics, playing a merely auxiliary mathematical role. Similarly, coordinate systems are mathematical constructs, and the infinite curvature of space at the center of a black hole is generally considered to be merely an artifact of the mathematics used by the general theory of relativity and not to exist in reality.
Or one might argue as follows. Philosophers generally agree that humans invented the concept of time, but some philosophers argue that time itself is invented. It was created as a useful convention, like when we decided to use certain coin-shaped metal objects as money. Money is culturally real but not objectively real because it would disappear if human culture were to disappear, even if the coin-shaped objects were not to disappear. Money and oxygen both exist, but money’s existence depends upon social relations and conventions that oxygen’s existence does not depend upon. Is time’s existence more like money than oxygen in that regard?
Although it would be inconvenient to do so, our society could eliminate money and return to barter transactions. Analogously, Callender asks us to consider the question, “Who Needs Time Anyway?”
Time is a way to describe the pace of motion or change, such as the speed of a light wave, how fast a heart beats, or how frequently a planet spins…but these processes could be related directly to one another without making reference to time. Earth: 108,000 beats per rotation. Light: 240,000 kilometers per beat. Thus, some physicists argue that time is a common currency, making the world easier to describe but having no independent existence (Callender 2010, p. 63).
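Callender’s figures can be checked with simple arithmetic, assuming for illustration a heart rate of 75 beats per minute (1.25 beats per second), a rotation period of 86,400 seconds, and a light speed of 300,000 kilometers per second:

86,400 seconds per rotation × 1.25 beats per second = 108,000 beats per rotation.
300,000 kilometers per second ÷ 1.25 beats per second = 240,000 kilometers per beat.

Once such ratios are in hand, the middleman, the second, can be dropped, which is Callender’s point.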
In 1905, the French physicist Henri Poincaré argued that time is not a feature of reality to be discovered, but rather is something we have invented for our convenience. He said possible empirical tests cannot determine very much about time, so he recommended the convention of adopting whatever concept of time makes for the simplest laws of physics. Nevertheless, he said, time is otherwise wholly conventional, not objective.
There are two primary reasons to believe time is not merely conventional: First, there are so many one-way processes in nature. For example, mixing cold milk into hot, black coffee produces lukewarm, brown coffee, but agitations of lukewarm, brown coffee have never turned it back into hot black coffee with cool milk. The process goes only one way in time.
Second, our universe has so many periodic processes whose periods are constant multiples of each other over time. That is, their periods keep the same constant ratio to each other. For example, the frequency of rotation of the Earth around its axis relative to the “fixed” stars is a constant multiple of the frequency of swings of a fixed-length pendulum, which in turn is a constant multiple of the half-life of a specific radioactive uranium isotope, which in turn is a constant multiple of the frequency of a vibrating quartz crystal, which in turn is a constant multiple of the frequency of a light beam emitted from a specific kind of atomic process used in an atomic clock. The relationships do not change as time goes by—at least not much and not for a long time, and when there is deviation we know how to predict it and compensate for it. The existence of these sorts of constant time relationships—which cannot be changed by convention—makes our system of physical laws much simpler than it otherwise would be, and it makes us more confident that there is some convention-free, natural kind of entity that we are referring to with the time-variable in those physical laws—despite the fact that time is very abstract and not something we can see, taste, or touch.
iv. Because Time is Defined Inconsistently
Bothered by the contradictions they claimed to find in our concept of time, Parmenides, Zeno, Spinoza, Hegel, and McTaggart said time is not real.
The classical interpretation of Zeno’s paradoxes, which goes back to Plato, is that they demonstrate the unreality of any motion or any other change. Assuming that the existence of time requires the existence of change, Zeno’s paradoxes would also overturn the Greek common sense belief that time exists.
The early 20th-century English philosopher J.M.E. McTaggart believed he had a convincing argument that a single event acquires the properties of being a future event, a present event, and also a past event; since these are contrary properties, our concept of time is inconsistent, and the inconsistency cannot be removed. It follows for McTaggart that time is not real. This argument has received a great deal of attention in the philosophy literature but hardly any in the physics literature.
The early 20th-century absolute-idealist philosopher F.H. Bradley claimed, “Time, like space, has most evidently proved not to be real, but a contradictory appearance…. The problem of change defies solution.”
Regarding the inconsistencies in our concept of time that Zeno, McTaggart, Bradley, and others claim to have revealed, most philosophers of time say that there is no inconsistency, and that the complaints can be handled by clarification or by revising the relevant concepts. For example, Zeno’s paradoxes were solved by requiring time to be a linear continuum like a segment of the real number line. This solution was very fruitful and not ad hoc. It would be unfair to call it a change of subject.
v. Because Scientific Time is Too Unlike Ordinary Time
If you believe that for time to exist it needs to have certain features of the commonsense image of time, but you believe that science implies time does not have those features, you might be tempted to conclude that science has really discovered that time does not exist. In the mid-20th century, the logician Kurt Gödel argued for the unreality of time as described by contemporary physical science because the equations of the general theory of relativity allow for physically possible universes in which all events precede themselves. People can “travel into any region of the past, present, and future and back again” (Gödel, 1959, pp. 560-1). It should not even be possible for time to be circular or symmetric like this, Gödel believed; so, he concluded that, if we suppose time is the time described by relativity theory, then time is not real.
Regarding the claim that the science of time does not treat our commonsense understanding of time fairly, there is no consensus about which particular features of commonsense time can be rejected, although not all of them can be, or else we would be changing the subject and not talking about time. But science has not required us to reject our belief that some events happen in time before other events, nor has science required us to reject our belief that some events last for a while. Gödel’s complaint about relativity theory’s allowing for circular time has been treated by the majority of physicists and philosophers of time by saying he should accept that time might possibly be circular even though as a contingent matter it is not circular in our universe, and that he needs to revise his intuitions about what is essential to the concept.
vi. Conclusion
Even if the previous five arguments do not succeed, it still does not follow that time is real. Time is not the same thing as spacetime. The word “spacetime” does refer to a real, existing entity because it is so helpful for explaining, understanding, and predicting so many phenomena above the Planck scale, and because there do not exist alternative, better ways of doing this. So, time is real, but only given that there is prior agreement about which reference frame for spacetime is being assumed.
11. Time Travel
Would you like to travel to the future and read about the history of your great-grandchildren? You can do it. Nothing in principle is stopping you. Would you like to travel, instead, to the past? You may have regrets and wish to make some changes. Unfortunately, travel to your own past is not as easy as travel to someone else’s future. It is much easier to visit your descendants than your ancestors.
The term “time travel” has now become a technical term. For starters, it means travel in physical time, not psychological time. You do not time travel if you merely dream of living in the past, although neuroscientists commonly do call this “mental time travel.” You do not time travel for five minutes simply by being alive for five minutes. You do not time travel by crossing a time zone, nor by having your body frozen and thawed later, even if this does extend your lifetime.
Time travel to the future presupposes the metaphysical theory of eternalism because, if you travel to the future, there must be a future that you travel to. Presentism and the growing-past theory deny the existence of this future. That is why the growing-past theory is also called no-futurism and possibilism.
In 1976, the Princeton University metaphysician David Lewis offered this technical definition of time travel:
In any case of physical time travel, the traveler’s journey as judged by a correct clock attached to the traveler takes a different amount of time than the journey does as judged by a correct clock of someone who does not take the journey.
The implication from this definition is that time travel occurs when correct clocks get out of synchronization. If you are the traveler, your personal time (technically called your proper time) is shown on the clock that travels with you. A person not taking the journey is said to be measuring external time. This external time could be their proper time, or it could be the proper time of our civilization’s standard clock.
Lewis’s definition is widely accepted, although it has been criticized occasionally in the philosophical literature. The definition has no implications about whether, if you travel forward in external time to the year 2376 or backward to 1776, you can suddenly pop into existence then as opposed to having traveled continuously during the intervening years. Continuity is required by scientific theory, but discontinuous travel is more popular in fictional books and films.
a. To the Future
Time travel to the future occurs very frequently, and it has been observed and carefully measured by scientists. Time travel to the past is much more controversial, and experts disagree with each other about whether it is even physically possible. Relativity theory implies there are two different kinds of time travel to the future: (1) two clocks becoming out of synchrony due to their moving relative to each other, and (2) two clocks becoming out of synchrony due to their encountering different gravitational forces.
When you travel to the future, you eventually arrive at some future event having taken less time on your clock than the non-travelers do on their clocks. You might travel to the future in the sense that you participate in an event ten years in their future, having taken only two years according to your own clock. That would be an eight-year leap forward in time. You can be continuously observed from Earth’s telescopes during your voyage to that event. However, the astronomers on Earth would notice that you turned the pages in your monthly calendar very slowly. The rate of ticking of your clock would differ from that of their clock during the flight. Reversing your velocity and traveling back to the place you began the trip will not undo this effect.
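The amount of slowing is given by the standard time-dilation formula of special relativity, supplied here for illustration. For a traveler moving at constant speed v,

Δt(traveler) = Δt(Earth) × √(1 − v²/c²).

For the eight-year leap just described, two years of traveler time per ten years of Earth time requires √(1 − v²/c²) = 0.2, which works out to a speed of roughly 98 percent of the speed of light.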
If you do travel to the future, that is, their future, then you never get biologically younger; you simply age more slowly than those who do not travel with you. So, it is not possible to travel into the future and learn about your own death.
Any motion produces time travel to the future, relative to the clocks of those who do not move. That is why you can legitimately advertise any bicycle as being a time machine. The faster you go, the sooner you get to the part of the future you desire, but the more easily the dust and other particles in space will slice through your body during the trip.
The second kind of future time travel is due, not to a speed difference between two clocks, but to a difference in the strength of the gravitational field on two clocks. This is called gravitational time dilation, and it is most noticeable near a source of extreme gravitation such as near a black hole. If you were to leave Earth and orbit near a black hole, your friends back on Earth might view you continuously through their telescopes and, if so, would see you live in slow motion. When you returned, your clock would show that less time had expired on your clock than on their clock that remained on Earth. Similarly, in a tall building the lower-floor clocks tick more slowly than upper-floor clocks because the lower floor is in a stronger gravitational field, all other things being equal. There is no theoretical limit to how slow a clock can tick when it undergoes time dilation, but it would never tick in reverse.
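For the gravitational case there is an analogous standard formula, again supplied for illustration. For a clock hovering at distance r from the center of a non-rotating spherical mass M,

Δt(clock) = Δt(distant observer) × √(1 − 2GM/(rc²)),

so the smaller r is, the slower the clock ticks relative to a distant clock; the factor approaches zero as the clock approaches a black hole’s event horizon at r = 2GM/c².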
Travelers to the future can participate in that future, not just view it. They can influence the future and affect it. Saying travelers can change the future is a controversial comment; it is either true or false depending on what is meant by the term “change the future.” According to the metaphysician David Lewis (Lewis 1976, 150), changing the future is impossible. If it changed, then it was not really the future after all. He argued that no action changes the future, regardless of whether time travel is involved.
Suppose you were to encounter a man today who says that yesterday he lived next door to Isaac Newton in England in the year 1700, but now he has traveled to the future and met you. According to the theory of relativity, it is physically possible that he did this. Yet it is an extraordinary claim, since you undoubtedly believe that sufficiently fast spaceships and access to extraordinarily high gravitational fields were not available to anyone in 1700. And it is unlikely that history books failed to mention this if it did occur. Epistemology tells us that extraordinary claims require extraordinarily good evidence, so the burden of proof is on the strange man to produce that evidence, such as a good explanation of how the secret of building spaceships was discovered in 1700 but kept from the public and from later historians. You also would like to be shown that his body today contains traces of the kind of atmosphere that existed back in 1700; that atmosphere is slightly different chemically from ours today. If he cannot or will not produce the evidence, then it is much more likely that he is deluded or simply lying. Giving him a lie detector test will not be very helpful; you want to know what is true, not merely that he believes what he says.
b. To the Past
There are no known examples of travel to the past. But before we assess whether travel to the past is at least possible, let’s consider what we mean by travel to the past. A telescope is a window into the past. If we use it to look out into space to some region R a million light-years away, we are seeing R as it was a million years ago. However, this is looking at the past, not being in the past.
Being in our own past is part of what is meant by traveling to the past. The word “our” is important here. At present, we are existing in the past of future people, but we are not traveling into their past. What optimists about past time travel hope is that it is possible to travel into our own past. This is impossible according to Newton’s physics and impossible according to Einstein’s special theory of relativity, but it may be possible according to Einstein’s general theory of relativity, although experts are not in agreement on this point despite much study of the issue.
One of the world’s experts on time travel, Kip Thorne at the California Institute of Technology, made this helpful comment:
If it is possible to go backward in time, you can only do so by traveling outward in space and then returning to your starting point before you left. You cannot go backward in time at some fixed location while watching others go forward in time there.
Travel in time to the past was seriously discussed in Western philosophy only after 1949, when the logician Kurt Gödel published a solution to the equations of the general theory of relativity that he claimed allows travel to the past. He said some very exotic distributions of matter and energy will curve spacetime enough to form loops along which, as you continue to travel forward in your own proper time, you arrive back at your past events. These curves or time loops are technically called “closed time-like curves.” There is no requirement that anything moves faster than the speed of light in this scenario. Einstein praised the paper but said he hoped some new physical law would be discovered that would block Gödel’s solution to the relativity equations. Other physicists say Einstein should not have praised Gödel’s argument.
If you are going to travel back in time and be consistent with general relativity, then you are required to stay within your own light cone. However, general relativity theory allows the light cone structure itself to “tip” in a spacetime diagram as time progresses. Some researchers argue that general relativity also allows enough tipping that eventually the cone encloses an earlier event on your world line. If so, you’ve traveled in a time loop without having had to travel faster than the speed of light.
According to cosmologist Stephen Hawking, creating a wormhole might be a way to build a time machine to do this.
One can show that to create a wormhole one needs to warp space-time in the opposite way to that in which normal matter warps it. Ordinary matter curves space-time back on itself, like the surface of the Earth. However, to create a wormhole one needs matter that warps space-time in the opposite way, like the surface of a saddle. The same is true of any other way of warping space-time to allow travel to the past if the universe didn’t begin so warped that it allowed time travel. What one would need would be matter with negative mass and negative energy density to make space-time warp in the way required (Hawking 2018, 134).
Such matter is called “exotic matter.” There is universal agreement that time travel to the past has not yet been observed. In 1992, Hawking proposed the Chronology Protection Hypothesis, which says Nature somehow conspires to block backward time travel on the macroscopic scale. Maybe time machines above the Planck scale will explode. Kip Thorne cautioned in 2024 that Hawking’s hypothesis is a guess, not a fact.
There is general agreement that science must obey logic, which implies that in a single world there is a consistent story of what has happened and will happen, despite the fact that novels about time travel frequently describe traveling back to remake the past and thereby produce a new version of reality that is inconsistent with the earlier version.
The equations of general relativity are simply too complicated to solve regarding past time travel, even for experts. Many of these experts (for example, Frank Wilczek) suggest that travel to the past is not allowed in any physically possible universe, and that the closest one can come to time travel to the past is to travel to a new branch of the universe’s quantum wave function, which implies, for some experts, traveling to a parallel universe. All the experts agree that, even if the equations do allow some possible universe to contain travel to one’s own past via the creation of a time machine, they do not allow travel to a time before the creation of the first time machine in that universe.
Shortly after Einstein published his general theory of relativity, the physicist Hermann Weyl predicted that the theory allows time travel to the past. However, his claim was not studied carefully until Kurt Gödel’s work on relativity in 1949. Gödel claimed time travel must exist in a certain universe having a non-zero total angular momentum. Gödel was able to convince Einstein of this, but experts on relativity are not in agreement on whether Einstein should have been convinced. Opponents of Gödel say he discovered a mathematical curiosity, not a physical possibility. Still others say that, even if relativity does allow travel to the past, the theory should be revised to prevent this. Other opponents of the possibility of time travel to the past hope that an ad hoc restriction is not needed and instead that relativity theory will be understood more clearly so it can be seen that it does rule out past time travel. And still other opponents of time travel to the past hope an as yet unknown physical law will be discovered that rules out travel to the past. However, defenders of time travel say we should bite the bullet, and accept that relativity does allow time travel in some kinds of universes that have special warped spacetime.
Here is a pessimistic remark about time travel from J.J.C. Smart in The Journal of Philosophy in 1963:
Suppose it is agreed that I did not exist a hundred years ago. It is a contradiction to suppose that I can make a machine that will take me to a hundred years ago. Quite clearly no time machine can make it be that I both did and did not exist a hundred years ago.
Smart’s critics accuse him of the fallacy of begging the question. They wonder why he should demand that it be agreed that “I did not exist a hundred years ago.”
If general relativity does allow a universe that contains time travel to the past, this universe must contain a very special distribution of matter-energy. For an illustration of the simplest universe allowing backward time travel (in a one-dimensional space) and not being obviously inconsistent with general relativity, imagine a Minkowski two-dimensional spacetime diagram written on a square sheet of paper, with the one space dimension represented as going left and right on the page. Each point on the page represents a possible two-dimensional event. The time dimension points up and down the page, at right angles to the space dimension. The origin is at the center of the page. Now curve (bend) the page into a horizontal cylinder, parallel to the space axis so that the future meets the past. In the universe illustrated by that graph, any stationary object that persists long enough arrives into its past and becomes its earlier self. Its worldline is (topologically equivalent to) a circle; more technically, it is a closed time-like curve that is a circle. A closed curve has no end points. This cylindrical universe allows an event to occur both earlier and later than itself, so its time is not asymmetric. The curvature of this universe is what mathematicians call extrinsic curvature. There is no intrinsic curvature, however. Informally expressed, extrinsic curvature is curvature detectable only from a higher dimension, but intrinsic curvature can be detected by a being who lives within the space, say by noticing a failure of the Pythagorean Theorem somewhere. When the flat, square sheet is rolled into a tube, the intrinsic geometry does not change; only the extrinsic geometry changes.
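Expressed a little more formally (a gloss added here, not part of the original description), rolling the sheet into a cylinder amounts to identifying each time coordinate t with t + T, where T is the cylinder’s circumference in time units:

(x, t) ~ (x, t + T).

A stationary object’s worldline x = x₀ then closes on itself: after a lapse of T in its own proper time, the object is back at the very event from which it started, which is what makes its worldline a closed time-like curve.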
Regardless of how space is curved and what sort of time travel occurs, if any past time travel does occur, the traveler apparently is never able to erase facts or otherwise change the past. That is the point of saying, “whatever happened, happened.” But that metaphysical position has been challenged. It assumes there is only one past and that whatever was the case will always have been the case. These assumptions, though widely accepted, occasionally have been challenged in the philosophical literature, notably in the 11th century by Peter Damian, who said God could change the past.
Assuming Damian is mistaken, if you do go back, you would already have been back there. For this reason, some philosophers argue that for reasons of logical consistency, if you go back in time and try to kill your grandfather by shooting him before he conceived a child, you will fail no matter how hard you try. You will fail because you have failed. But nothing prevents your trying to kill him. There is no free will problem.
Or is there? The impossibility of killing your grandfather seems to some philosophers to raise a problem about free will. If you are free to shoot and kill people before you step into a time machine, then presumably you are free to shoot and kill people after you step out.
Assuming you cannot shoot and kill your grandfather because you did not, many philosophers argue that in this situation you do not really have freedom in the libertarian sense of that term. To resolve this puzzle, the metaphysician David Lewis said you can in one sense kill your grandfather but cannot in another sense. You can, relative to a set of facts that does not include the fact that your grandfather survived to have children. You cannot, relative to a set of facts that does include this fact. However, Lewis said there is no sense in which you both can and cannot. So, the meaning of the word can is sensitive to context. The metaphysician Donald C. Williams disagreed, and argued that we always need to make our can-statement relative to all the available facts. Lewis is saying you can and can’t, but in different senses, and you can but won’t. Williams is saying simply that you can’t, so you won’t.
If you step into a time machine that projects you into the past, then you can expect to step out into a new place because time travel apparently always involves motion. There is an ongoing philosophical dispute about whether, in a real closed time-like curve, a person would travel to exactly an earlier event or, instead, only to a nearby event. One suggested reason for restricting the time-like curve to only nearby events is that, if one went back to the same event, one would bump into oneself, and this would happen over and over again, and there would be too many copies of oneself existing in the same place. Many physicists consider this to be a faulty argument.
If it is logically inconsistent to build a new time machine to travel back to a time before the first time machine was invented, then there is no hope of creating the first time machine in order to visit the time of the dinosaurs. In 1988, in an influential physics journal, Kip Thorne and colleagues described the first example of how to build a time machine in a world that has never had one: “[I]f the laws of physics permit traversable wormholes, then they probably also permit such a wormhole to be transformed into a ‘time machine’…” (Morris 1988, p. 1446).
A wormhole is a second route between two places; perhaps it is a shortcut tunnel to a faraway place. Just as two clocks get out of synchrony if one moves relative to the other, a clock near a rapidly moving mouth of a wormhole could get out of synch with a clock at the other, stationary mouth. In principle a person could plunge into one hole and come out at an earlier time. Wormholes were first conceived by Einstein and Rosen, and later were named wormholes by John Wheeler.
Experts opposed to traversable wormholes have less of a problem with there being wormholes than with them being traversable. Although Thorne himself believes that traversable wormholes probably do not exist naturally, he also believes they might in principle be created by a more advanced civilization. However, Thorne also believes the short tunnel or “throat” between the two mouths of the wormhole probably would quickly collapse before anything of macroscopic size could use the wormhole to travel back in time. There has been some speculation by physicists that an advanced civilization could manipulate negative gravitational energy with its positive pressure in order to keep the hole from collapsing long enough to create the universe’s first non-microscopic time machine. Perhaps it could be used to visit the age of the dinosaurs.
It is a very interesting philosophical project to decide whether wormhole time travel, or any other time travel to the past, produces paradoxes of identity. For example, can a person travel back and be born again?
To solve the paradoxes of personal identity that arise from time travel’s inconsistency with commonly held assumptions about personal identity, many philosophers recommend rejecting the endurance theory, which implies a person exists wholly at each single instant, and accepting instead the perdurance theory, in which a person exists as a four-dimensional entity extending in time from birth to death. The person is their spacetime “worm.” If a person were envisioned as a point particle whose worm is a one-dimensional curve, then worms of this sort could pass partly through wormholes and become closed time-like curves in spacetime.
Let us elaborate on this radical scenario. A closed time-like curve has implications for causality: the curve would be a causal loop, and causal loops lead to backward causation, in which an effect can occur before its cause. A causal loop occurs when there is a continuous sequence of events e1, e2, …, en in which each member is a cause of its successor and, in addition, en causes e1. Some philosophers of time have cautioned that with a causal loop, “we would be clearly on the brink of magic.” Other philosophers of time are more willing to accept the possibility of causal loops, strange though they would be. Such a loop would be a fountain of youth: when you go around the loop, you travel back to a time when you were younger, or perhaps even to your birth.
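Written compactly, with an arrow abbreviating “causes,” the loop condition in the definition above is

\[ e_1 \rightarrow e_2 \rightarrow \cdots \rightarrow e_n \rightarrow e_1. \]

Every event in the loop has a cause within the loop, yet the loop as a whole need have none; this is the feature David Lewis addresses in the passage quoted later in this section.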
Most time travel stories in literature involve contradictions, either logical contradictions or inconsistencies with accepted laws of physics. The most famous story that appears to avoid both is Robert Heinlein’s “All You Zombies.” It shows how someone could be both their own father and mother, provided relativity theory allows backward time travel.
For a detailed review of the philosophical literature on backward time travel and the resulting paradoxes of causality and of personal identity, see (Wasserman, 2018, ch. 5) and (Fisher, 2015).
Inspired by an idea from John Wheeler, Richard Feynman suggested that one way to interpret quantum electrodynamics, the theory of electromagnetic interactions, is that an antimatter particle is really a matter particle traveling backward in time. For example, the positively charged positron moving forward in time is really a negatively charged electron moving backward in time.
This phenomenon is pictured in Feynman diagrams, which depict short sequences of elementary interactions among particles. In such a diagram, with time portrayed as increasing upward, the positron e+ is drawn as moving downward, that is, backward in time. Feynman speculated that the reason why every electron has exactly the same properties as any other, unlike nearly identical brooms manufactured in a broom factory, is that there is only one electron in the universe and it exists simultaneously at a great many places, thanks to backward time travel.
All empirical searches attempting to observe a particle moving backward in time have failed. So, the majority of physicists in the 21st century see no need to accept backward time travel, despite Feynman’s successful representations of quantum electrodynamics. See (Muller 2016a, pp. 246, 296-7) and (Arntzenius & Greaves 2009) for critical commentary. Nevertheless, some well-respected physicists, such as Neil Turok, do accept Feynman-style backward time travel. The philosopher Huw Price adds that the Feynman zigzag “is not there in standard QM, so if we put it in, we are accepting that QM is incomplete.” In other words, the zigzag needs hidden variables in order to determine when to zig and when to zag. At the heart of this dispute about whether antimatter is regular matter traveling backward in time, physicists are very cautious because they realize that the more extraordinary the claim, the more extraordinarily good the evidence in its support should be.
Here are a variety of very brief philosophical arguments against travel to the past:
(1) If travel to the past were possible, you could go back in time and kill your grandfather before he met your grandmother; but then you would not be born and so could not go back in time and kill your grandfather. That is a logical contradiction. So, travel to the past is impossible.
(2) Like the future, the past is not real, so time travel to the past or the future is not real either.
(3) Time travel is impossible because, if it were possible, we should have seen many time travelers by now, but nobody has ever encountered any time travelers.
(4) If past time travel were possible, then you could be in two different bodies at the same time, which is metaphysically impossible.
(5) If you were to go back to the past, then you would have been fated to go back because you already did, and this rules out your freedom to go back or not. Yet you do have this freedom, so travel to the past is impossible.
(6) If past time travel were possible, then you could die before you were born, which is biologically impossible.
(7) If you were presently to go back in time, then your present events would cause past events, which violates our concept of causality.
(8) If travel to the past were possible, then when time travelers go back and attempt to change history, they must always fail in their attempts to change anything, and it will appear to anyone watching them at the time as if Nature is conspiring against them. Since no one has ever witnessed this apparent conspiracy of Nature, there probably cannot be time travel.
(9) Travel to the past is impossible because it allows the gaining of information for free. Here is a suggestive scenario. You in the 21st century buy a copy of Darwin’s book The Origin of Species, which was published in 1859. You enter a time machine with it, go back to 1855, and give the book to Darwin himself. He could have used your copy to write the manuscript he sent off to the publisher. If so, who first came up with the knowledge about evolution? You got the knowledge from Darwin, but Darwin got the knowledge from you. This is “free information.” Because this scenario contradicts what we know about where knowledge comes from, past-directed time travel is not really possible.
(10) Travel to the past allows you to return and have intercourse with one of your parents, causing your own birth. You would then be one of your own parents and so have the same fingerprints as one of your parents, but this is biologically impossible.
(11) If past time travel is possible, then it should be possible for a rocket ship to carry a time machine capable of launching a probe (perhaps a smaller rocket) into its recent past which might eventually reunite with the mother ship. The mother ship has been programmed to launch the probe at a certain time unless a safety switch is on at that time. Suppose the safety switch is programmed to be turned on if and only if the return or impending arrival of the probe is detected by a sensing device on the mother ship. Does the probe get launched? It seems to be launched if and only if it is not launched, as the biconditional following this list makes explicit.
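The self-defeating structure of item (11) can be made explicit with elementary logic. Let \(L\) be the proposition that the probe is launched and \(S\) the proposition that the safety switch is on at the scheduled launch time. Assuming a launched probe does travel back and trigger the sensor, the programming gives

\[ L \leftrightarrow \neg S \quad \text{and} \quad S \leftrightarrow L, \]

from which \(L \leftrightarrow \neg L\) follows, a contradiction.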
These complaints about travel to the past are a mixture of arguments that past-directed time travel is not logically possible, not metaphysically possible, not physically possible, not technologically possible, not biologically possible, and not probable.
Counters to all of these arguments have been suggested by advocates of time travel. One response to the Grandfather Paradox of item (1) says you would kill your grandfather but then be in an alternative universe to the actual one, a universe in which you did not kill him. Most proponents of time travel do not like this response; they say traveling to an alternative universe is not what they mean by time travel.
Item (2) is the argument from presentism.
One response to item (3), the Enrico Fermi Paradox, is that perhaps we have seen no time travelers because we live in a boring era of little interest to time travelers. A better response is that perhaps the first time machine has never been built, and it is known that a time machine cannot be used to go back to a time before the first time machine (or the first closed time-like curve) exists.
Item (9), the paradox of free information, has received considerable attention in the philosophical literature. In 1976, David Lewis said this:
But where did the information come from in the first place? Why did the whole affair happen? There is simply no answer. The parts of the loop are explicable, the whole of it is not. Strange! But not impossible, and not too different from inexplicabilities we are already inured to. Almost everyone agrees that God, or the Big Bang, or the entire infinite past of the Universe, or the decay of a tritium atom, is uncaused and inexplicable. Then if these are possible, why not also the inexplicable causal loops that arise in time travel?
Einstein and Rosen suggested that the laws of general relativity might allow traversable, macroscopic wormholes. A wormhole is a short tunnel connecting two distant regions of space, and a traversable wormhole allows travel through the tunnel. The tunnel would be a shortcut between two distant galaxies, analogous to the path of a worm that has eaten its way through an apple to the opposite side rather than taking the longer path along the apple’s surface. That is why John Wheeler coined the name “wormhole.” Think of a wormhole as two spheres connected by a tunnel through a higher dimension.
The hole is highly curved spacetime, and from the outside it looks like a sphere in 3D-space. It is not quite a black hole, so it has no event horizon. There is no consensus among theoretical physicists about whether general relativity permits the existence of a wormhole. Assuming it does, and assuming one of the spheres could be controlled and forced to move very fast back and forth, then with the two connected spheres situated in separate galaxies, a particle or person could enter one at some time and exit the other at an earlier time, having traveled, say, just a few meters through the tunnel. Because of this implication for time, some physicists argue that if these traversable wormholes are allowed by general relativity, then the theory needs to be revised to disallow them.
For more discussion of time travel by wormhole, see the supplement on relativity theory. For more about time travel, see this encyclopedia’s article “Time Travel.” For some arguments in the philosophy literature against the possibility of a person travelling back to a time at which the person previously existed, see (Horwich 1975), (Grey 1999), and (Sider 2001).
12. McTaggart’s A-Theory and B-Theory
In 1908, the English philosopher J.M.E. McTaggart proposed two ways of linearly ordering all events in time. The two ways differ in character, yet they place the events in the same order. Here is how he restates his kernel idea:
For the sake of brevity, I shall give the name of the A series to that series of positions which runs from the far past through the near past to the present, and then from the present through the near future to the far future, or conversely. The series of positions which runs from earlier to later, or conversely, I shall call the B series. (McTaggart 1927, 10)
When McTaggart uses the word series, he means what mathematicians call a sequence, though the philosophical literature often follows McTaggart’s usage. His orderings can be pictured as points along a line, with, say, point event c happening later than point events a and b.
McTaggart is making several assumptions here. First, he does not himself believe time is real, so his remark that the A-series and B-series mark out positions in time holds only on the assumption that time is real. Another assumption is that longer-lasting events are composed of their point events. Also, a great many other events are located within the series at event a’s position, namely all the events simultaneous with event a.
Using the standard time diagram with time increasing to the right along a horizontal line, event a in McTaggart’s B-series is ordered to the left of event b because a happens before b. But when ordering the same two events into McTaggart’s A-series, event a is ordered to the left of b for a different reason: because event a is more in the past than event b, or, equivalently, has more pastness than b. The A-series locates each event relative to the present; the B-series is created with no attention paid to the present, but only to what occurs before what.
Suppose that event c occurs in our present and after events a and b. Although the philosophical literature is not in agreement, it is usually said that the information that c occurs in the present is not contained within either the A-series or the B-series itself, but is used to create the A-series. That information of c’s being in the present tells us to place c to the right of b because all present events are without pastness; they are not in the past. Someone constructing the B-series places event c to the right of b for a different reason, just that c happens after b.
One influential treatment of McTaggart’s idea is to say a future event will shed its intrinsic, non-relational A-property of futureness to acquire presentness, then shed that property in favor of some pastness, then shed that, too, in favor of even greater pastness, and so forth. McTaggart himself did not accept this notion of shedding properties. He believed the A-series is paradoxical, yet he also believed the A-properties (such as being past or being two weeks past) are essential to our concept of time; so, he concluded, our current concept of time is paradoxical and incoherent. This reasoning is called McTaggart’s Paradox.
McTaggart is not an especially clear writer, so his remarks can be interpreted in different ways, and the reader needs to work hard to make sense of them. Consider McTaggart’s Paradox as it applies to one specific event, say the event in which:
Socrates speaks to Plato for the first time.
This speaking to Plato is in the past, at least in our past, though it was not in the past of the Egyptian King Tut during his lifetime; so, the speaking is past in our present. Nevertheless, back in our past, there is a time when the event is present. From this, McTaggart concludes both that the event is past and that the event is present, from which he declares that the A-series is contradictory and so paradoxical. If that reasoning is correct (and it has been challenged by many), and if the A-series is essential to time, then time itself must be unreal.
When discussing the A-theory and the B-theory, metaphysicians often speak of:
A-series and B-series
A-theorist and B-theorist
A-facts and B-facts
A-terms and B-terms
A-properties and B-properties
A-predicates and B-predicates
A-propositions and B-propositions
A-sentences and B-sentences
A-camp and B-camp.
Here are some examples of this terminology in use. Unlike the A-series terms, the B-series terms are relational because a B-term refers to a property that relates a pair of events. Some of these properties are: is earlier than, happens twenty-three minutes after, and is simultaneous with. An A-theory term, on the other hand, refers to a property of a single event, not of a pair of events. Some of these properties are: is in the near future, happened twenty-three minutes ago, and is present. The B-theory terms represent distinctively B-properties; the A-theory terms represent distinctively A-properties.
The B-fact that event a occurs before event b will always be a fact, but the A-fact that event a occurred about an hour ago will not be a fact for long. B-theorists do not like facts going in and out of existence, but this is acceptable to A-theorists. Similarly, if we turn from fact-talk to statement-talk, the A-statement that event a occurred about an hour ago, if true, will soon become false. B-facts, by contrast, are eternal. For example, the statement “The snowfall occurred an hour before this act of utterance” will, if true, be true at all times, provided the phrase the snowfall does not change its reference.
The A-theory usually implies A-facts are the truthmakers of true A-statements and so A-facts are ontologically fundamental; the B-theorist, at least a B-theorist who believes in the existence of facts, appeals instead to B-facts. According to a classical B-theory, when the A-theorist correctly says, “It began snowing an hour ago,” what really makes it true is not that the snowing has an hour of pastness (so the fact is tensed) but that the event of uttering the sentence occurs an hour after the event of it beginning to snow. Notice that occurs an hour after is a B-term that is supposed to be logically tenseless and to be analogous to the mathematical term numerically less than even though when expressed in English it must use the present tense of the verb to occur.
When you like an event, say yesterday’s snowfall, then change your mind and dislike the event, what sort of change of the event is that? Well, this change in attitude is not a change that is intrinsic to the event itself. It is extrinsic. When your attitude changes, the snowfall itself undergoes no intrinsic change, only a change in its relationship to you. (A-theorists and B-theorists do not disagree about this.) This illustrates what is meant by intrinsic when A-theorists promote the intrinsic properties of an event, such as the snowfall having the intrinsic property of being in the past. B-theorists analyze the snowfall event differently, saying that more fundamentally the event is not in the past but is in the past relative to us. “Being in the past,” they say, is not intrinsic but rather is relational.
Members of the A-camp and B-camp recognize that ordinary speakers are not careful in their use of A and B terminology; but, when the terminology is used carefully, each believes their camp’s terminology can best explain ordinary speech involving time and also the terminology of the other camp.
A-theorists and B-theorists agree that time has an objective direction, an arrow, although A-theorists maintain that the arrow is intrinsic to time whereas B-theorists believe the arrow is extrinsic to time, involving only one-way processes that happen to occur.
Many A-theorists promote becoming. The term means a change in the A-series position of an event, such as a change in its degree of pastness. The B-theorist philosopher Adolf Grünbaum believes becoming is mind-dependent, and he points to the following passage, which opens with a quotation from J. J. C. Smart, in opposition to the A-theory:
“If past, present, and future were real properties of events [i.e., properties possessed by physical events independently of being perceived], then it would require [non-trivial] explanation that an event which becomes present [i.e., qualifies as occurring now] in 1965 becomes present [now] at that date and not at some other (and this would have to be an explanation over and above the explanation of why an event of this sort occurred in 1965)” (says Smart). It would, of course, be a complete trivialization of the thesis of the mind-independence of becoming to reply that by definition an event occurring at a certain clock time t has the unanalyzable attribute of nowness at time t (Grünbaum 1971, p. 218).
Grünbaum is implying that it is appropriate to ask, regarding the event of a house falling down in 1965, “Why now instead of some other date?” He believes it would be an appropriate explanation to appeal to mind-independent soil conditions and weather patterns, but trivial and inadequate to say instead that the event occurs now because by definition it had at that time the unanalyzable attribute of nowness. More generally, says Grünbaum, temporal becoming has no appropriate place within physical theory.
Beginning with Bertrand Russell in 1903, many B-theorists have argued that there are no irreducible one-place A-qualities (such as the monadic property of being past) because the qualities can all be reduced to, and adequately explained in terms of, two-place B-relations. The A-theorist disagrees. For example, the claim that it is after midnight might be explained, says the B-theorist, by saying midnight occurs before the time of this assertion. Before is a two-place relationship, a binary relation. The A-theorist claims this is a faulty explanation.
Is the A-theory or is the B-theory the correct theory of reality? This is a philosophically controversial issue. To clarify the issue, let us re-state the two theories. The A-theory has two especially central theses, each of which is contrary to the B-theory:
(1) Time is fundamentally constituted by an A-series in which any event’s being in the past (or in the present or in the future or twenty-three seconds in the past) is an intrinsic, objective, monadic property of the event itself.
(2) Events change.
In 1908, McTaggart described the special way that events change:
Take any event—the death of Queen Anne, for example—and consider what change can take place in its characteristics. That it is a death, that it is the death of Anne Stuart, that it has such causes, that it has such effects—every characteristic of this sort never changes…. But in one respect it does change. It began by being a future event. It became every moment an event in the nearer future. At last it was present. Then it became past, and will always remain so, though every moment it becomes further and further past.
This special change is usually called second-order change or McTaggartian change. For McTaggart, second-order change is the only genuine change, whereas a B-theorist such as Russell says this is not genuine change. Genuine change is intrinsic change, he would say. Just as there is no intrinsic change in a house due to your walking farther away from it, so there is no intrinsic change in an event as it supposedly “moves” farther into the past.
In response to Russell, McTaggart said:
No, Russell, no. What you identify as “change” isn’t change at all. The “B-series world” you think is the real world is…a world without becoming, a world in which nothing happens.
A world with becoming is a world in which events change and time flows. “It is difficult to see how we could construct the A series given only the B series, whereas given the former we can readily construct the latter,” says G.J. Whitrow in defense of the A-theory.
The B-theory conflicts with two central theses of the A-theory. According to the B-theory,
(1′) Time is fundamentally constituted by a B-series, and the temporal properties of being in the past (or in the present or in the future) are fundamentally relational, not monadic.
(2′) Events do not change.
Because there is much misunderstanding about what is in dispute, let us ask again what B-theorists mean by calling temporal properties relational. They mean that an event’s property of occurring twenty-three minutes in the past, say, is a relation between the event and us, the subjects, the speakers. When analyzed, it will be seen to make reference to our own perspective on the world. Queen Anne’s death has the property of occurring in the past because it occurs in our past. It is not in Aristotle’s past or King Tut’s. So, the labels “past,” “present,” and “future” are all about us and are not intrinsic properties of events. That is why there is no objective distinction among past, present, and future, say the proponents of the B-theory. For similar reasons, the B-theorist says the property of being two days in the past is not an ‘authentic’ property because it is a second-order property. The property of being two days in our past, however, is a genuine property, says the B-theorist.
The B-theorists’ point that A-properties are relational when properly analyzed is also made this way: the analogous A-type terminology about space uses the terms here, there, far, and near, and these terms are essentially about the speaker, says the B-theorist. “Here” for you is not “here” for me. Likewise, World War II is past for you but not for Aristotle.
The B-theorist also argues that the A-theory violates the theory of relativity because that theory implies an event can be present for one person but not for another person who is moving relative to the first. So, being present is relative to a reference frame and not an intrinsic quality of the event. And for this reason, there are as many different B-series as there are legitimate reference frames. The typical proponent of the A-series cannot accept this.
A-theorists are aware of these criticisms, and there are many counterarguments. Some influential A-theorists are A. N. Prior, E. J. Lowe, and Quentin Smith. Some influential B-theorists are Bertrand Russell, W. V. O. Quine, D. H. Mellor, and Nathan Oaklander. The A-theory is closely related to the commonsense image of time, and the B-theory is more closely related to the scientific image. Proponents of each theory shoulder a certain burden—explaining not just why the opponent’s theory is incorrect but also why it seems to be correct to the opponent.
The philosophical literature on the controversy between the A and B theories is vast. During a famous confrontation in 1922 with the philosopher and A-theorist Henri Bergson, Einstein defended his own B-theory of time and said “the time of the philosophers” is an illusion. This is an overstatement by Einstein. He meant to attack only the time of the A-theorists.
Martin Heidegger said he wrote Being and Time in 1927 as a response to the conflict between the A-theory and the B-theory.
Besides the thesis that the present is metaphysically privileged, the other principal thesis of the A-theory that distinguishes it from the B-theory is that time flows. Let us turn to this feature of the A-theory.
13. The Passage or Flow of Time
Many philosophers claim that time passes or flows. This characteristic of time has also been called a flux, a transiency of the present, a moving now, and becoming. “All is flux,” said Heraclitus. The philosopher G.J. Whitrow claimed “the passage of time…is the very essence of the concept.” Advocates of this controversial philosophical position often point out that the present keeps vanishing. And they might offer a simile and say present events seem to flow into the past, like a boat that drifts past us on the riverbank and then recedes farther and farther downstream from us. In the converse sense, the simile is that we ourselves flow into the future and leave past events ever farther behind us. Philosophers disagree with each other about how to explain the ground of these ideas. Philosopher X will say time passes or flows, but not in the sense used by philosopher Y, while philosopher Z will disagree with both of them.
There are various entangled issues regarding flow. (i) Is the flow an objective feature of physical events that exists independently of our awareness of them? (ii) What is actually flowing? (iii) What does it mean for time to flow? (iv) Are there different kinds of flow? (v) If time flows, do we experience the flow directly or indirectly? (vi) What is its rate of flow, and can the rate change? (vii) If time does not flow, then why do so many people believe it does?
There are two primary philosophical positions about time’s flow: (A) the flow is objectively real; (B) the flow is not objectively real but merely subjective. The B-position is called the static theory, mostly by its opponents, because of the negative connotation of the word “static.” The A-position is called the dynamic theory because it implies time is constantly in flux and that this fact of passage obtains independently of us; it is not subjective. The letters A and B are intended to suggest an alliance with McTaggart’s A-theory and B-theory. One A-theorist describes the situation this way:
The sensation we are (perhaps wrongly) tempted to describe as the sensation of temporal motion is veridical: it somehow puts us in touch with an aspect of reality that is unrepresented in Russell’s theory of time [the original B-theory]. (van Inwagen 2015, 81)
Some B-theorists complain that the concept of passage is incoherent, or it does not apply to the real world because this would require too many revisions to the scientific worldview of time. Other B-theorists say time flows but only subjectively and that B-theory concepts can explain why we believe in the flow. One explanation that is proposed is that the flow is due to our internal comparison of our predictions of what will happen with our memories of what recently happened, and this comparison needs to be continually updated.
One B-theorist charge is that the notion of flow is the product of a faulty metaphor. They say time exists, things change, and so we say time passes, but time itself does not change. It does not change by flowing or passing or elapsing or undergoing any motion. The present does not objectively flow because the present is not an objective feature of the world. We all experience this flow, but only in the sense that we all frequently misinterpret our experience. It is not that the sentences, “The present keeps vanishing” and “Time flows” are false; they are just not objective truths.
Here is another prong of a common B-theory attack on the notion of flow. The death of Queen Anne is an event that an A-theorist says is continually changing from past to farther into the past, but this change is no more of an objectively real change intrinsic to her death than saying her death changed from being approved of by Mr. Smith to being disapproved of by him. This extrinsic change in approval is not intrinsic to her death and so does not count as an objectively real change in her death.
One point J.J.C. Smart offered against the A-theory of flow was to ask about the rate at which time flows. It would be a rate of one second per second. But that is silly, he claimed: one second divided by one second is the number one, a unit-less number, and so not an allowable rate. And what would it be like for the rate to be two seconds per second? asks Huw Price, who adds, “We might just as well say that the ratio of the circumference of a circle to its diameter flows at pi seconds per second!” (Price 1996, p. 13).
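Smart’s complaint is, at bottom, a point of dimensional analysis:

\[ \frac{1\ \text{second}}{1\ \text{second}} = 1, \]

a dimensionless number rather than a rate, whereas a genuine rate, such as one meter per second, has units that do not cancel.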
Other philosophers of time, such as John Norton and Tim Maudlin, argue that the rate of one second per second is acceptable, despite these criticisms. Paul Churchland countered that the rate is meaningful but trivial, for what other rate could it be?
There surely is some objective feature of our brains that causes us to believe there is a flow of time which we are experiencing. B-theorists say perhaps the belief is due not to time’s actually flowing but rather to the objective fact that we have different perceptions at different times and that anticipations of experiences always happen before memories of those experiences.
A-theorists who believe in flow have produced many dynamic theories that are closer to common sense on this topic. Here are six.
(1) The passage or flow is a matter of events changing from being future, to being present, to being past. Events change in their degree of futureness and degree of pastness. This kind of change is often called McTaggart’s second-order change to distinguish it from more ordinary, first-order change that occurs when, say, a falling leaf changes its altitude over time.
(2) A second type of dynamic theory implies time’s flow is the coming into existence of new facts, the actualization of new states of affairs. Reality grows by the addition of more facts. There need be no commitment to events changing intrinsically.
(3) A third dynamic theory implies that the flow is a matter of events changing from being indeterminate to becoming determinate in the present. Because time’s flow is believed to be due to events becoming determinate, these dynamic theorists speak of time’s flow as becoming.
(4) A fourth dynamic theory says, “The progression of time can be understood by assuming that the Hubble expansion takes place in four dimensions rather than in three. The flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space…. Unlike the picture drawn in the classic Minkowski spacetime diagram, the future does not yet exist; we are not moving into the future, but the future is being constantly created.” (Muller 2016b).
(5) A fifth dynamic theory suggests the flow is (or is reflected in) the change over time of truth-values of declarative sentences. For example, suppose the sentence, “It is now raining,” was true during the rain yesterday but has changed to false because it is sunny today. That is an indication that time flowed from yesterday to today, and these sorts of truth-value changes are at the root of the flow.
In response to this linguistic turn of theory (5), critics of the dynamic theory suggest that the temporal indexical sentence, “It is now raining,” has no truth-value because the reference of the word now is unspecified. If the sentence cannot have a truth-value, it cannot change its truth-value. However, the sentence is related to a sentence that does have a truth-value, namely the associated complete sentence or eternal sentence: the sentence with its temporal indexical replaced by a date expression that refers to a specific time, and with the other indexicals replaced by names of whatever they refer to. Typical indexicals are the words then, now, I, this, here, and them. Supposing it is now midnight on April 1, 2020, and the speaker is in San Francisco, California, the indexical sentence, “It is now raining,” is intimately associated with the more complete or context-explicit sentence, “It is raining at midnight on April 1, 2020, in San Francisco, California.” Only these latter non-indexical, non-context-dependent, so-called complete sentences have truth-values, and these truth-values do not change with time, so they do not underlie any flow of time, according to the critic of the fifth dynamic theory. (A toy sketch of this replacement of indexicals appears after theory (6) below.)
(6) A sixth dynamic theory adds to the block-universe a traveling present. The present is somehow metaphysically privileged, and there is a moving property of being now that spotlights a new slice of the present events of the block at every new, present moment. A slice is a set of events all of which are simultaneous in the block. So, a slice of events can temporarily possess a monadic property of being now, and then lose it as a newer slice becomes spotlighted. This theory is called the moving spotlight theory. Metaphysically, the moving spotlight theory has been interpreted in two different ways, one rejecting eternalism and the other accepting it. That is, one way suggests there are illuminated moments and unilluminated moments that are, respectively, real and unreal. A second and more common way suggests all times exist but that the present is the only actual time; the actual time exists but is privileged over the other times. Here is how Hermann Weyl described the spotlight theory as subjective rather than objective:
The objective world simply is, it does not happen. Only to the gaze of my consciousness crawling along the lifeline of my body, does a section of the world come to life as a fleeting image in space which continuously changes in time.
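Returning to theory (5): as a purely illustrative aid, here is a minimal Python sketch of the critic’s eternal-sentence construction described above. The function, its name, and the sample context are hypothetical, invented only to make the replacement of indexicals concrete.

```python
# A toy illustration of replacing indexicals with explicit context,
# turning a context-dependent sentence into a "complete" (eternal)
# sentence whose truth-value does not change over time.
# All names and context values here are hypothetical examples.

def eternalize(sentence: str, context: dict) -> str:
    """Replace each indexical word with its referent from the context."""
    replacements = {
        "now": "at " + context["time"],
        "here": "in " + context["place"],
        "I": context["speaker"],
    }
    words = sentence.rstrip(".").split()
    return " ".join(replacements.get(w, w) for w in words) + "."

context = {"time": "midnight on April 1, 2020",
           "place": "San Francisco, California",
           "speaker": "the speaker"}

print(eternalize("It is raining now here.", context))
# -> It is raining at midnight on April 1, 2020 in San Francisco, California.
```

The point of the sketch is only this: once every indexical is replaced, nothing in the resulting sentence changes its truth-value with the passage of time.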
Huw Price offers a short overview of various arguments against the passage of time in (Price 1996, pp. 12-16). These arguments are responded to by Tim Maudlin in (Maudlin 2002).
14. The Past, Present, and Future
a. Presentism, the Growing-Past, Eternalism, and the Block-Universe
Have dinosaurs slipped out of existence? More generally, this is asking whether the past is part of reality. How about the future? Philosophers are divided on this ontological question of the reality of the past, present, and future. There are three leading theories, and there is controversy over the exact wording of each, and over whether the true theory is metaphysically necessary or just contingently true. The following three philosophical theories do not differ in their observational consequences, as competing scientific theories do.
(1) According to the ontological theory called presentism, only present objects exist. Stated another way: if something is real, then it exists now. The past and the future are not real, so either the past tense sentence, “Dinosaurs existed” is false, or else it is true but its truth is grounded only in some present facts. A similar analysis is required for statements in the future tense. Perhaps they can be analyzed in terms of present anticipations. With that accomplished, then all the events can be linearly ordered as if the past ones occur before the present ones and the present ones occur before the future ones, when actually they do not because all real events occur in the present. Heraclitus, Duns Scotus, Thomas Hobbes, Arthur Schopenhauer, A.N. Prior, and Lee Smolin are presentists. In the 17th century, Hobbes wrote, “The present only has a being in nature; things past have a being in the memory only, but things to come have no being at all, the future being but a fiction of the mind….” In 1969, Prior said of the present and the real:
They are one and the same concept, and the present simply is the real considered in relation to two particular species of unreality, namely the past and the future.
(2) Advocates of a growing-past agree with the presentists that the present is special ontologically, but they argue that, in addition to the present, the past is also real and is growing bigger all the time. Some have claimed there is a universal, the property of presentness, that successively inheres in different times, as time flows on. The philosophers of science C.D. Broad, George Ellis, Richard Jeffrey, and Michael Tooley have defended the growing-past theory. William James famously remarked that the future is so unreal that even God cannot anticipate it. It is not clear whether Aristotle accepted the growing-past theory or accepted a form of presentism; see Hilary Putnam (1967, p. 244) for commentary on this issue. The growing-past theory is also called by other names such as the growing-present theory, now-and-then-ism, the becoming theory, and possibilism. Members of McTaggart’s A-camp are divided on whether to accept presentism or, instead, the growing-past theory, but they agree on rejecting eternalism.
(3) Advocates of eternalism maintain that all times exist equally, though not that all times are of equal human significance. That is, there are no objective ontological differences among the past, present, and future, just as there are no objective ontological differences between here and there. The differences are subjective, according to eternalism; they depend upon whose experience is being implicitly referred to: yours, Napoleon’s, or Aristotle’s. An eternalist will say Napoleon’s rise to power in France is not simply in the past, as the first two theories imply; instead, it is in the past for you but in the future for Aristotle, and it is equally real for both of you. The past, the present, and the future exist conjointly but not simultaneously. The eternalist is committed to saying all events in spacetime are equally real; the events of the present are not ontologically privileged. The eternalist often describes the theory with a large block of all events, which in classical physics can be represented with a Minkowski diagram. All moments of the block are equally real (though not all are present at one time). The entire block, as a representation of events, might be present all at once on a piece of paper in your office. For the eternalist or block theorist, there are epistemological limitations but no ontological differences among the past, present, and future. For example, we usually can know much more about past events than future ones. The hedge word “usually” is needed because we can know more about whether it will rain in the next five minutes in New York City than about whether it rained there seven thousand and one years ago.
Eternalism is often called a static theory. The label “static” was once supposed to be derogatory and to indicate that the theory could not successfully deal with change, but these days the term has lost much of its negative connotations just as the initially derogatory term “big bang” in cosmology has lost its negative connotations.
Eternalism is the only one of the three metaphysical theories that permits time travel, so it is understandable that time travel was not seriously discussed in philosophy until the twentieth century when presentism began to be challenged. In the 20th century, Bertrand Russell, J.J.C. Smart, W.V.O. Quine, Adolf Grünbaum, and David Lewis endorsed eternalism. Eternalism is less frequently called the tapestry theory of time.
Presentism was the implicitly accepted ontology as human languages were being created, so it has influenced our current use of tenses and of the words “now” and “present.” It is very difficult to speak correctly about eternalism using natural language because all natural languages are infused with presumptions of presentism. Correct descriptions of personal identity are especially difficult for eternalists for this reason.
Here is how one philosopher of physics briefly defended eternalism:
I believe that the past is real: there are facts about what happened in the past that are independent of the present state of the world and independent of my knowledge or beliefs about the past. I similarly believe that there is (i.e., will be) a single unique future. I know what it would be to believe that the past is unreal (i.e., nothing ever happened, everything was just created ex nihilo) and to believe that the future is unreal (i.e., all will end, I will not exist tomorrow, I have no future). I do not believe these things, and would act very differently if I did. Insofar as belief in the reality of the past and the future constitutes a belief in a ‘block universe’, I believe in a block universe. But I also believe that time passes, and see no contradiction or tension between these views (Maudlin 2002, pp. 259-260).
A and B theorists agree that it is correct to say, “The past does not exist” and to say, “Future events do not exist” if the verbs are being used in their tensed form, but argue that there should be no implications here for ontology because this is merely an interesting feature of how some languages such as English use tensed verbs. Languages need not use tenses at all, and, according to the B-theorists, a B-analysis of tense-talk can be provided when languages do use tenses.
Hermann Minkowski is the father of the block-universe concept. The block theory employing this concept implies reality is correctly representable as a four-dimensional block of point-events in spacetime in some reference frame. Minkowski treated the block as a manifold of point-events upon which is placed a four-dimensional rectangular coordinate system. In the block, relative to the chosen frame, any two events are ordered by the happens-before-or-is-simultaneous-with relation.
For a graphic presentation of the block, see a four-dimensional Minkowski diagram in a supplement to this article. If time has an infinite future or infinite past, then the block is infinite in those directions in time. If space has an infinite extent, then the block is infinitely large along the spatial dimensions. If it were learned that space is nine-dimensional rather than three-dimensional, then block theorists would promote a ten-dimensional block rather than a four-dimensional block.
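As a rough, purely illustrative sketch of this representational picture (every name below is invented for the example), the block can be modeled as a collection of point-events carrying coordinates in one chosen reference frame, ordered by their time coordinates:

```python
# A minimal, purely illustrative model of the block: point-events as
# (t, x, y, z) coordinate tuples in one chosen reference frame. In that
# frame, any two events are ordered by the happens-before-or-is-
# simultaneous-with relation, that is, by comparing their t values.

from typing import NamedTuple

class Event(NamedTuple):
    t: float  # time coordinate in the chosen frame
    x: float
    y: float
    z: float

def no_later_than(a: Event, b: Event) -> bool:
    """Frame-relative order: a happens before or simultaneous with b."""
    return a.t <= b.t

block = {Event(0.0, 0, 0, 0), Event(1.5, 2, 0, 0), Event(1.5, -1, 3, 0)}
history = sorted(block)  # tuples sort by their first field, here t
```

Nothing in this toy settles the metaphysical dispute between the mild and strong block theories; it merely shows what a manifold of point-events with a coordinate system and a temporal ordering amounts to formally.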
To get a sense of why the block is philosophically controversial, note that in his book The Future, the Oxford philosopher John Lucas said,
The block universe gives a deeply inadequate view of time. It fails to account for the passage of time, the pre-eminence of the present, the directedness of time, and the difference between the future and the past.
G. J. Whitrow complains that “the theory of the block universe…implies that past (and future) events co-exist with those that are present.” This is a contradiction, he believes. Whitrow’s point can be made metaphorically this way: The mistake of the B-theorist is to envision the future as unfolding, as if it has been waiting in the wings for its cue to appear on the present stage—which is absurd.
Motion in the real world is dynamic, but its historical record, such as its worldline within the block, is static. That is, any motion’s mathematical representation is static in the sense of being timeless. The block theory has been accused by A-theorists of spatializing and geometricizing time, which arguably it does; the philosophical debate is whether this is a mistake. Some B-theorists complain that the very act of labeling their view “static” mistakenly implies there is a time dimension in which the block is not changing but should be. The block describes change but does not itself change, say B-theorists. The A-theorist’s complaint, according to the B-theorist, is like complaining that a printed musical score is faulty because it is static while real music is vibrant.
A principal difficulty for presentism is to make sense of some distant event happening now. Relativity theory implies that which events are simultaneous with which other events depends upon the reference frame chosen to make the determination. That is, the concept of the present, or now, is frame-relative and so is not objectively real. For the eternalist and block-theorist, the block created using one reference frame is no more distinguished than the block created using another frame allowed by the laws of science. Any chosen reference frame will have its own definite past, present, and future. The majority of physicists accept this block theory, which could be called the mild block theory. Metaphysicians also argue over whether reality itself is a static block, rather than merely being representable as a static block; these metaphysicians are promoting a strong block theory. Some theorists complain that the strong block theory confuses the representation with what is represented. See (Smolin 2013, pp. 25-36) for an elaboration of the point.
Some proponents of the growing-past theory have adopted a growing-block theory. They say the block is ever-growing, and the present is the leading edge between reality and the unreal future. Some philosophers express that point by saying the present is the edge of all becoming. The advocates of the growing-block can agree with the eternalists that what makes the sentence, “Dinosaurs once existed,” be true is that there is a past region of the block in which dinosaurs do exist.
All three ontologies (namely, presentism, the growing-past, and eternalism) imply that, at the present moment, we only ever experience a part of the present and that we do not have direct access to the past or the future. They all agree that nothing exists now that is not present, and all three need to explain how and why there is an important difference between never existing (such as Santa Claus) and not existing now (such as Aristotle). Members of all three camps will understand an ordinary speaker who says, “There will be a storm tomorrow so it’s good that we fixed the roof last week,” but they will provide different treatments of this remark at a metaphysical level.
Most eternalists accept the B-theory of time. Presentists and advocates of the growing-past tend to accept the A-theory of time. Let us take a closer look at presentism.
One of the major issues for presentism is how to ground true propositions about the past. What makes it true that U.S. President Abraham Lincoln was assassinated in 1865? Speaking technically, we are asking what the truthmakers of the true sentences and the falsemakers of the false sentences are. Many presentists say past-tensed truths lack truthmakers in the past but are nevertheless true because their truthmakers are in the present. They say what makes a tensed proposition true are only features of the present way that things are, perhaps traces of the past in the pages of present books and in our memories. The eternalist disagrees. When someone says truly that Abraham Lincoln was assassinated, the eternalist and the growing-past theorist believe this is to say something true of a real Abraham Lincoln who is not present. The block theorist and the growing-block theorist might add that Lincoln is real but far away from us along the time dimension, just as the Moon is real but far away from us along a spatial dimension. Because of this analogy, they ask, “Why not treat these distant realities in the same manner?”
A related issue for the presentist is how to account for causation, for how April showers bring May flowers. Presentists believe in processes, but can they account for the process of a cause producing an effect without both the cause and the effect being real at different times?
Presentism and the growing-past theory need to account for the Theory of Relativity’s treatment of the present, or else criticize the theory. On its orthodox interpretation, relativity theory implies there is no common global present, but only different presents for each of us. Relativity theory allows event a to be simultaneous with event b in one reference frame, while allowing b to be simultaneous with event c in some other reference frame, even though a and c are not simultaneous in either frame. Nevertheless, if a is real, then is c not also real? But neither presentism nor the growing-past theory can allow c to be real. This argument against presentism and the growing-past theory presupposes the transitivity of co-existence.
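The frame-relativity appealed to in this argument is the standard relativity of simultaneity. For a frame moving with velocity \(v\) along the \(x\)-axis, the Lorentz transformation gives

\[ t' = \gamma\left(t - \frac{vx}{c^2}\right), \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}, \]

so two events with the same \(t\) but different \(x\) coordinates, simultaneous in the first frame, receive different \(t'\) values and thus are not simultaneous in the moving frame.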
Despite this criticism, (Stein 1991) says presentism can be retained by rejecting transitivity and saying what is present and thus real is different depending on your spacetime location. The implication is that, for event a, the only events that are real are those with a zero spacetime interval from a. Many of Stein’s opponents, including his fellow presentists, do not like this implication.
Eternalists very often adopt the block-universe theory. This implies our universe is the set of all the point-events with their actual properties. The block is representable with a Minkowski diagram in the regions where spacetime does not curve and where nature obeys the laws of special relativity.
The presentist and the advocate of the growing-past theory usually will unite in opposition to eternalism for these five reasons: (i) The present is so much more vivid than the future. (ii) Eternalism misses the special open and changeable character of the future. In the classical block-universe theory promoted by most eternalists, there is only one future, so this implies the future exists already; but that denies our ability to affect the future, and it is known that we do have this ability. (iii) A present event moves in the sense that it is no longer present a moment later, having lost its property of presentness, but eternalism disallows this movement. (iv) Future events do not exist and so do not stand in relationships of before and after, but eternalism accepts these relationships. (v) Future-tensed statements that are contingent, such as “There will be a sea battle tomorrow,” do not have existing truthmakers and so are neither true nor false, yet most eternalists mistakenly believe all these statements do have truth values now.
Defenders of eternalism and the block-universe offer a variety of responses to these criticisms. For instance, regarding (i), they are likely to say the vividness of here does not imply the unreality of there, so why should the vividness of now imply the unreality of then? Regarding (ii) and the openness of the future, the block theory allows a closed future and the absence of libertarian free will, but it does not require this. Eventually, there will be one future, regardless of whether that future is now open or closed, and that is what constitutes the future portion of the block that has not happened yet.
“Do we all not fear impending doom?” an eternalist might ask. But according to presentism and the growing-block theory, why should we have this fear if the future doom is known not to exist, as these two kinds of theorists evidently believe? Implicitly accepting this argument in 1981, J.J.C. Smart, who is a proponent of the block-universe, asked us to:
conceive of a soldier in the twenty-first century…cold, miserable and suffering from dysentery, and being told that some twentieth-century philosophers and non-philosophers had held that the future was unreal. He might have some choice things to say.
All observation is of the past. If you look at the North Star, you see it as it was, not as it is, because the light takes so many years to reach your eyes, about 434 years. The North Star might have burned out several years ago. If so, then you are seeing something that does not exist, according to the presentist. That is puzzling. Eternalism with the block theory provides a way out of the puzzle: you are seeing an existing time-slice of the 4D block that is the North Star.
Determinism for a system is the thesis that specifying the state of the system at one time fixes how the system evolves forward in time. So, the present state determines each future state, and the state at a past time determines the present. By “determines,” we mean determines by rules or laws. Determinism implies that no event is purely random. Here is a commonly offered defense of the block-universe theory against the charge that it entails determinism:
The block universe is not necessarily a deterministic one. …Strictly speaking, to say that the occurrence of a relatively later event is determined vis à vis a set of relatively earlier events, is only to say that there is a functional connection or physical law linking the properties of the later event to those of the earlier events. …Now in the block universe we may have partial or even total indeterminacy—there may be no functional connection between earlier and later events (McCall 1966, p. 271).
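McCall’s point, that a completed block need not be law-governed, can be illustrated with a toy sketch (all names below are invented for the example): two complete histories are generated, one by a fixed rule and one by chance, and each is equally a finished, tenselessly describable record.

```python
# Toy illustration of McCall's point: a complete history ("block") can
# be produced by a deterministic rule or by pure chance. Either way the
# finished record is just a sequence of states, so the completeness of
# the block does not by itself imply determinism.

import random

def deterministic_history(initial, steps):
    """Each state fixes the next by a law: next = 2 * current + 1."""
    history = [initial]
    for _ in range(steps):
        history.append(2 * history[-1] + 1)
    return history

def indeterministic_history(initial, steps):
    """No law links successive states; each is drawn at random."""
    history = [initial]
    for _ in range(steps):
        history.append(random.randint(0, 9))
    return history

print(deterministic_history(1, 5))    # [1, 3, 7, 15, 31, 63]
print(indeterministic_history(1, 5))  # e.g. [1, 4, 0, 9, 2, 7]
```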
One defense of the block theory against Bergson’s charge that it inappropriately spatializes time is to point out that when we graph the color of eggs sold against the location of the sales, no one complains that we are inappropriately spatializing egg color. The issues of spatialization and determinism reflect a great philosophical divide between those who believe the geometrical features of spacetime provide an explanation of physical phenomena or instead provide only a representation or codification of those phenomena.
Challenging the claim that the block universe theory must improperly spatialize time, but appreciating the point made by Bergson that users of the block universe can make the mistake of spatializing time, the pragmatist and physicist Lee Smolin says,
By succumbing to the temptation to conflate the representation with the reality and [to] identify the graph of the records of the motion with the motion itself, these scientists have taken a big step toward the expulsion of time from our conception of nature.
The confusion worsens when we represent time as an axis on a graph…This can be called spatializing time.
And the mathematical conjunction of the representations of space and time, with each having its own axis, can be called spacetime. The pragmatist will insist that this spacetime is not the real world. It’s entirely a human invention, just another representation…. If we confuse spacetime with reality, we are committing a fallacy, which can be called the fallacy of the spatialization of time. It is a consequence of forgetting the distinction between recording motion in time and time itself.
Once you commit this fallacy, you’re free to fantasize about the universe being timeless, and even being nothing but mathematics. But, the pragmatist says, timelessness and mathematics are properties of representations of records of motion—and only that.
For a survey of defenses for presentism and the growing-past theories, see (Putnam 1967), (Saunders 2002), (Markosian 2003), (Savitt 2008), and (Miller 2013, pp. 354-356).
b. The Present
The present is what we are referring to when we use the word “now.” The temporal word now changes its reference every instant but not its meaning. Obviously there is a present, many people say, because it is so different from the past. The majority position among physicists is that the present is not an objective feature of reality. It is a mind-dependent or sociological feature, depending on a human convention about which clock and reference frame to use. (Everyone in this dispute agrees that it can make an important difference to your life whether it is presently noon or presently midnight.)
A-theorists, unlike B-theorists, believe the present is a metaphysically-privileged instant that is fundamental, spatially-extended, and global (applying to the entire cosmos). The A-theorists favor the claim that the present is objectively real; the B-theorists say it is subjective because everyone and everything has its own personal time, so there can be no fact of the matter as to which person’s present is the real present. Relativity theory implies that what is happening now is relative to a chosen reference frame. The present is always different for two people moving toward or away from each other.
Let us consider some arguments in favor of the objectivity of the present, the reality of now. One is that the now is so much more vivid to everyone than all other times. Past and future events are dim by comparison. Proponents of an objective present say that if scientific laws do not recognize this vividness and the objectivity of the present, then there is a defect within science. Einstein considered this argument and rejected it. The philosopher of science Tim Maudlin accepts it, and he hopes to find a way to revise relativity theory so it allows a universal present for each instant.
One counter to Einstein is that there is so much agreement among people about what is happening now and what is not. Is that not a sign that the now is objective, not subjective? This agreement is reflected within our natural languages where we find evidence that a belief in the now is ingrained in our language. It is unlikely that it would be so ingrained if it were not correct to believe it.
What have B-theorists said in response? Well, regarding vividness, we cannot now step outside our present experience and compare its vividness with the experience of past presents and future presents. Yet that is what needs to be done for a fair comparison. Instead, when we speak of the “vividness” of our present experience of, say, a leaf falling in front of us, all we can do is compare our present experience of the leaf with our dim memories of leaves falling, and with even dimmer expectations of leaves yet to fall. So, the comparison is unfair; the vividness of future events should be assessed, says the critic, by measuring those future events when they happen and not merely by measuring present expectations of those events before they happen.
In another attempt to undermine the vividness argument, the B-theorist points out that there are empirical studies by cognitive psychologists and neuroscientists showing that our judgment about what is vividly happening now is plastic and can be affected by our expectations and by what other experiences we are having at the time. For example, we see and hear a woman speaking to us from across the room; then we construct an artificial now, in which hearing her speak and seeing her speak happen at the same time. But they do not really happen at the same time, so we are playing a little trick on ourselves. The acoustic engineer assures us we are mistaken because the sound traveled much slower than the light. Proponents of the manifest image of time do not take travel time into account and mistakenly suppose there is a common global present and suppose that what is happening at present is everything that could in principle show up in a still photograph taken with light that arrives with infinite speed.
When you speak on the phone with someone two hundred miles away, the conversation is normal because the two of you seem to share a common now. But that normalcy is only apparent because the phone signal travels the two hundred miles so quickly. During a phone conversation with someone much farther away, say on the Moon, you would notice a strange 1.3 second time lag because the Moon is 1.3 light seconds away from Earth. Suppose you were to look at your correct clock on Earth and notice it is midnight. What time would it be on the Moon, according to your clock? This is not a good question. A more sensible question is, “What events on the Moon are simultaneous with midnight on Earth, according to my clock?” You cannot look and see immediately. You will have to wait at least 1.3 seconds because it takes any signal that long to reach you from the Moon. If an asteroid were to strike the Moon, and you were to see the explosion through your Earth telescope at 1.3 seconds after midnight, then you could compute later that the asteroid must have struck the Moon at midnight. If you want to know what is presently happening on the other side of the Milky Way, you will have a much longer wait. So, the moral is that the collection of events comprising your present is something you have to compute; you cannot directly perceive those events at once.
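A minimal sketch of the computation just described (in Python, using the 1.3-light-second figure from above; the function name is my own):

```python
# Recover when a distant event occurred from when its light arrived.
# Times are seconds after midnight; distances are in light-seconds.

LIGHT_SECONDS_TO_MOON = 1.3

def emission_time(arrival_time, distance_light_seconds):
    """The event happened this long before its signal reached us."""
    return arrival_time - distance_light_seconds

# The asteroid explosion is seen 1.3 seconds after midnight:
print(emission_time(1.3, LIGHT_SECONDS_TO_MOON))  # 0.0, i.e., at midnight
```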
To continue advancing a pro-B-theory argument against an objective present, notice the difference in time between your clock which is stationary on Earth and the time of a pilot using a clock in a spaceship that is flying by you at high speed. Assume the spaceship flies very close to you and that the two clocks are synchronized and are working perfectly and they now show the time is midnight at the flyby. According to the special theory of relativity, the collection of events across the universe that you eventually compute and say occurs now at midnight, necessarily must be very different from the collection of events that the spaceship traveler computes and says occurs at midnight. You and the person on the spaceship probably will not notice much of a difference for an event at the end of your street or even for an event on another continent, but you will begin to notice the difference for an event on the Moon and even more so for an event somewhere across the Milky Way or, worse yet, for an event in the Andromeda galaxy.
When two people disagree about what events are present events because the two are in motion relative to each other, the direction of the motion makes a significant difference. If the spaceship is flying toward Andromeda and away from you, then the spaceship’s now (what it judges to be a present event) would include events on Andromeda that occur thousands of years in your future. If the spaceship is flying away from Andromeda, the spaceship’s now would include events on Andromeda that occurred thousands of years before you were born. Also, the difference in nows is more extreme the faster the spaceship’s speed as it flies by you. The implication, says the B theorist, is that there are a great many different nows and nobody’s now is the only correct one.
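The size of the disagreement can be estimated with the standard relativity-of-simultaneity formula, Δt = vD/c². Here is a minimal sketch (Python; the distance figure is the usual rough estimate for Andromeda):

```python
# Relativity of simultaneity: an observer passing you at speed v judges
# events at distance D to be simultaneous with the flyby if they are
# offset by v * D / c^2 from your own judgment of simultaneity.
# With D in light-years and times in years, c = 1, so the offset in
# years is simply (v/c) * D.

def now_offset_years(v_over_c, distance_light_years):
    return v_over_c * distance_light_years

ANDROMEDA_LY = 2.5e6   # rough distance to the Andromeda galaxy

print(now_offset_years(1e-3, ANDROMEDA_LY))  # 2500.0 years at 0.1% of light speed
print(now_offset_years(0.5, ANDROMEDA_LY))   # 1250000.0 years at half light speed
```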
To make a similar point in the language of mathematical physics, something appropriately called a now would be an equivalence class of events that occur at the same time. But because Einstein showed that time is relative to reference frame, there are different nows for different reference frames, so the notion of now is not frame-independent and thus is not objective, contra the philosophical position of the A-theorist.
When the B-theorist says there is no fact of the matter about whether a distant explosion has happened, the A-theorist will usually disagree and say that, regardless of the limits on your knowledge, either the explosion has occurred by now or it has not.
For ordinary discussions about events on Earth, a reference frame is customarily used in which the Earth is not moving. And since we all move at slow speeds relative to each other on Earth and do not experience very different gravitational forces and don’t consider very distant phenomena, we can agree for practical purposes on Earth about what is simultaneous with what.
Opponents of an objective present frequently point out that none of the fundamental laws of physics pick out a present moment. Scientists frequently do apply some law of science while assigning, say, t0 to be the temporal coordinate of the present moment, then they go on to calculate this or that. This insertion of the fact that some value of the time variable t is the present time is an initial condition of the situation to which the law is being applied, and is not part of the law itself. The basic laws themselves treat all times equally. If science’s laws do not need the present, then it is not real, say the B theorists. The counterargument is that it is the mistake of scientism to suppose that if something is not in our current theories, then it must not be real. France is real, but it is not mentioned in any scientific law.
In any discussion about whether the now is objective, one needs to remember that the term objective has different senses. There is objective in the sense of not being relative to the reference frame, and there is objective in the sense of not being mind-dependent, and there is objective in the sense of not being anthropocentric. Proponents of the B-theory say the now is not objective in any of these senses.
There is considerable debate in the philosophical literature about whether the present moments are so special that the laws should somehow recognize them. It is pointed out that even Einstein said, “There is something essential about the Now which is just outside the realm of science.” In 1925, the influential philosopher of science Hans Reichenbach criticized the block theory’s treatment of the present:
In the condition of the world, a cross-section called the present is distinguished; the ‘now’ has objective significance. Even when no human being is alive any longer, there is a ‘now’….
This claim has met stiff resistance. Earlier, in 1915, Bertrand Russell had objected to giving the present any special ontological standing:
In a world in which there was no experience, there would be no past, present, or future, but there might well be earlier and later (Russell 1915, p. 212).
Later, Rudolf Carnap added that a belief in the present is a matter for psychology, not physics.
The B-camp says belief in a global now is a product of our falsely supposing that everything we see is happening now, when actually we are not factoring in the finite speed of light and sound. Proponents of the non-objectivity of the present frequently claim that a proper analysis of time talk should treat the phrases the present and now as indexical terms which refer to the time at which the phrases are uttered by the speaker, and so their relativity to us speakers shows the essential subjectivity of the present. A-theorists do not accept these criticisms.
There are interesting issues about the now in the philosophy of religion. For one example, Norman Kretzmann has argued that if God is omniscient, then He knows what time it is, and to know this, says Kretzmann, God must always be changing because God’s knowledge keeps changing. Therefore, there is an incompatibility between God’s being omniscient and God’s being immutable.
Disagreement about the now is an ongoing feature of debate in the philosophy of time, and there are many subtle moves made by advocates on each side of the issue. (Baron 2018) provides a broad overview of the debate about whether relativistic physics disallows an objective present. For an extended defense of the claim that the now is not subjective and that there is temporal becoming, see (Arthur 2019).
c. Persistence, Four-Dimensionalism, and Temporal Parts
Eternalism differs from four-dimensionalism. Eternalism is the thesis that the present, past, and future are equally real, whereas four-dimensionalism says the ontologically basic objects are four-dimensional events and that the ordinary objects referred to in everyday discourse are three-dimensional slices of 4-d spacetime. However, most four-dimensionalists do accept eternalism, and almost all eternalists and four-dimensionalists accept McTaggart’s B-theory of time.
Four-dimensionalism does not imply that time is a spatial dimension. When a four-dimensionalist represents time relative to a reference frame in a four-dimensional diagram, say, a Minkowski diagram, time is a special one of the four dimensions of this mathematical space, not an arbitrary one. Using this representation technique does not commit the four-dimensionalist to the claim that real, physical space itself is four-dimensional, but only that spacetime is.
Four-dimensionalists take a stand on the philosophical issue of endurance vs. perdurance. Some objects last longer than others, so we say they persist longer. But there is no philosophical consensus about how to understand persistence. Objects are traditionally said to persist by enduring over some time interval. At any time during the interval the whole of the object exists. Not so for perduring objects. Perduring objects are said, instead, to persist by perduring. They do not exist wholly at a single instant but rather exist over a stretch of time. These objects do not pass through time; they do not endure; instead, they extend through time. A football game does not wholly exist at one instant; it extends over an interval of time. The issue is whether we can or should say the same for electrons and people. Technically expressed, the controversial issue is whether or not persisting things are (or are best treated as) divisible into temporal parts.
The perduring object persists by being the sum or fusion of a series of its temporal parts (also called its temporal stages). Instantaneous temporal parts are called temporal slices and time slices. For example, a forty-year-old man might be treated as being a four-dimensional perduring object consisting of the three temporal stages we call his childhood, his middle age, and his future old age. On a liberal definition of the term, such as the one quoted below, even his right arm across those forty years counts as a temporal part.
Although the concept of temporal parts is more likely to be used by a four-dimensionalist, here is a definition of the concept from Judith Jarvis Thomson in terms of three-dimensional objects:
Let object O exist at least from time t0 to time t3. A temporal part P of O is an object that begins to exist at some time t1, where t1 ≥ t0, and goes out of existence at some time t2, where t2 ≤ t3, and takes up some portion of the space that O takes up for all the time that P exists.
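Thomson’s clauses are simple enough to check mechanically. Below is a toy formalization (Python; the interval-plus-regions representation is my own simplification and ignores subtleties about spatial occupancy):

```python
from dataclasses import dataclass

@dataclass
class Obj:
    start: float        # time the object begins to exist
    end: float          # time the object goes out of existence
    regions: frozenset  # labels for the space it occupies while it exists

def is_temporal_part(P: Obj, O: Obj) -> bool:
    """Toy rendering of Thomson's definition: P's lifetime falls within
    O's, and P occupies some of O's space whenever P exists."""
    return P.start >= O.start and P.end <= O.end and bool(P.regions & O.regions)

man = Obj(start=0, end=80, regions=frozenset({"torso", "right arm"}))
childhood = Obj(start=0, end=12, regions=frozenset({"torso", "right arm"}))
arm = Obj(start=0, end=80, regions=frozenset({"right arm"}))

print(is_temporal_part(childhood, man))  # True
print(is_temporal_part(arm, man))        # True: the liberal definition admits it
```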
Four-dimensionalists, by contrast, think of physical objects as regions of spacetime and as having temporal parts that extend along all four dimensions of the object. A more detailed presentation of these temporal parts should say whether four-dimensional objects have their spatiotemporal parts essentially.
David Lewis offers the following, fairly well-accepted definitions of perdurance and endurance:
Something perdures iff it persists by having different temporal parts, or stages, at different times, though no one part of it is wholly present at more than one time; whereas it endures iff it persists by being wholly present at more than one time.
The term “iff” stands for “if and only if.” Given a sequence of temporal parts, how do we know whether they compose a single perduring object? One answer, given by Hans Reichenbach, Ted Sider, and others, is that they compose a single object if the sequence falls under a causal law so that temporal parts of the perduring object cause other temporal parts of the object. Philosophers of time with a distaste for the concept of causality oppose this answer.
According to David Lewis in On the Plurality of Worlds, the primary argument for perdurantism is that it has an easier time solving what he calls the problem of temporary intrinsics, of which the Heraclitus Paradox is one example. The Heraclitus Paradox is the problem, first introduced by Heraclitus of ancient Greece, of explaining our not being able to step into the same river twice because the water is different the second time. The mereological essentialist agrees with Heraclitus, but our common sense says Heraclitus is mistaken because people often step into the same river twice. Who is really making the mistake?
The advocate of endurance has trouble showing that Heraclitus is mistaken, says Lewis. We do not step into two different rivers, do we? They are the same river. Yet the river has two different intrinsic properties, namely being a collection of water we stepped in the first time and a collection of water we stepped in the second time; but, by Leibniz’s Law of the Indiscernibility of Identicals, identical objects cannot have different intrinsic properties. So, the advocate of endurance has trouble escaping the Heraclitus Paradox. So does the mereological essentialist.
A 4-dimensionalist who advocates perdurance says the proper metaphysical analysis of the Heraclitus Paradox is that we can step into the same river twice by stepping into two different temporal parts of the same 4-dimensional river. Similarly, we cannot see a football game at a moment; we can see only a momentary temporal part of the 4D game.
For more examination of the issue with detailed arguments for and against perdurance and endurance, see (Wasserman, 2018), (Carroll and Markosian 2010, pp. 173-7), and especially the article “Persistence in Time” in this encyclopedia.
d. Truth-Values of Tensed Sentences
The above disputes about presentism, the growing-past theory, and the block theory have taken a linguistic turn by focusing upon a related question about language: “Are predictions true or false at the time they are uttered?” Those who believe in the block-universe (and thus in the determinate reality of the future) will answer “Yes,” while a “No” will be given by presentists and advocates of the growing-past.
The issue is whether contingent sentences uttered now about future events are true or false now rather than true or false only in the future at the time the predicted event is supposed to occur. For example, suppose someone says, “Tomorrow the admiral will start a sea battle.” And suppose that the next day the admiral does order a sneak attack on the enemy ships which starts a sea battle. The eternalist says that, if this is so, then the sentence token about the sea battle was true yesterday at the time it was uttered. Truth is eternal or fixed, eternalists say, and the predicate is true is a timeless predicate, not one that merely means is true now. The sentence spoken now has a truth-maker within the block at a future time, even though the event has not yet happened and so the speaker has no access to that truthmaker. These B-theory philosophers point favorably to the ancient Greek philosopher Chrysippus who was convinced that a contingent sentence about the future is simply true or false, even if we do not know which.
Many other philosophers, usually in McTaggart’s A-camp, agree with Aristotle’s suggestion that the sentence about the future sea battle is not true (or false) until the battle occurs (or does not). Predictions fall into the truth-value gap. This position that contingent sentences have no classical truth-values when uttered is called the doctrine of the open future and also the Aristotelian position because many researchers throughout history have taken Aristotle to have been holding the position in chapter 9 of his On Interpretation—although today it is not so clear that Aristotle himself held the position.
One principal motive for adopting the Aristotelian position arises from the belief that, if sentences about future human actions are now true, then humans are determined to perform those actions, and so humans have no free will. To defend free will, we must deny truth-values to predictions.
This Aristotelian argument against predictions being true or false has been discussed as much as any in the history of philosophy, and it faces a series of challenges. First, if there really is no free will, or if free will is compatible with determinism, then the motivation to deny truth-values to predictions is undermined.
Second, according to many compatibilists (though not all), your choices do affect the world, as the libertarians believe they must; but, if it is true now that you will perform an action in the future, it does not follow that you will not perform it freely, nor that you would not have been free to do otherwise had your intentions been different, but only that you will not do otherwise. For more on this point about modal logic, see the discussion of it in Foreknowledge and Free Will.
A third challenge, from Quine and others, claims the Aristotelian position wreaks havoc with the logical system we use to reason and argue with predictions. For example, here is a deductively valid argument, presumably:
If there will be a sea battle tomorrow, then we should wake up the admiral.
There will be a sea battle tomorrow.
So, we should wake up the admiral.
Without both premises in this argument having truth-values, that is, being true or false, we cannot properly assess the argument using the usual standards of deductive validity because this standard is about the relationships among truth-values of the component sentences—that a valid argument cannot possibly have true premises and a false conclusion. Unfortunately, the Aristotelian position says that some of these component sentences are neither true nor false. So, logic does not apply. Surely, then, the Aristotelian position is implausible.
In reaction to this third challenge, proponents of the Aristotelian argument say that if Quine would embrace tensed propositions and expand his classical logic to a tense logic, he could avoid those difficulties in assessing the validity of arguments that involve sentences having future tense.
Quine has claimed that analysts of our talk involving time should in principle be able to eliminate temporal indexical words such as now and tomorrow, because their removal is needed for the fixed truth and falsity of our sentences. A sentence’s truth-value is fixed in the sense of being eternal when the sentence is complete: its indexicals and indicator words have been replaced by expressions for specific times, places, and names, and its verbs are treated as timeless and tenseless, so its truth-value is not relative to the situation and time of utterance. Having fixed truth-values is crucial for the logical system used to clarify science. “To formulate logical laws in such a way as not to depend thus upon the assumption of fixed truth and falsity would be decidedly awkward and complicated, and wholly unrewarding,” says Quine. For a criticism of Quine’s treatment of indexicals, see (Slater 2012, p. 72).
Philosophers are divided on all these issues.
e. Essentially-Tensed Facts
Using a tensed verb is a grammatical way of locating an event in time. All the world’s cultures have a conception of time, but only half the world’s languages use tenses. English has tenses, but the Chinese, Burmese, and Malay languages do not. The English language distinguishes “Her death has happened” from “Her death will happen.” However, English also expresses time in other ways: with the adverbial phrases now and twenty-three days ago, with the adjective phrases new and ancient, and with the prepositions until and since.
Philosophers have asked what we are basically committed to when we use tense to locate an event in time. There are two principal answers: tenses are objective, and tenses are subjective. The two answers have given rise to two competing camps of philosophers of time.
The first answer is that tenses represent objective features of reality that are not captured by the B-theory, nor by eternalism, nor by the block-universe approach. This philosophical theory is said to “take tense seriously” and is called the tensed theory of time. The theory claims that, when we learn the truth-values of certain tensed sentences, we obtain knowledge which tenseless sentences do not and cannot provide, for example, that such and such a time is the present time. Tenses are almost the same as what is represented by positions in McTaggart‘s A-series, so the theory that takes tense seriously is commonly called the A-theory of tense, and its advocates are called tensers.
A second, contrary answer to the question of the significance of tenses is that they are merely subjective. Tensed terms have an indexical feature which is specific to the subject doing the speaking, but this feature has no ontological significance. Saying the event happened rather than is happening indicates that the subject or speaker said this after the event happened rather than before or during the event. Tenses are about speakers, not about some other important ontological characteristic of time in the world. This theory is the B-theory of tense, and its advocates are called detensers. The detenser W.V.O. Quine expressed the position this way:
Our ordinary language shows a tiresome bias in its treatment of time. Relations of date are exalted grammatically…. This bias is of itself an inelegance, or breach of theoretical simplicity. Moreover, the form that it takes—that of requiring that every verb form show a tense—is peculiarly productive of needless complications, since it demands lipservice to time even when time is farthest from our thoughts. Hence in fashioning canonical notations it is usual to drop tense distinctions (Word and Object §36).
The philosophical disagreement about tenses is not so much about tenses in the grammatical sense, but rather about the significance of the distinctions of past, present, and future which those tenses are used to mark.
The controversy is often presented as a controversy about whether tensed facts exist, with advocates of the tenseless theory objecting to tensed facts and advocates of the tensed theory promoting them as essential. The primary function of tensed facts is to make tensed sentences true, to be their truthmakers.
The B-theorist says tensed facts are not needed to account for why tensed sentences get the truth values they do.
Consider the tensed sentence, “Queen Anne of Great Britain died.” The A-theorist says the truthmaker is simply the tensed fact that the death has pastness. The B-theorist gives a more complicated answer by saying the truthmaker is the fact that the time of Queen Anne’s death is-less-than the time of uttering the above sentence. Notice that the B-answer does not use any words in the past tense. According to the classical B-theorist, the use of tense (and more importantly, any appeal to tensed facts) is an extraneous and eliminable feature of our language at the fundamental level, as are all other uses of the terminology of the A-series (except in trivial instances such as “The A-series is constructed using A-facts”).
This B-theory analysis is challenged by the tenser’s A-theory on the grounds that it can succeed only for utterances or readings or inscriptions; the A-theorist points out that a proposition can be true even if never uttered, never read, and never inscribed.
There are other challenges to the B-theory. Roderick Chisholm and A.N. Prior claim that the word “is” in the sentence “It is now midnight” is essentially present-tensed because there is no adequate translation using only tenseless verbs. Trying to give a B-style analysis of it, such as, “There is a time t such that t = midnight,” is to miss the essential reference to the present in the original sentence because the original sentence is not always true, but the sentence “There is a time t such that t = midnight” is always true. So, the tenseless analysis fails. There is no escape from this criticism by adding “and t is now” because this last indexical phrase needs its own analysis, and we are starting a vicious regress. John Perry famously explored this argument in his 1979 article, “The Problem of the Essential Indexical.”
Prior, in (Prior 1959), supported the tensed A-theory by arguing that after experiencing a painful event,
one says, e.g., “Thank goodness that’s over,” and [this]…says something which it is impossible that any use of a tenseless copula with a date should convey. It certainly doesn’t mean the same as, e.g., “Thank goodness the date of the conclusion of that thing is Friday, June 15, 1954,” even if it be said then. (Nor, for that matter, does it mean “Thank goodness the conclusion of that thing is contemporaneous with this utterance.” Why should anyone thank goodness for that?).
Prior’s criticism of the B-theory involves the reasonableness of our saying of some painful, past event, “Thank goodness that is over.” The B-theorist cannot explain this reasonableness, he says, because no B-theorist should thank goodness that the end of their pain happens before their present utterance of “Thank goodness that is over,” since that B-fact or B-relationship is timeless; it has always held and always will. The only way then to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event is in the past, that is, we are thankful for the pastness. But if so, then the A-theory is correct and the B-theory is incorrect.
One B-theorist response is simply to disagree with Prior that it is improper for a B-theorist to thank goodness that the end of their pain happens before their present utterance, even though this is an eternal B-fact. Still another response from the B-theorist comes from the 4-dimensionalist who says that as 4-dimensional beings it is proper for us to care more about our later time-slices than our earlier time-slices. If so, then it is reasonable to thank goodness that the time slice at the end of the pain occurs before the time slice in which we are saying, “Thank goodness that is over.” Admittedly this is caring about an eternal B-fact. So, Prior’s premise [that the only way to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event has pastness] is a faulty premise, and Prior’s argument for the A-theory is unsuccessful.
D.H. Mellor and J.J.C. Smart, both proponents of the B-theory, agree that tensed talk is important, and can be true, and even be essential for understanding how we think and speak; but Mellor and Smart claim that tensed talk is not essential for describing extra-linguistic reality and that the extra-linguistic reality does not contain tensed facts corresponding to true, tensed talk. These two philosophers, and many other philosophers who “do not take tense seriously,” advocate a newer tenseless B-theory by saying the truth conditions of any tensed, declarative sentence can be explained without tensed facts even if Chisholm and Prior and other A-theorists are correct that some tensed sentences in English cannot be adequately translated into tenseless ones.
The truth conditions of a sentence are the conditions which must be satisfied in the world in order for the sentence to be true. The sentence “Snow is white” is true on the condition that snow is white. More particularly, it is true if whatever is referred to by the term ‘snow’ satisfies the predicate ‘is white’. Regarding if-then sentences, the sentence “If it is snowing, then it is cold” is true on the condition that it is not both true that it is snowing and false that it is cold. Other analyses are offered for the truth conditions of sentences that are more complex grammatically. Alfred Tarski provided such analyses in his semantic theory of truth.
Mellor and Smart agree that truth conditions can adequately express the meaning of tensed sentences or all that is important about the meaning when it comes to describing objective reality. This is a philosophically controversial point, but Mellor and Smart accept it, and argue that therefore there is really no need for tensed facts and tensed properties. The untranslatability of some tensed sentences merely shows a fault with ordinary language‘s ability to characterize objective, tenseless reality. If the B-theory, in accounting for the truth conditions of an A-sentence, fails to account for the full meaning of the A-sentence, then this is because of a fault with the A-sentence, not the B-theory.
Let us make the same point in other words. According to the newer B-theory of Mellor and Smart, if I am speaking to you and say, “It is now midnight,” then this sentence admittedly cannot be translated into tenseless terminology without some loss of meaning, but the truth conditions can be explained fully with tenseless terminology. The truth conditions of “It is now midnight” are that my utterance occurs (in the tenseless sense of occurs) at very nearly the same time as your hearing the utterance, which in turn is the same time as when our standard clock declares the time to be midnight in our reference frame. In brief, it is true just in case it is uttered at midnight. Notice that no tensed facts are appealed to in this explanation of the truth conditions.
Similarly, an advocate of the new tenseless theory will say it is not the pastness of the painful event that explains why I say, “Thank goodness that’s over” after exiting the dentist’s chair. I say it because I believe that the time of the occurrence of that utterance is greater than the time of the occurrence of the painful event, and because I am glad about this; and even though it was true even last month that the one time occurred before the other, I am happy to learn this. Of course, I would be even gladder if there were no pain at any time. I may not be consciously thinking about the time of the utterance when I make it; nevertheless, that time is what helps explain what I am glad about. Being thankful for the pastness of the painful event provides a simpler explanation, actually a simplistic explanation, but not a better explanation.
In addition, it is claimed by Mellor and other new B-theorists that tenseless sentences can be used to explain the logical relations between tensed sentences; they can be used to explain why one tensed sentence implies another, is inconsistent with yet another, and so forth. According to this new theory of tenseless time, once it is established that the truth conditions of tensed sentences can be explained without utilizing tensed facts, then Ockham’s Razor is applied. If we can do without essentially-tensed facts, then we should say essentially-tensed facts do not exist.
To summarize, tensed facts were presumed by the A-theory to be needed to be the truthmakers for the truth of tensed talk; but proponents of the new B-theory claim their analysis shows that ordinary tenseless facts are adequate. The B-theory concludes that we should “not take tense seriously” in the sense of requiring tensed facts to account for the truth and falsity of sentences involving tenses because tensed facts are not actually needed.
Proponents of the tensed theory of time do not agree with this conclusion. They will insist there are irreducible A-properties and that what I am glad about when a painful event is over is that the event is earlier than now, that is, has pastness. Quentin Smith says, more generally, that the “new tenseless theory of time is faced with insurmountable problems, and that it ought to be abandoned in favor of the tensed theory.”
The advocate of the A-theory E.J. Lowe opposed the B-theory because it conflicts so much with the commonsense image of time:
I consider it to be a distinct merit of the tensed view of time that it delivers this verdict, for it surely coincides with the verdict of common sense (Lowe, 1998, p. 104).
Lowe argued that no genuine event can satisfy a tenseless predicate, and no truth can be made true by B-theory truth conditions because all statements of truth conditions are tensed.
So, the philosophical debate continues over whether tensed concepts have semantical priority over untensed concepts, and whether tensed facts have ontological priority over untensed facts.
15. The Arrow of Time
If you are shown an ordinary movie and also shown the same movie running in reverse, you have no trouble telling which is which because it is so easy for you to detect the one in which time’s arrow is pointing improperly—the improper movie would be the one in which your omelet turns into unbroken eggs and everyone walks backwards up their steps. Philosophers of physics want to know the origin and nature of this arrow. There is considerable disagreement about what it is, what counts as an illustration of it, how to explain it, and even how to define the term “arrow of time” and related, conceptually-tricky terms.
The main two camps disagree about whether (1) there is an intrinsic arrow of time itself that is perhaps due to its flow or to more events becoming real, or instead (2) there is only an extrinsic arrow due to so many of nature’s processes spontaneously going in only one direction. Those in the intrinsic camp often accuse those in the other camp of scientism; those in the extrinsic camp often accuse those in the other camp of subjectivism and an over-emphasis on the phenomenology of temporal awareness.
Arthur Eddington first used the term “time’s arrow” in 1927. The presence of the arrow implies, among other things, that tomorrow will be different from today in many ways: people grow older rather than younger; metal naturally rusts but does not un-rust; apples fall down from the apple tree, never up to the tree. Hopefully, all this can be explained and not simply assumed. To do this, there must be some assumption somewhere that is time-asymmetric, that prefers one direction in time to the other. In the search for that assumption, some recommendations are: (a) to find a significant, fundamental law of physics that requires one-way behavior in time, or (b) to assume a special feature at the origin of time that directs time to start out going in only one direction and keep going that way, or (c) to assume arrow-ness or directedness is an intrinsic but otherwise inexplicable feature of time itself.
The universe is filled with one-way processes; these are all macroprocesses, not micro-physical ones. At the most fundamental, micro-physical level, nearly all the significant laws of physics reveal no requirement that any process must go one way in time rather than the reverse. The exceptions involve rare particle decays involving weak interactions that all experts agree have nothing to do with time’s overall arrow.
Many experts in the extrinsic camp suggest that the presence of time’s arrow is basically a statistical issue involving increased disorder and randomization (the technical term is “increased entropy”) plus a special low-entropy configuration of nature early in the cosmic big bang, with the target of the arrow being thermodynamic equilibrium in the very distant future when the universe’s average temperature approaches absolute zero. These experts point to the second law of thermodynamics as the statistical law that gives a quantitative description of entropy increase. Experts in the intrinsic camp disagree with this kind of explanation of the arrow. They say the one-way character of time is not fundamentally a statistical issue involving processes but rather is intimately tied to the passage of time itself, to its intrinsic and uninterrupted flow or passage.
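The statistical story the extrinsic camp appeals to can be illustrated with a standard toy model, the Ehrenfest urn, sketched here in Python: molecules start crowded into one half of a box (a low-entropy configuration), a randomly chosen molecule hops to the other half at each step, and the count drifts toward the balanced, maximum-entropy configuration and stays near it.

```python
import random
from math import comb, log

N = 100       # number of molecules
left = N      # low-entropy start: all molecules in the left half

def entropy(k):
    """Log of the number of microstates with k molecules on the left
    (Boltzmann entropy with the constant k_B set to 1)."""
    return log(comb(N, k))

for step in range(2001):
    if random.randrange(N) < left:   # a randomly chosen molecule hops
        left -= 1
    else:
        left += 1
    if step % 500 == 0:
        print(step, left, round(entropy(left), 2))

# 'left' wanders toward N/2, where entropy is maximal; reversals are
# possible but overwhelmingly improbable for large N.
```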
There is a wide variety of special kinds of processes with their own mini-arrows. The human mind can know the past more easily than the future (the knowledge arrow). Heat flows from hot to cold (the thermodynamic arrow). The cosmos expands and does not shrink (the cosmological arrow). Light rays expand away from a light bulb rather than converge into it (the electromagnetic arrow). We remember the past, not the future (the memory arrow). These mini-arrows are deep and interesting asymmetries of nature, and philosophers and physicists would like to know how the mini-arrows are related to each other. This is called the taxonomy problem.
Some philosophers have even asked whether there could be distant regions of space and time where time’s arrow points in reverse compared to our arrow. If so, would adults there naturally walk backwards on the way to their infancy while they remember the future?
16. Temporal Logic
Temporal logic is the representation of reasoning about time and temporal information by using the methods of symbolic logic in order to formalize which statements imply which others. For example, in McTaggart’s B-series, the most important relation is the happens-before relation on events. Logicians have asked what sort of principles this relation must obey in order to properly account for our reasoning about time.
Here is one suggestion. Consider this informally valid reasoning:
Alice’s arrival at the train station happens before Bob’s. Therefore, Bob’s arrival at the station does not happen before Alice’s.
Let us translate this into classical predicate logic using a domain of instantaneous events, where the individual constant ‘a‘ denotes Alice’s arrival at the train station, and ‘b‘ denotes Bob’s arrival at the train station. Let the two-place or two-argument relation ‘Bxy‘ be interpreted as x happens before y—the key relation of McTaggart’s B-series. The direct translation of the above informal argument produces one premise with one conclusion:
Bab
——-
~Bba
The symbol ‘~’ is the negation operator; some logicians prefer to use the symbol ‘¬’ and others prefer ‘–’. Unfortunately, our simple formal argument is invalid. To make it valid, we can add a semantic principle about the happens-before relation, namely, the premise that the B relation is asymmetric. That is, we can add this additional premise to the argument:
∀x∀y[Bxy → ~Byx]
The symbol ‘∀x’ is the universal quantifier on the variable ‘x’. Some logicians prefer to use ‘(x)’ for the universal quantifier. The symbol ‘→’ is the conditional operator or if-then operator; some logicians prefer to use the symbol ‘⊃’ instead.
In other informally valid reasoning, we discover a need to make even more assumptions about the happens-before relation. For example, suppose Alice arrives at the train station before Bob, and suppose Bob arrives there before Carol. Is it valid reasoning to infer that Alice arrives before Carol? Yes, but if we translate directly into classical predicate logic we get this invalid argument:
Bab
Bbc
——-
Bac
To make this argument valid, we can add the premise that the happens-before relation is transitive, that is:
∀x∀y∀z [(Bxy & Byz) → Bxz]
The symbol ‘&’ represents the conjunction operation. Some logicians prefer to use the symbol ‘·’ for conjunction. The transitivity of B is a principle we may want to add to our temporal logic.
What other constraints should be placed on the B relation (when it is to be interpreted as the happens-before relation)? Here are some of the many suggestions; a sketch that tests several of them on a finite model appears after the list:
∀x∀y{Bxy → [t(x) < t(y)]}. If x happens before y, then the time coordinate of x is less than the time coordinate of y. ‘t‘ is a one-argument function symbol.
∀x~Bxx. An event cannot happen before itself.
∀x∀y{[t(x) ≠ t(y)] → [Bxy v Byx]}. Any two non-simultaneous events are connected by the B relation. That is, there are no temporally unrelated pairs of events. (In 1781 in his Critique of Pure Reason, Kant says this is an a priori necessary requirement about time.)
∀x∃yBxy. Time is infinite in the future.
∀x∀y(Bxy → ∃z(Bxz & Bzy)). B is dense in the sense that there is a third point event between any pair of non-simultaneous point events. This requirement prevents quantized time.
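As promised, here is a small sketch (Python; the three-event model is made up) that tests a candidate happens-before relation against several of the constraints above. Infinity and density, of course, admit no finite test.

```python
from itertools import product

events = {"a", "b", "c"}                    # hypothetical instantaneous events
B = {("a", "b"), ("b", "c"), ("a", "c")}    # candidate happens-before relation
t = {"a": 1, "b": 2, "c": 3}                # assigned time coordinates

irreflexive = all((x, x) not in B for x in events)
asymmetric  = all((y, x) not in B for (x, y) in B)
transitive  = all((x, z) in B
                  for (x, y1), (y2, z) in product(B, B) if y1 == y2)
connected   = all((x, y) in B or (y, x) in B
                  for x, y in product(events, events) if t[x] != t[y])
coordinates_ordered = all(t[x] < t[y] for (x, y) in B)

print(irreflexive, asymmetric, transitive, connected, coordinates_ordered)
# All five print True for this toy model.
```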
To incorporate the ideas of the theory of relativity, we might want to make the happens-before relation three-place instead of two-place by having it relate two events plus a reference frame.
When we formalized these principles of reasoning about the happens-before relation by translating them into predicate logic, we said we were creating temporal logic. However, strictly speaking, a temporal logic is just a theory of temporal sentences expressed in a formal logic. Calling it a logic, as is commonly done, is a bit of an exaggeration; it is analogous to calling the formalization of Peano’s axioms of arithmetic the development of number logic. Our axioms about B are not axioms of predicate logic, but only of a theory that uses predicate logic and that presumes the logic is interpreted on a domain of instantaneous events, and that presumes B is not open to re-interpretation as are the other predicate letters of predicate logic. That is, B is always to be interpreted as happens-before.
The more classical approach to temporal logic, however, does not add premises to arguments formalized in classical predicate logic as we have just been doing. The classical approach is via tense logic, a formalism that adds tense operators on propositions of propositional logic or predicate logic. A. N. Prior was the pioneer in the late 1950s. Michael Dummett and E. J. Lemmon also made major, early contributions to tense logic. Prior created this new logic to describe our reasoning involving time phrases such as now, happens before, twenty-three minutes afterward, at all times, and sometimes. He hoped that a precise, formal treatment of these concepts could lead to the resolution of some of the controversial philosophical issues about time.
Prior begins with an important assumption: that a proposition such as “Custer dies in Montana” can be true at one time and false at another time. That assumption is challenged by some philosophers, such as W.V.O. Quine, who recommended avoiding the use of this sort of proposition. He recommended that temporal logics use only sentences that are timelessly true or timelessly false.
Prior’s main original idea was to appreciate that time concepts are similar in structure to modal concepts such as it is possible that and it is necessary that. He adapted modal propositional logic for his tense logic by re-interpreting its propositional operators. Or we can say he added four new propositional operators. The list below provides examples of their intended interpretations using an arbitrary present-tensed proposition p.
Pp: “It has at some time been the case that p”
Fp: “It will at some time be the case that p”
Hp: “It has always been the case that p”
Gp: “It will always be the case that p”
‘Pp‘ might be interpreted also as at some past time it was the case that, or it once was the case that, or it once was that, all these being equivalent English phrases for the purposes of applying tense logic to English. None of the tense operators are truth-functional.
One standard system of tense logic is a variant of the S4.3 system of modal logic. In this formal tense logic, if p represents the present-tensed proposition “Custer dies in Montana,” then Pp represents “It has at some time been the case that Custer dies in Montana” which is equivalent in English to simply “Custer died in Montana.” So, we properly call ‘P‘ the past-tense operator. It represents a phrase that attaches to a sentence and produces another that is in the past tense.
Metaphysicians who are presentists are especially interested in this tense logic because, if presentists can make do with the variable p ranging only over present-tensed propositions, then this logic, with an appropriate semantics, may show how to eliminate any ontological commitment to the past (and future) while preserving the truth of past tense propositions that appear in biology books such as “There were dinosaurs” and “There was a time when the Earth did not exist.”
One axiom of this tense logic is the equivalence P(p v q) ↔ (Pp v Pq), where the symbol ‘v’ represents disjunction, the “or” operation. The axiom says that for any two propositions p and q, at some past time it was the case that p or q if and only if either at some past time it was the case that p or at some past time (perhaps a different past time) it was the case that q.
If p is the proposition “Custer dies in Montana” and q is “Sitting Bull dies in Montana,” then:
P(p v q) ↔ (Pp v Pq)
says:
Custer or Sitting Bull died in Montana if and only if either Custer died in Montana or Sitting Bull died in Montana.
The S4.3 system’s key axiom is the following equivalence. For all propositions p and q,
(Pp & Pq) ↔ [P(p & q) v P(p & Pq) v P(q & Pp)].
This axiom, when interpreted in tense logic, captures part of our ordinary conception of time as a linear succession of states of the world.
Another axiom of tense logic connects P with H, the operator It has always been the case that: a proposition p was true at some past time if and only if it has not always been the case that not-p. As a formula:
Pp ↔ ~H~p.
This axiom of tense logic is analogous to the modal logic axiom that p is possible if and only if it is not necessary that not-p.
A tense logic will need additional axioms in order to express q has been true for the past two weeks. Prior and others have suggested a wide variety of additional axioms for tense logic. It is controversial whether to add axioms that express the topology of time,
for example that it comes to an end or does not come to an end or that time is like a line instead of a circle; the reason usually given is that this is an empirical matter, not a matter for logic to settle.
Regarding a semantics for tense logic, Prior had the idea that the truth or falsehood of a tensed proposition could be expressed in terms of truth-at-a-time. For example, the proposition Pp (it was once the case that p) is true-at-a-time t if and only if p is true-at-a-time earlier than t. This suggestion has led to extensive development of the formal semantics for tense logic.
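Prior’s truth-at-a-time clauses are easy to sketch. In the toy Python model below, time is a finite sequence of integers, the valuation for the atomic proposition p is made up, and each tense operator quantifies over earlier or later times exactly as Prior suggested:

```python
TIMES = range(10)     # a toy discrete, linear order of times
p_true_at = {3}       # made-up valuation: p is true only at time 3

def true_at(formula, t):
    op, rest = formula[0], formula[1:]
    if op == "p":                     # atomic proposition
        return t in p_true_at
    if op == "P":                     # it was at some time the case that
        return any(true_at(rest, u) for u in TIMES if u < t)
    if op == "F":                     # it will at some time be the case that
        return any(true_at(rest, u) for u in TIMES if u > t)
    if op == "H":                     # it has always been the case that
        return all(true_at(rest, u) for u in TIMES if u < t)
    if op == "G":                     # it will always be the case that
        return all(true_at(rest, u) for u in TIMES if u > t)

print(true_at("Pp", 5))   # True: p holds at 3, which is earlier than 5
print(true_at("Fp", 5))   # False: p holds at no time later than 5
print(true_at("GPp", 4))  # True: at every later time, p has been the case
```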
Prior himself did not take a stand on which formal logic and formal semantics are correct for dealing with temporal expressions.
The concept of being in the past is usually treated by metaphysicians as a predicate that assigns properties to events, for example, “The event of Queen Anne’s dying has the property of being in the past,” though whether pastness is a monadic property or only a relation to something else, such as Aristotle or you, is a point of continuing controversy in metaphysics. But in the tense logic just presented, the concept is treated instead as an operator P upon propositions, as in “It has at some time in the past been the case that Queen Anne is dying,” and this difference in treatment by Prior is objectionable to some metaphysicians.
The other major approach to temporal logic does not use a tense logic. Instead, it formalizes temporal reasoning within a first-order logic without modal-like tense operators. One method for developing ideas about temporal logic is the method of temporal arguments, which adds a temporal argument to any predicate involving time in order to indicate how its satisfaction depends on time. Instead of translating the x is resting predicate as Px, where P is a one-argument predicate, it could be translated into temporal predicate logic as the two-argument predicate Rxt, interpreted as saying x is resting at time t. P has been changed to the two-argument predicate R by adding a place for a temporal argument. The time variable t is treated as a new sort of variable requiring new axioms that specify more carefully what can be assumed about the nature of time.
Occasionally the method of temporal arguments uses a special constant symbol, say n, to denote now, the present time. This helps with the translation of common temporal sentences. For example, let the individual constant s denote Socrates, and let Rst be interpreted as “Socrates is resting at t.” The false sentence that Socrates has always been resting would be expressed in this first-order temporal logic as:
∀t(Ltn → Rst)
Here L is the two-argument predicate for numerically less than that mathematicians usually write as <. And we see the usefulness of having the symbol n.
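A minimal sketch of the method of temporal arguments (Python; the facts and the choice of n = 7 are illustrative): the predicate gains a time slot, and the universal quantifier over times becomes iteration.

```python
TIMES = range(10)
n = 7                 # the special constant denoting "now"

resting_facts = {("socrates", t) for t in (2, 3, 4)}   # made-up facts

def R(x, t):          # "x is resting at time t"
    return (x, t) in resting_facts

def L(t1, t2):        # "t1 is numerically less than t2"
    return t1 < t2

# "Socrates has always been resting":  for all t, (Ltn -> Rst)
print(all(R("socrates", t) for t in TIMES if L(t, n)))  # False: not resting at 0
```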
If tense logic is developed using a Kripke semantics of possible worlds, then it is common to alter the accessibility relation between any two possible worlds by relativizing it to a time. The point is to show that some old possibilities are no longer possible. For example, a world in which Hillary Clinton becomes the first female U.S. president in 2016 was possible relative to the actual world of 2015, but not relative to the actual world of 2017. There are other complexities. Within a single world, if we are talking about a domain of people containing, say, Socrates, then we want the domain to vary with time since we want Socrates to exist at some times but not at others. Another complexity is that in any world, what event is simultaneous with what other event should be relativized to a reference frame.
Some temporal logics have a semantics that allows sentences to lack both classical truth-values. The first person to give a clear presentation of the implications of treating declarative sentences as being neither true nor false was the Polish logician Jan Lukasiewicz in 1920. To carry out Aristotle’s suggestion that future contingent sentences do not yet have truth-values, he developed a three-valued symbolic logic, with each grammatical declarative sentence having just one of the three truth-values True, or False, or Indeterminate [T, F, or I]. Contingent sentences about the future, such as, “There will be a sea battle tomorrow,” are assigned an I value in order to indicate the indeterminacy of the future. Truth tables for the connectives of propositional logic are redefined to maintain logical consistency and to maximally preserve our intuitions about truth and falsehood. See (Haack 1974) for more details about this application of three-valued logic.
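Lukasiewicz’s three-valued tables have a compact arithmetical presentation (sketched below in Python): map T, I, F to 1, 1/2, 0; negation is 1 − x, conjunction is min, disjunction is max, and the conditional is min(1, 1 − x + y).

```python
T, I, F = 1.0, 0.5, 0.0   # True, Indeterminate, False

def neg(x):     return 1 - x
def conj(x, y): return min(x, y)
def disj(x, y): return max(x, y)
def cond(x, y): return min(1.0, 1 - x + y)

sea_battle = I   # "There will be a sea battle tomorrow" is indeterminate today

print(disj(sea_battle, neg(sea_battle)))  # 0.5: excluded middle fails for I
print(conj(sea_battle, F))                # 0.0: conjoined with a falsehood, false
print(cond(sea_battle, sea_battle))       # 1.0: "if p then p" remains true
```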
For an introduction to temporal logics and their formal semantics, see (Øhrstrøm and Hasle 1995).
17. Time, Mind, and Experience
The principal philosophical issue about time and mind is to specify how time is represented in the mind; and the principal scientific issue in cognitive neuroscience is to uncover the neurological basis of our sense of time.
Our experience reveals time to us in many ways: (1) We notice some objects changing over time and some other objects persisting unchanged. (2) We detect some events succeeding one another. (3) We notice that some similar events have different durations. (4) We seem to automatically classify events as present, past, or future, and we treat those events differently depending upon how they are classified. For example, we worry more about future pain than past pain.
Neuroscientists and cognitive scientists know that these ways of experiencing time exist, but not why they exist. Humans do not need to consciously learn these skills any more than they need to learn how to be conscious. The skills grow naturally, appearing thanks to a human being’s innate biological nature coupled with the prerequisites of a normal human environment, such as an adequate air supply, warmth, food, and water. A tulip could be given the same prerequisites, but it would never develop anything like our time consciousness. Neuroscientists do not yet understand the details of how our pre-set genetic program produces time consciousness, although there is agreement that the genes themselves are not conscious in any way.
A minority of philosophers, the panpsychists, would disagree with these neurophysiologists and say genes have proto-mental properties and proto-consciousness and even proto-consciousness of time. Critics remark sarcastically that our genes must also have the proto-ability to pay our taxes on time. The philosopher Colin McGinn, who is not a panpsychist, has some sympathies with the panpsychist position. He says genes:
contain information which is such that if we were to know it we would know the solution to the mind-body problem. In a certain sense, then, the genes are the greatest of philosophers, the repositories of valuable pieces of philosophical information. (McGinn 1999, p. 227)
Neither a time cell nor a master clock has been discovered so far in the human body, despite much searching, so many neuroscientists have come to believe there are no such things to be found. Instead, the neurological basis of our time sense probably has to do with coordinated changes in a network of neurons (and glial cells, especially astrocytes) that somehow encodes time information. Our brain cells, the neurons, fire all at once, yet they are organized somehow to produce a single conscious story in perceived, linear time. Although the details are not well understood by neuroscientists, there is continual progress. One obstacle is complexity. The human central nervous system is the most complicated known structure in the universe.
Cognitive neuroscientists want to know the neural mechanisms that account for our awareness of change, for our ability to anticipate the future, for our sense of time’s flow, for our ability to place remembered events into the correct time order (temporal succession), for our understanding of tenses, for our ability to notice and often accurately estimate durations, and for our ability to keep track of durations across many different time scales, such as milliseconds for some events and years for others.
It surely is the case that our body is capable of detecting very different durations even if we are not conscious of doing so. When we notice that the sound came from our left, not right, we do this by unconsciously detecting the very slight extra time it takes the sound to reach our right ear, which is only an extra 0.0005 seconds after reaching our left ear. The unconscious way we detect this difference in time must be very different from the way we detect differences in years. Also, our neurological and psychological “clocks” very probably do not work by our counting ticks and tocks as do the clocks we build in order to measure physical time.
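The 0.0005-second figure is just head width divided by the speed of sound (a sketch with approximate values):

```python
# Approximate interaural time difference for a sound arriving from one side.
head_width_m = 0.17            # rough distance between the two ears
speed_of_sound_m_per_s = 343   # in air at room temperature

print(round(head_width_m / speed_of_sound_m_per_s, 4))  # ~0.0005 seconds
```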
We are consciously aware of time passing by noticing changes either outside or inside our body. For example, we notice a leaf fall from a tree as it acquires a new location. If we close our eyes, we still can encounter time just by imagining a leaf falling. But scientists and philosophers want more details. How is this conscious encounter with time accomplished, and how does it differ from our unconscious awareness of time?
With the notable exception of Husserl, most philosophers say our ability to imagine other times is a necessary ingredient in our having any consciousness at all. Some say our consciousness is a device that stores information about the past in order to predict the future. Although some researchers believe consciousness is a hard problem to understand, others have said, “Consciousness seems easy to me: it’s merely the thoughts we can remember.” We remember old perceptions, and we make use of our ability to imagine other times when we experience a difference between our present perceptions and our present memories of past perceptions. Somehow the difference between the two gets interpreted by us as evidence that the world we are experiencing is changing through time. John Locke said our train of ideas produces our idea that events succeed each other in time, but he offered no details on how this train does the producing. Surely memory is key. Memories need to be organized into the proper temporal order, in analogy to how a deck of cards, each card bearing a different integer, can be sorted into numerical order. There is a neurological basis to the mental process of time-stamping memories so that they are not just a jumble when recalled into consciousness. Dogs successfully time-stamp their memories when they remember where they hid their bone and when they plan for the short-term future by standing at the door to encourage their owner to open it. The human ability to organize memories far surpasses that of any other conscious being. We can decide to do next week what we planned last month because of what happened last year. This is a key part of what makes Homo sapiens sapient.
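As a toy illustration of the card-sorting analogy (merely an illustrative sketch, not a model of any neural mechanism), time-stamped records can be recovered in their temporal order no matter how jumbled their retrieval:

```python
# Toy illustration of time-stamped "memories" being recalled as a jumble
# and then sorted back into their temporal succession.

memories = [
    {"event": "stood at the door", "timestamp": 9},
    {"event": "hid the bone", "timestamp": 3},
    {"event": "heard the owner arrive", "timestamp": 7},
]

# Sorting by the stamp recovers the order in which the events occurred.
for memory in sorted(memories, key=lambda m: m["timestamp"]):
    print(memory["timestamp"], memory["event"])
```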
As emphasized, a major neurological problem is to explain the origin and character of our temporal experiences. How does the brain take the input from all its sense organs and produce true beliefs about the world’s temporal relationships? Philosophers and cognitive scientists continue to investigate this, but so far there is no consensus on either how we experience temporal phenomena or how we are conscious that we do. However, there is a growing consensus that consciousness itself is an emergent property of a central nervous system, and that dualism between mental properties and physical properties is not a fruitful supposition. The vast majority of neuroscientists are physicalists who treat brains as if they are just wet machines, and they believe consciousness does not transcend scientific understanding.
Neuroscientists agree that the brain takes a pro-active role in building a mental scenario of the external 3+1-dimensional world. As one piece of suggestive evidence, notice that if you look at yourself in the mirror and glance at your left eyeball, then glance at your right eyeball, and then glance back to the left, you can never see your own eyes move. Your brain always constructs a continuous story of non-moving eyes. However, a video camera taking pictures of your face easily records your eyeballs’ movements, proving that your brain has taken an active role in “doctoring” the scenario.
Researchers believe that at all times our mind is testing hypotheses regarding what is taking place beyond our brain. The brain continually receives visual, auditory, tactile, and other sensory signals that arrive at different times from a single event, and it must then produce a hypothesis about what the signals might mean. Do those signals mean there probably is a tiger rushing at us? The brain also continuously revises hypotheses and produces new ones in an attempt to have a coherent story about what is out there, what is happening before what, and what is causing what. Being good at unconsciously producing, testing, and revising these hypotheses has survival value.
Psychological time’s rate of passage is a fascinating phenomenon to study. The most obvious feature is that psychological time often gets out of sync with physical time. At the end of our viewing an engrossing television program, we often think, “Where did the time go? It sped by.” When we are hungry in the afternoon and have to wait until the end of the workday before we can have dinner, we think, “Why is everything taking so long?” When we are feeling pain and we look at a clock, the clock seems to be ticking slower than normal.
An interesting feature of the rate of passage of psychological time reveals itself when we compare the experiences of younger people to older people. When we are younger, we lay down richer memories because everything is new. When we are older, the memories we lay down are much less rich because we have “seen it all before.” That is why older people report that a decade goes by so much more quickly than it did when they were younger.
Do things seem to move more slowly when we are terrified? “Yes,” most people would say. “No,” says neuroscientist David Eagleman, “it’s a retrospective trick of memory.” The terrifying event does seem to you to move more slowly when you think about it later, but not at the time it is occurring. Because memories of the terrifying event are “laid down so much more densely,” Eagleman says, it seems to you, upon your remembering, that your terrifying event lasted longer than it really did.
The human being inherited most or perhaps all of its biological clocks from its ancestor species. Although the cerebral cortex is usually considered to be the base for our conscious experience, it is surprising that rats can distinguish a five-second interval from a forty-second interval even with their cerebral cortex removed. So, a rat’s means of sensing time is probably distributed throughout many places in its brain. Perhaps the human being’s time sense is similarly distributed. However, surely the fact that we know that we know about time is specific to our cerebral cortex. A rat does not know that it knows. It has competence without comprehension. A cerebral cortex apparently is required for this comprehension. Very probably no other primate has an appreciation of time as sophisticated as that of a normal human being.
Entomologists still do not know how the biological clock of a cicada enables these insects to hatch after 13 years of living underground, rather than after 12 or 14 years. Progress on this issue might provide helpful clues for understanding the human being’s biological clock.
We humans are very good at detecting the duration of silences. We need this ability to tell the difference between the spoken sentence, “He gave her cat-food,” and “He gave her cat food.” The hyphen is the linguistic tool for indicating that the duration between the spoken words “cat” and “food” is shorter than usual. This is a favorite example of the neuroscientist Dean Buonomano.
Do we have direct experience only of an instantaneous present event, or instead do we have direct experience only of the specious present, a present event that lasts a short stretch of physical time? Informally, the issue is said to be whether the present is thin or thick. Plato, Aristotle, Thomas Reid, and Alexius Meinong believed in a thin present. Shadworth Hodgson, Mary Calkins, and William James believed in a thick present. The latter position is now the one favored among experts in the fields of neuroscience and philosophy of mind.
If it is thick, then how thick? Does the present last longer than the blink of an eye? Among those accepting the notion of a specious present, a good estimate of its duration is approximately eighty milliseconds to three seconds for human beings, although neuroscientists do not yet know why it is not two milliseconds or seven seconds.
Another issue is about overlapping specious presents. We do seem to have a unified stream of consciousness, but how do our individual specious presents overlap to produce this unity?
When you open your eyes, can you see what is happening now? In 1630, René Descartes would have said yes, but nearly all philosophers in the twenty-first century say no. You see the North Star as it was over 300 years ago, not as it is now. Also, light arriving at your eye from an external object contains information about the object’s color, motion, and form. The three kinds of information arrive simultaneously, but your brain takes different amounts of time to process them. Color information is processed more quickly than motion information, which in turn is processed more quickly than form information. Only after the light has taken its time to arrive at your eye, and you have processed all the information, can you construct a correct story that perhaps says, “A white golf ball is flying toward my head.”
So, we all live in the past—in the sense that our belief about what is happening now occurs later than when it really happened according to a clock. Our brain takes about eighty milliseconds or more to reconstruct a story of what is happening based on the information coming in from our different sense organs. Because of its long neck, a giraffe’s specious present might last considerably longer, presumably because nerve signals take longer to travel from its extremities to its brain. However, the delay cannot be too much longer than this, or else the story would be so outdated that the giraffe would risk becoming a predator’s lunch while the information processing is happening. Therefore, evolution has probably fine-tuned the duration of each kind of organism’s specious present.
In the early days of television broadcasting, engineers worried about the problem of keeping audio and video signals synchronized. Then they accidentally discovered that they had about a tenth-of-a-second of “wiggle room.” As long as the signals arrive within this period, viewers’ brains automatically re-synchronize the signals; outside that tenth-of-a-second period, it suddenly looks like a badly dubbed movie. (Eagleman, 2009)
Watch a bouncing basketball. The light from the bounce reaches our eyes before the sound reaches our ears; then the brain builds a story in which the sight and sound of the bounce happen simultaneously. This subjective synchronizing of sight and sound works for the bouncing ball so long as the ball is less than about 100 feet away. Any farther, and we begin to notice that the sound arrives later.
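A rough calculation, using an assumed speed of sound and the tenth-of-a-second window mentioned above, shows why the break-even distance is about 100 feet:

```python
# Why about 100 feet? Beyond that distance, the sound's lag behind the
# (effectively instantaneous) light exceeds the brain's ~0.1-second
# re-synchronization window. The speed of sound here is approximate.

SPEED_OF_SOUND_FPS = 1125.0  # feet per second in air, approximately
RESYNC_WINDOW = 0.1          # seconds of "wiggle room"

for distance_ft in (25, 50, 100, 200):
    lag = distance_ft / SPEED_OF_SOUND_FPS
    verdict = "noticeably late" if lag > RESYNC_WINDOW else "auto-synchronized"
    print(f"{distance_ft:>4} ft: sound lags by {lag:.3f} s -> {verdict}")
```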
Some Eastern philosophies promote living in the present and dimming one’s awareness of the past and the future. Unfortunately, people who “live in the moment” tend to lead more dangerous and shorter lives. The cognitive scientist Lera Boroditsky says a crack addict is the best example of a person who lives in the moment.
Philosophers of time and psychologists who study time are interested in both how a person’s temporal experiences are affected by deficiencies in their imagination and their memory and how different interventions into a healthy person’s brain might affect that person’s temporal experience.
Some of neuroscientist David Eagleman’s experiments have shown clearly that under certain circumstances a person can be deceived into believing event A occurred before event B, when in fact the two occurred in the reverse order according to clock time. For more on these topics, see (Eagleman, 2011).
The time dilation effect in psychology occurs when an event involving an object coming toward you lasts longer in psychological time than an otherwise similar event in which the object is stationary. With repeated events lasting the same amount of clock time, presenting a brighter object makes the event seem to last longer, and the same holds for louder sounds.
Suppose you live otherwise normally within a mine and are temporarily cut off from communicating with the world above. For a long while, using memory alone, you can keep track of how long you have been inside the mine, but eventually you will lose track of the correct clock time. What determines how long that while lasts, and how is it affected by what you are doing? And why are some persons better estimators than others?
Do we directly experience the present? This is controversial, and it is not the same question as whether we are at present having an experience. Those who answer “yes” tend to accept McTaggart’s A-theory of time. But notice how different such direct experience would have to be from our other direct experiences. We directly experience the color green, and we can also directly experience other colors; we directly experience high-pitched notes, and we can also directly experience other notes. Can we say we directly experience the present time and can also directly experience other times? Definitely not. So, the direct experience of the present either is non-existent, or it is a strange sort of direct experience. Nevertheless, we probably do have some mental symbol for nowness in our mind that correlates with our having the concept of the present, but it does not follow from this that we directly experience the present, any more than our having a concept of love implies that we directly experience love. For an argument that we do not experience the present, see chapter 9 of (Callender 2017).
If all organisms were to die, there would be events after those deaths. The stars would continue to shine, but would any of these star events be in the future? This is a philosophically controversial question because advocates of McTaggart’s A-theory will answer “yes,” whereas advocates of McTaggart’s B-theory will answer “no” and add “Whose future?”
The issue of whether time itself is subjective, a mind-dependent phenomenon such as a secondary quality, is explored elsewhere in this article.
According to René Descartes’ dualistic philosophy of mind, the mind is not in space, but it is in time. The current article accepts the more popular philosophy of mind that rejects dualism and claims that our mind is in both space and time due to the functioning of our brain. It takes no position, though, on the controversial issue of whether the process of conscious human understanding is a computation.
Neuroscientists and psychologists have investigated whether they can speed up our minds relative to a duration of physical time. If so, we might become mentally more productive, get more high-quality decision making done per fixed amount of physical time, and learn more per minute. Several avenues have been explored: using cocaine, amphetamines, and other drugs; undergoing extreme experiences such as jumping backwards off a ledge into a net; and trying different forms of meditation. These avenues definitely affect the ease with which pulses of neurotransmitters pass from one neuron to a neighboring neuron, and they thereby affect our psychological time, but so far none of them has led to gains in productivity.
For our final issue about time and mind: do we humans have an a priori awareness of time that can be used to give mathematics a firm foundation? In the early twentieth century, the mathematician and philosopher L.E.J. Brouwer believed so. Many mathematicians and philosophers at that time suspected that mathematics was not as certain as they had hoped, and they worried that contradictions might be uncovered within mathematics. Their suspicions were increased by the discovery of Russell’s Paradox and by the introduction into set theory of the controversial non-constructive axiom of choice. In response, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from an ideal mathematician’s vivid, a priori awareness of time, what in Kantian terminology is called an intuition of inner time. Time, said Kant in his Critique of Pure Reason in 1781, is a structuring principle of all possible experience. As such, time is not objective; it is not a feature of things-in-themselves, but rather is a feature of the phenomenal world.
Brouwer supported Kant’s claim that arithmetic is the pure form of temporal intuition. Brouwer tried to show how to construct higher-level mathematical concepts (for example, the mathematical line) from lower-level temporal intuitions; but unfortunately, he had to accept the consequence that his program required rejecting both Aristotle’s law of excluded middle in logic and some important theorems of mathematics, such as the theorem that every real number has a decimal expansion and the theorem that there is an actual infinity (as opposed to a potential infinity) of points between any two points on the mathematical line. Unwilling to accept these departures from classical mathematics, most other mathematicians and philosophers instead rejected Brouwer’s idea of an intimate connection between mathematics and time.
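To make the disputed logical principle concrete, here is a standard illustration of the kind intuitionists use; the π example is a textbook staple, not a quotation from Brouwer. Classical logic asserts, for every proposition $P$, the law of excluded middle:

\[ P \lor \neg P \]

An intuitionist declines to assert this disjunction when neither disjunct can be constructively verified. Let $P$ be the proposition that somewhere in the decimal expansion of π there occurs a run of one hundred consecutive 7s. No one possesses a method for verifying $P$ or for refuting it, so the intuitionist treats $P \lor \neg P$ here as unproven rather than as a law of logic.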
For interesting video presentations about psychological time, see (Carroll 2012) and (Eagleman 2011). For the role of time in phenomenology, see the article “Phenomenology and Time-Consciousness.” According to the phenomenologist Edmund Husserl, “One cannot discover the least thing about objective time through phenomenological analysis” (Husserl, 1991, p. 6).
Consider the mind of an extraterrestrial. Could an extraterrestrial arrive here on Earth with no concept of time? Probably not. How about arriving with a very different concept of time from ours? Perhaps, but how different? Stephen Hawking’s colleague James Hartle tried to answer this question by speculating that we and the extraterrestrial will at least “share concepts of past, present and future, and the idea of a flow of time.”
Arntzenius, Frank and H. Greaves. 2009. “Time Reversal in Classical Electromagnetism,” The British Journal for the Philosophy of Science, vol. 60 (3), pp. 557-584.
Challenges Feynman’s claim that anti-particles are nothing but particles propagating backwards in time.
Arthur, Richard T. 2014. Leibniz. Polity Press. Cambridge, U.K.
Comprehensive monograph on all things Leibniz, with a detailed examination of his views on time.
Arthur, Richard T. W. 2019. The Reality of Time Flow: Local Becoming in Physics, Springer.
Challenges the claim that the now is subjective in modern physics.
Azzouni, Jody. 2015. “Nominalism, the Nonexistence of Mathematical Objects,” in Mathematics, Substance and Surmise, edited by E. Davis and P.J. Davis, pp. 133-145.
Argues that mathematical objects referred to by mathematical physics do not exist despite Quine’s argument that they do exist. Azzouni also claims that a corporation does not exist.
Barbour, Julian. 1999. The End of Time, Weidenfeld and Nicolson, London, and Oxford University Press, New York.
A popular presentation of Barbour’s theory which implies that if we could see the universe as it is, we should see that it is static. It is static, he says, because his way of quantizing general relativity, namely quantum geometrodynamics with its Wheeler-DeWitt equation, implies a time-independent quantum state for the universe as a whole. Time is emergent and not fundamental. He then offers an exotic explanation of how time emerges and why time seems to us to exist.
Barbour, Julian. 2009. The Nature of Time, arXiv:0903.3489.
An application of Barbour’s ideas of strong emergentism to classical physics.
Baron, Sam. 2018. “Time, Physics, and Philosophy: It’s All Relative,” Philosophy Compass, Volume 13, Issue 1, January.
Reviews the conflict between the special theory of relativity and the dynamic theories of time.
Baron, S. and K. Miller. 2015. “Our Concept of Time,” in Philosophy and Psychology of Time, edited by B. Mölder, V. Arstila, and P. Øhrstrøm. Springer, pp. 29-52.
Explores the issue of whether time is a functionalist concept.
Bunge, Mario. 1968. “Physical Time: The Objective and Relational Theory.” Philosophy of Science. Vol. 35, No. 4. Pages 355-388.
Examines the dispute between relationism and substantivalism, sometimes acerbically.
Butterfield, Jeremy. 1984. “Seeing the Present,” Mind, 93, pp. 161-76.
Defends the B-camp position on the subjectivity of the present; and argues against a global present.
Callender, Craig, and Ralph Edney. 2001. Introducing Time, Totem Books, USA.
A cartoon-style book covering most of the topics in this encyclopedia article in a more elementary way. Each page is two-thirds graphics and one-third text.
Callender, Craig and Carl Hoefer. 2002. “Philosophy of Space-Time Physics” in The Blackwell Guide to the Philosophy of Science, ed. by Peter Machamer and Michael Silberstein, Blackwell Publishers, pp. 173-98.
Discusses whether it is a fact or a convention that in a reference frame the speed of light going one direction is the same as the speed coming back.
Callender, Craig. 2010. “Is Time an Illusion?” Scientific American, June, pp. 58-65.
Explains how the belief that time is fundamental may be an illusion.
Callender, Craig. 2017. What Makes Time Special? Oxford University Press.
A comprehensive monograph on the relationship between the manifest image of time and its scientific image. The book makes a case for how, if information gathering and utilizing systems like us are immersed in an environment with the physical laws that do hold, then we will create the manifest image of time that we do. Not written at an introductory level.
Carnap, Rudolf. 1966. Philosophical Foundations of Physics: An Introduction to the Philosophy of Science. Basic Books, Inc. New York.
Chapter 8 “Time” is devoted to the issue of how to distinguish an accurate clock from an inaccurate one.
Carroll, John W. and Ned Markosian. 2010. An Introduction to Metaphysics. Cambridge University Press.
This introductory, undergraduate metaphysics textbook contains an excellent chapter introducing the metaphysical issues involving time, beginning with the McTaggart controversy.
Carroll, Sean. 2010. From Eternity to Here: The Quest for the Ultimate Theory of Time, Dutton/Penguin Group, New York.
Part Three “Entropy and Time’s Arrow” provides a very clear explanation of the details of the problems involved with time’s arrow. For an interesting answer to the question of what happens in an interaction between our part of the universe and a part in which the arrow of time goes in reverse, see endnote 137 for p. 164.
Carroll, Sean. 2011. “Ten Things Everyone Should Know About Time,” Discover Magazine, Cosmic Variance.
Contains the quotation about how the mind reconstructs its story of what is happening “now.”
Carroll, Sean. 2012. Mysteries of Modern Physics: Time. The Teaching Company, The Great Courses: Chantilly, Virginia.
A series of popular lectures about time by a renowned physicist with an interest in philosophical issues. Emphasizes the arrow of time.
Carroll, Sean. 2016. The Big Picture. Dutton/Penguin Random House. New York.
A physicist surveys the cosmos’ past and future, including the evolution of life.
Carroll, Sean. 2019. Something Deeply Hidden: Quantum Worlds and the Emergence of Spacetime, Dutton/Penguin Random House.
Pages 287-289 explain how time emerges in a quantum universe governed by the Wheeler-DeWitt equation, a timeless version of the Schrödinger equation. The chapter “Breathing in Empty Space” explains why the limits of time (whether it is infinite or finite) depend on the total amount of energy in the universe. His Mindscape podcast of August 13, 2018, “Why Is There Something Rather than Nothing?” discusses this topic in its final twenty minutes. His answer is that this may not be a sensible question to ask.
Carroll, Sean. 2022. The Biggest Ideas in the Universe: Space, Time, and Motion. Dutton/Penguin Random House.
A sophisticated survey of what relativity theory implies about space, time, and motion, with some emphasis on the philosophical issues. Introduces the relevant equations, but is aimed at a general audience and not physicists.
Crowther, Karen. 2019. “When Do We Stop Digging? Conditions on a Fundamental Theory of Physics,” in What is ‘Fundamental’?, edited by Anthony Aguirre, Brendan Foster, and Zeeya Merali, Springer International Publishing.
An exploration of what physicists do mean and should mean when they say a particular theory of physics is final or fundamental rather than more fundamental. She warns, “a theory formally being predictive to all high-energy scales, and thus apparently being the lowest brick in the tower [of theories] (or, at least, one of the bricks at the lowest level of the tower), is no guarantee that it is in fact a fundamental theory. …Yet, it is one constraint on a fundamental theory.” When we arrive at a fundamental theory, “the question shifts from ‘What if there’s something beyond?’ to ‘Why should we think there is something beyond?” That is, the burden of justification is transferred.”
Damasio, Antonio R. 2002. “Remembering When,” Scientific American, vol. 287, no. 3; reprinted in Katzenstein 2006, Scientific American Special Edition: A Matter of Time, pp. 34-41.
A look at the brain structures involved in how our mind organizes our experiences into the proper temporal order. Includes a discussion of Benjamin Libet’s claim to have discovered in the 1970s that the brain events involved in initiating our free choice occur about a third of a second before we are aware of our making the choice. This claim has radical implications for the philosophical issue of free will.
Dainton, Barry. 2010. Time and Space, Second Edition, McGill-Queens University Press: Ithaca.
An easy-to-read, but technically correct, book. This is probably the best single book to read for someone desiring to understand in more depth the issues presented in this encyclopedia article.
Davies, Paul. 1995. About Time: Einstein’s Unfinished Revolution, Simon & Schuster.
An easy-to-read survey of the impact of the theory of relativity and other scientific advances on our understanding of time.
Davies, Paul. 2002. How to Build a Time Machine, Viking Penguin.
A popular exposition of the details behind the possibilities of time travel.
Deutsch, David and Michael Lockwood. 1994. “The Quantum Physics of Time Travel,” Scientific American, pp. 68-74. March.
An investigation of the puzzle of acquiring information for free by traveling in time.
Deutsch, David. 2013. “The Philosophy of Constructor Theory,” Synthese, Volume 190, Issue 18.
Challenges Laplace’s Paradigm that physics should be done by predicting what will happen from initial conditions and laws of motion. http://dx.doi.org/10.1007/s11229-013-0279-z.
Dowden, Bradley. 2009. The Metaphysics of Time: A Dialogue, Rowman & Littlefield Publishers, Inc.
An undergraduate textbook in dialogue form that covers many of the topics discussed in this encyclopedia article. Easy reading for newcomers to the philosophy of time.
Dummett, Michael. 2000. “Is Time a Continuum of Instants?,” Philosophy, Cambridge University Press, pp. 497-515.
A constructivist model of time that challenges the idea that time is composed of durationless instants.
Eagleman, David. 2009. “Brain Time,” in What’s Next? Dispatches on the Future of Science, ed. Max Brockman. Penguin Random House.
A neuroscientist discusses the plasticity of time perception or temporal distortion.
Eagleman, David. 2011. “David Eagleman on CHOICE,” Oct. 4, https://www.youtube.com/watch?v=MkANniH8XZE.
Commentary on research on subjective time.
Earman, John. 1972. “Implications of Causal Propagation Outside the Null-Cone,” Australasian Journal of Philosophy, 50, pp. 222-37.
Describes his rocket paradox that challenges time travel to the past.
Einstein, Albert. 1982. “Autobiographical Notes,” in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist, vol. 1. LaSalle, IL: Open Court Publishing Company.
Describes his early confusion between the structure of the real number line and the structure of time itself.
Fisher, A. R. J. 2015. “David Lewis, Donald C. Williams, and the History of Metaphysics in the Twentieth Century.” Journal of the American Philosophical Association, volume 1, issue 1, Spring.
Discusses the disagreements between Lewis and Williams, who both are four-dimensionalists, about the nature of time travel.
Gödel, Kurt. 1959. “A Remark about the Relationship between Relativity Theory and Idealistic Philosophy,” in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist, Harper & Row, New York.
Discussion of solutions to Einstein’s equations that allow closed causal chains, that is, traveling to your past.
Gott, J. Richard. 2002. Time Travel in Einstein’s Universe: The Physical Possibilities of Travel Through Time.
Presents an original theory of the origin of the universe involving backward causation and time travel.
Grant, Andrew. 2015. “Time’s Arrow,” Science News, July 25, pp. 15-18.
Popular description of why our early universe was so orderly even though nature should always have preferred the disorderly.
Greene, Brian. 2011. The Hidden Reality: Parallel Universes and the Deep Laws of the Universe, Vintage Books, New York.
Describes nine versions of the Multiverse Theory, including the Ultimate multiverse theory described by the philosopher Robert Nozick.
Grey, W. 1999. “Troubles with Time Travel,” Philosophy 74: pp. 55-70.
Examines arguments against time travel.
Grünbaum, Adolf. 1950-51. “Relativity and the Atomicity of Becoming,” Review of Metaphysics, pp. 143-186.
An attack on the notion of time’s flow, and a defense of the treatment of time and space as being continua. Difficult reading.
Grünbaum, Adolf. 1971. “The Meaning of Time,” in Basic Issues in the Philosophy of Time, Eugene Freeman and Wilfrid Sellars, eds. LaSalle, pp. 195-228.
An analysis of the meaning of the term time in both the manifest image and scientific image, and a defense of the B-theory of time. Difficult reading.
Guth, Alan. 2014. “Infinite Phase Space and the Two-Headed Arrow of Time,” FQXi conference in Vieques, Puerto Rico. https://www.youtube.com/watch?v=AmamlnbDX9I. 2014.
Guth argues that an arrow of time could evolve naturally even though it had no special initial conditions on entropy, provided the universe has an infinite available phase space that the universe could spread out into. If so, its maximum possible entropy is infinite, and any other state in which the universe begins will have relatively low entropy.
Haack, Susan. 1974. Deviant Logic, Cambridge University Press.
Chapter 4 contains a clear account of Aristotle’s argument (in section 14d of the present article) for truth-value gaps, and its development in Lukasiewicz’s three-valued logic.
Hawking, Stephen. 1992. “The Chronology Protection Hypothesis,” Physical Review D 46, p. 603.
Argues that nature somehow conspires to block backward time travel.
Hawking, Stephen. 1996. A Brief History of Time, Updated and Expanded Tenth Anniversary Edition, Bantam Books.
A leading theoretical physicist and cosmologist provides introductory chapters on space and time, black holes, the origin and fate of the universe, the arrow of time, and time travel. Hawking suggests that perhaps our universe originally had four space dimensions and no time dimension, and that time came into existence when one of the space dimensions evolved into a time dimension. He called this special space dimension “imaginary time.”
Hawking, Stephen. 2018. Brief Answers to the Big Questions. Bantam Books, New York.
A popular survey of science’s impact upon big questions such as “How did it all begin?”, “What is inside a black hole?”, “Is time travel possible?”, and “Will artificial intelligence outsmart us?”
Horwich, Paul. 1975. “On Some Alleged Paradoxes of Time Travel,” Journal of Philosophy, 72: pp.432-44.
Examines some of the famous arguments against past time travel.
Horwich, Paul. 1987. Asymmetries in Time, The MIT Press.
A monograph that relates the central problems of time to other problems in metaphysics, philosophy of science, philosophy of language and philosophy of action. Horwich argues that time itself has no arrow.
Hossenfelder, Sabine. 2022. Existential Physics: A Scientist’s Guide to Life’s Biggest Questions, Viking/Penguin Random House LLC.
A theoretical physicist who specializes in the foundations of physics examines the debate between Leibniz and Newton on relational vs. absolute (substantival) time. Her Chapter Two on theories about the beginning and end of the universe is especially deep, revealing, and easy to understand.
Huggett, Nick. 1999. Space from Zeno to Einstein, MIT Press.
Clear discussion of the debate between Leibniz and Newton on relational vs. absolute (substantival) time.
Husserl, Edmund. 1991. On the Phenomenology of the Consciousness of Internal Time. Translated by J. B. Brough. Originally published 1893-1917. Dordrecht: Kluwer Academic Publishers.
The father of phenomenology discusses internal time consciousness.
Katzenstein, Larry. 2006. ed. Scientific American Special Edition: A Matter of Time, vol. 16, no. 1.
A collection of Scientific American articles about time.
Kirk, G.S. and Raven, J.E. 1957. The Presocratic Philosophers. New York: Cambridge University Press.
Krauss, Lawrence M. and Glenn D. Starkman, 2002. “The Fate of Life in the Universe,” Scientific American Special Edition: The Once and Future Cosmos, Dec. pp. 50-57.
Discusses the future of intelligent life and how it might adapt to and survive the expansion of the universe.
Krauss, Lawrence M. 2012. A Universe from Nothing. Atria Paperback, New York.
Discusses on p. 170 why we live in a universe with time rather than with no time. The issue is pursued further in the afterward to the paperback edition that is not included within the hardback edition. Krauss’ position on why there is something rather than nothing was challenged by the philosopher David Albert in his March 23, 2012 review of Krauss’ hardback book in The New York Times newspaper.
Kretzmann, Norman. 1966. “Omniscience and Immutability,” The Journal of Philosophy, July, pp. 409-421.
Raises the question: If God knows what time it is, does this demonstrate that God is not immutable?
Lasky, Ronald C. 2006. “Time and the Twin Paradox,” in Katzenstein, pp. 21-23.
A short analysis of the twin paradox, with helpful graphs showing how each twin would view his or her own clock plus the other twin’s clock.
Le Poidevin, Robin and Murray MacBeath, 1993. The Philosophy of Time, Oxford University Press.
A collection of twelve influential articles on the passage of time, subjective facts, the reality of the future, the unreality of time, time without change, causal theories of time, time travel, causation, empty time, topology, possible worlds, tense and modality, direction and possibility, and thought experiments about time. Difficult reading for undergraduates.
Le Poidevin, Robin. 2003. Travels in Four Dimensions: The Enigmas of Space and Time, Oxford University Press.
A philosophical introduction to conceptual questions involving space and time. Suitable for use as an undergraduate textbook without presupposing any other course in philosophy. There is a de-emphasis on teaching the scientific theories, and an emphasis on elementary introductions to the relationship of time to change, the implications that different structures for time have for our understanding of causation, difficulties with Zeno’s Paradoxes, whether time passes, the nature of the present, and why time has an arrow.
Lewis, David K. 1976. “The Paradoxes of Time Travel,” American Philosophical Quarterly, 13, pp. 145-52.
A classic argument against changing the past. Lewis assumes the B-theory of time.
Lockwood, Michael. 2005. The Labyrinth of Time: Introducing the Universe, Oxford University Press.
A philosopher of physics presents the implications of contemporary physics for our understanding of time. Chapter 15, “Schrödinger’s Time-Traveler,” presents the Oxford physicist David Deutsch’s quantum analysis of time travel.
Lowe, E. J. 1998. The Possibility of Metaphysics: Substance, Identity and Time, Oxford University Press.
This British metaphysician defends the A-theory’s tensed view of time in chapter 4, based on an ontology of substances rather than events.
Mack, Katie. 2020. The End of Everything (Astrophysically Speaking). Scribner, New York.
Exploration of alternative ways the universe might end.
Markosian, Ned. 2003. “A Defense of Presentism,” in Zimmerman, Dean (ed.), Oxford Studies in Metaphysics, Vol. 1, Oxford University Press.
Maudlin, Tim. 1988. “The Essence of Space-Time.” Proceedings of the Biennial Meeting of the Philosophy of Science Association, Volume Two: Symposia and Invited Papers (1988), pp. 82-91.
Maudlin discusses the hole argument, manifold substantivalism and metrical essentialism.
Maudlin, Tim. 2002. “Remarks on the Passing of Time,” Proceedings of the Aristotelian Society, New Series, Vol. 102, pp. 259-274. Oxford University Press. https://www.jstor.org/stable/4545373.
Defends eternalism, the block universe, and the passage of time.
Maudlin, Tim. 2007. The Metaphysics Within Physics, Oxford University Press.
Chapter 4, “On the Passing of Time,” defends the dynamic theory of time’s flow, and he argues that the passage of time is objective.
Maudlin, Tim. 2012. Philosophy of Physics: Space and Time, Princeton University Press.
An advanced introduction to the conceptual foundations of spacetime theory.
McCall, Storrs. 1966. “Temporal Flux,” American Philosophical Quarterly, October.
An analysis of the block universe, the flow of time, and the difference between past and future.
McGinn, Colin. 1999. The Mysterious Flame: Conscious Minds in a Material World. Basic Books.
Claims that the mind-body problem always will be a mystery for your mind but not for your genes.
McTaggart, J. M. E. 1927. The Nature of Existence, Cambridge University Press.
Chapter 33 restates more clearly the arguments that McTaggart presented in 1908 for his A series and B series and how they should be understood to show that time is unreal. Difficult reading. The argument for the inconsistency that a single event has only one of the properties of being past, present, or future, but that any event also has all three of these properties is called “McTaggart’s Paradox.” The chapter is renamed “The Unreality of Time,” and is reprinted on pp. 23-59 of (Le Poidevin and MacBeath 1993).
Mellor, D. H. 1998. Real Time II, International Library of Philosophy.
This monograph presents a subjective theory of tenses. Mellor argues that the truth conditions of any tensed sentence can be explained without tensed facts.
Merali, Zeeya. 2013. “Theoretical Physics: The Origins of Space and Time,” Nature, 28 August, vol. 500, pp. 516-519.
Describes six theories that compete for providing an explanation of the basic substratum from which space and time emerge.
Miller, Kristie. 2013. “Presentism, Eternalism, and the Growing Block,” in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., pp. 345-364.
Compares the pros and cons of competing ontologies of time.
Morris, Michael S., Kip S. Thorne and Ulvi Yurtsever. 1988. “Wormholes, Time Machines, and the Weak Energy Condition,” Physical Review Letters, vol. 61, no. 13, 26 September.
The first description of how to build a time machine using a wormhole.
Moskowitz, Clara. 2021. “In Every Bit of Nothing There is Something,” Scientific American, February.
Describes how the Heisenberg Uncertainty Principle requires there to be continual creation and annihilation of virtual particles. This process is likely to be the cause of dark energy and the accelerating expansion of space.
Mozersky, M. Joshua. 2013. “The B-Theory in the Twentieth Century,” in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., pp. 167-182.
A detailed evaluation and defense of the B-Theory.
Muller, Richard A. 2016a. NOW: The Physics of Time. W. W. Norton & Company, New York.
An informal presentation of the nature of time by an experimental physicist at the University of California, Berkeley. Chapter 15 argues that the correlation between the arrow of time and the increase of entropy is not a causal connection. Chapter 16 discusses the competing arrows of time. Muller favors space expansion as the cause of time’s arrow, with entropy not being involved. And he recommends a big bang theory in which both space and time expand, not simply space. Because space and time are so intimately linked, he says, the expansion of space is propelling time forward, and this explains the “flow” of time. However, “the new nows [are] created at the end of time, rather than uniformly throughout time.” (p. 8)
Muller, Richard. 2016b. “Now and the Flow of Time,” arXiv, https://arxiv.org/pdf/1606.07975.pdf.
Argues that the flow of time consists of the continual creation of new moments, new nows, that accompany the creation of new space.
Nadis, Steve. 2013. “Starting Point,” Discover, September, pp. 36-41.
Non-technical discussion of the argument by cosmologist Alexander Vilenkin that the past of the multiverse must be finite (there was a first bubble) but its future must be infinite (always more bubbles).
Norton, John D. 2010. “Time Really Passes,” Humana.Mente: Journal of Philosophical Studies, vol. 13.
Argues that, “We don’t find passage in our present theories and we would like to preserve the vanity that our physical theories of time have captured all the important facts of time. So we protect our vanity by the stratagem of dismissing passage as an illusion.”
Novikov, Igor. 1998. The River of Time, Cambridge University Press.
Chapter 14 gives a very clear and elementary description of how to build a time machine using a wormhole.
Oaklander, L. Nathan. 2008. The Ontology of Time. Routledge.
An authoritative collection of articles on all the major issues. Written for an audience of professional researchers.
Øhrstrøm, P. and P. F. V. Hasle. 1995. Temporal Logic: from Ancient Ideas to Artificial Intelligence. Kluwer Academic Publishers.
An elementary introduction to the logic of temporal reasoning.
Penrose, Roger. 2004. The Road to Reality: A Complete Guide to the Laws of the Universe. Alfred A. Knopf.
A mathematical physicist discusses cosmology, general relativity, and the second law of thermodynamics, but not at an introductory level.
Perry, John. 1979. “The Problem of the Essential Indexical,” Noûs,13 (1), pp. 3-21.
Argues that indexicals are essential to what we want to say in natural language; they cannot all be explicated by, reduced to, or eliminated in favor of B-theory discourse.
Pinker, Steven. 2007. The Stuff of Thought: Language as a Window into Human Nature, Penguin Group.
Chapter 4 discusses how the conceptions of space and time are expressed in language in a way very different from that described by either Kant or Newton. Page 189 says that in only half the world’s languages is the ordering of events expressed in the form of grammatical tenses. Chinese has no tenses, in the sense of verb conjugations, but of course, it expresses all sorts of concepts about time in other ways.
Plato. 1961. Parmenides, trans. F. Macdonald Cornford, in The Collected Dialogues of Plato, ed. E. Hamilton and H. Cairns. Princeton, NJ: Princeton University Press.
Plato discusses time.
Pöppel, Ernst. 1988. Mindworks: Time and Conscious Experience. San Diego: Harcourt Brace Jovanovich.
A neuroscientist explores our experience of time.
Price, Huw. 1996. Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time. Oxford University Press.
Price believes the future can affect the past, the notion of direction of the flow cannot be established as an objective notion, and philosophers of physics need to adopt an Archimedean point of view outside of time in order to discuss time in an unbiased manner.
Prior, A.N. 1959. “Thank Goodness That’s Over,” Philosophy, vol. 34, pp. 12-17.
Argues that a tenseless or B-theory of time fails to account for our feeling of relief that painful past events are in the past rather than in the present.
Prior, A.N. 1967. Past, Present and Future, Oxford University Press.
Pioneering work in temporal logic, the symbolic logic of time, that permits propositions to be true at one time and false at another.
Prior, A.N. 1969. “Critical Notices: Richard Gale, The Language of Time,” Mind, 78, no. 311, pp. 453-460.
Contains his attack on the attempt to define time in terms of causation.
Prior, A.N. 1970. “The Notion of the Present,” Studium Generale, volume 23, pp. 245-8.
A brief defense of presentism, the view that the past and the future are not real.
Putnam, Hilary. 1967. “Time and Physical Geometry,” The Journal of Philosophy, 64, pp. 240-246.
Comments on whether Aristotle is a presentist. Putnam believes that the manifest image of time is refuted by relativity theory.
Quine, W.V.O. 1981. Theories and Things. Cambridge, MA: Harvard University Press.
Quine argues for physicalism in metaphysics and naturalism in epistemology.
Rovelli, Carlo. 2017. Reality is Not What It Seems: The Journey to Quantum Gravity. Riverhead Books, New York.
An informal presentation of time in the theory of loop quantum gravity. Loop theory focuses on gravity; string theory is a theory of gravity plus all the forces and matter.
Rovelli, Carlo. 2018. The Order of Time. Riverhead Books, New York.
An informal discussion of the nature of time by a theoretical physicist. The book was originally published in Italian in 2017. Page 70 contains the graph of the absolute elsewhere that was the model for the one in this article.
Rovelli, Carlo. 2018. “Episode 2: Carlo Rovelli on Quantum Mechanics, Spacetime, and Reality” in Sean Carroll’s Mindscape Podcast at www.youtube.com/watch?v=3ZoeZ4Ozhb8. July 10.
Rovelli and Carroll discuss loop quantum gravity vs. string theory, and whether time is fundamental or emergent.
Russell, Bertrand. 1915. “On the Experience of Time,” Monist, 25, pp. 212-233.
The classical tenseless theory.
Russell, Bertrand. 1929. Our Knowledge of the External World. W. W. Norton and Co., New York, pp. 123-128.
Russell develops his formal theory of time that presupposes the relational theory of time.
Saunders, Simon. 2002. “How Relativity Contradicts Presentism,” in Time, Reality & Experience edited by Craig Callender, Cambridge University Press, pp. 277-292.
Reviews the arguments for and against the claim that, since the present in the theory of relativity is relative to reference frame, presentism must be incorrect.
Savitt, Steven F. 1995. Time’s Arrows Today: Recent Physical and Philosophical Work on the Direction of Time. Cambridge University Press.
A survey of research in this area, presupposing sophisticated knowledge of mathematics and physics.
Savitt, Steven F. “Being and Becoming in Modern Physics,” in E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
In surveying being and becoming, it suggests how the presentist and growing-past ontologies might respond to criticisms that appeal to relativity theory.
Sciama, Dennis. 1986. “Time ‘Paradoxes’ in Relativity,” in The Nature of Time edited by Raymond Flood and Michael Lockwood, Basil Blackwell, pp. 6-21.
A clear account of the twin paradox.
Shoemaker, Sydney. 1969. “Time without Change,” Journal of Philosophy, 66, pp. 363-381.
A thought experiment designed to show us circumstances in which the existence of changeless periods in the universe could be detected.
Sider, Ted. 2000. “The Stage View and Temporary Intrinsics,” The Philosophical Review, 106 (2), pp. 197-231.
Examines the problem of temporary intrinsics and the pros and cons of four-dimensionalism.
Sider, Ted. 2001. Four-Dimensionalism: An Ontology of Persistence. Oxford University Press, New York.
Defends the ontological primacy of four-dimensional events over three-dimensional objects. He freely adopts causation as a means of explaining how a sequence of temporal parts composes a single perduring object. This feature of the causal theory of time originated with Hans Reichenbach.
Sklar, Lawrence. 1976. Space, Time, and Spacetime, University of California Press.
Chapter III, Section E discusses general relativity and the problem of substantival spacetime, where Sklar argues that Einstein’s theory does not support Mach’s views against Newton’s interpretations of his bucket experiment; that is, Mach’s argument against substantivalism fails.
Slater, Hartley. 2012. “Logic is Not Mathematical,” Polish Journal of Philosophy, Spring, pp. 69-86.
Discusses, among other things, why modern symbolic logic fails to give a proper treatment of indexicality.
Smith, Quentin. 1994. “Problems with the New Tenseless Theories of Time,” pp. 38-56 in Oaklander, L. Nathan and Smith, Quentin (eds.), The New Theory of Time, New Haven: Yale University Press.
Challenges the new B-theory of time promoted by Mellor and others.
Smolin, Lee. 2013. Time Reborn. Houghton, Mifflin, Harcourt Publishing Company, New York.
An extended argument by a leading theoretical physicist for why time is real. Smolin is a presentist. He believes the general theory of relativity is mistaken about the relativity of simultaneity; he believes every black hole is the seed of a new universe; and he believes nothing exists outside of time.
Sorabji, Richard. 1988. Matter, Space, & Motion: Theories in Antiquity and Their Sequel. Cornell University Press.
Chapter 10 discusses ancient and contemporary accounts of circular time.
Steinhardt, Paul J. 2011. “The Inflation Debate: Is the theory at the Heart of Modern Cosmology Deeply Flawed?” Scientific American, April, pp. 36-43.
Argues that the big bang theory with inflation is incorrect and that we need a cyclic cosmology with an eternal series of big bangs and big crunches but with no inflation. The inflation theory of quantum cosmology implies the primeval creation of a very large universe in a very short time.
Tallant, Jonathan. 2013. “Time,” Analysis, Vol. 73, pp. 369-379.
Examines these issues: How do presentists ground true propositions about the past? How does time pass? How do we experience time’s passing?
Tegmark, Max. 2017. “Max Tegmark and the Nature of Time,” Closer to Truth, https://www.youtube.com/watch?v=rXJBbreLspA, July 10.
Speculates on the multiverse and why branching time is needed for a theory of quantum gravity.
Thorne, Kip. 2014. The Science of INTERSTELLAR. W. W. Norton & Company, New York, London.
This specialist on time travel describes scientific implications about time machines, black holes, and the big bang.
Unruh, William. 1999. “Is Time Quantized? In Other Words, Is There a Fundamental Unit of Time That Could Not Be Divided into a Briefer Unit?” Scientific American, October 21. https://www.scientificamerican.com/article/is-time-quantized-in-othe/
Discusses whether time has the same structure as a mathematical continuum.
Van Fraassen, Bas C. 1985. An Introduction to the Philosophy of Time and Space, Columbia University Press.
An advanced undergraduate textbook by an important philosopher of science.
Van Inwagen, Peter. 2015. Metaphysics, Fourth Edition. Westview Press.
An introduction to metaphysics by a distinguished proponent of the A-theory of time.
Veneziano, Gabriele. 2004. “The Myth of the Beginning of Time,” Scientific American, May, pp. 54-65; reprinted in Katzenstein 2006, pp. 72-81.
An account of string theory’s impact on our understanding of time’s origin. Veneziano hypothesizes that our big bang was not the origin of time but simply the outcome of a preexisting state.
Wallace, David. 2021. Philosophy of Physics: A Very Short Introduction. Oxford University Press.
An excellent introduction to the philosophical issues within physics and how different philosophers approach them.
Wasserman, Ryan. 2018. Paradoxes of Time Travel, Oxford University Press.
A detailed review of much of the philosophical literature about time travel. The book contains many simple, helpful diagrams.
Whitehead, A. N. 1938. Modes of Thought. Cambridge University Press.
Here Whitehead describes his “process philosophy” that emphasizes the philosophy of becoming rather than of being, for instance, traveling the road rather than the road traveled.
Whitrow, G. J. 1980. The Natural Philosophy of Time, Second Edition, Clarendon Press.
A broad survey of the topic of time and its role in physics, biology, and psychology. Pitched at a higher level than the Davies books.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
What Else Science Requires of Time (That Philosophers Should Know)
The term theory has many senses, even in physics. In the main article “Time” and in these supplements, it is used in a special, technical sense: not in the sense of an explanation, as in the remark, “My theory is that the mouse stole the cheese,” nor in the sense of a prediction, as in the remark, “My theory is that the mouse will steal the cheese.” The general theory of relativity is an example of the intended sense. The key feature is that the theory contains laws that are not vague. Philosophical theories tend to be vague; think of the philosophical theories of mind, meaning, history, free will, and so forth. They are often mere slogans or a few sentences that sketch an idea intended to resolve something puzzling, even if the sketch fills an entire monograph.
Ideally the confirmed theories of physics do three things: explain what we already know, predict what we don’t, and help us understand what we can. Theories themselves do not do the explaining; we humans use them in order to explain. However, the idiom is commonly used.
Whether to add a fourth thing—that the fundamental theories are true or at least approximately true—has caused considerable controversy among philosophers of science. The Harvard philosopher Hilary Putnam is noted for arguing that the success of precise theories in physics would be miraculous if they were not at least approximately true.
The field of physics contains many philosophical presuppositions: that nature is understandable, that nature is lawful, that those laws are best represented in the language of mathematics, that the laws tell us how nature changes from time to time, that the fundamental laws do not change with time, that there is only one correct fundamental theory of everything physical, that a scientific law is not really a law if it holds only when a supernatural being decides not to intervene and allow a miracle to be performed, and that we are not brains in a vat nor characters in someone’s computer game. But these philosophical presuppositions are not held dogmatically. Ideally, they would be rejected if scientists were to find new evidence that they should be changed.
Here is the opinion of the cosmologist Stephen Hawking on the nature of scientific laws:
I believe that the discovery of these laws has been humankind’s greatest achievement…. The laws of nature are a description of how things actually work in the past, present and future…. But what’s really important is that these physical laws, as well as being unchangeable, are universal [so they apply to everything everywhere all the time]. (Brief Answers to the Big Questions, 2018).
We humans are lucky that we happen to live in a universe that is so explainable, predictable and understandable, and that is governed by so few laws. The philosophical position called “scientific realism” implies that entities we do not directly observe but only infer theoretically from the laws (such as spacetime) really do exist. Scientific realism is controversial among philosophers, despite its popularity among physicists.
A popular version of scientific realism that accounts for the fact that scientific theories eventually are falsified and need to be revised but not totally rejected is called “structural scientific realism.” For example, much of the structure of early 20th century atomic theory is retained even though that theory was replaced by a more sophisticated version of atomic theory later in the 20th century and an even more sophisticated version in the 21st century. Atoms are not what they used to be.
Most importantly for the “Time” article, the theories of physics help us understand the nature of time. They do this primarily by their laws. Much has been said in the literature of the philosophy of science about what a scientific law is. The philosopher David Lewis claimed that a scientific law is whatever provides a lot of information in a compact and simple expression. This is a justification for saying a law must be a general claim. The claim that Mars is farther from the Sun than is the Earth is true, but it does not qualify as being a law because it is not general enough. The Second Law of Thermodynamics is general enough.
It is because theories in science are designed for producing interesting explanations, not for encompassing all the specific facts, that there is no scientific law that specifies your age and phone number. Some theories are expressed fairly precisely, and some are expressed less precisely. All other things being equal, the more precise the better. If they have important simplifying assumptions but still give helpful explanations of interesting phenomena, then they are often said to be models. Very simple models are said to be toy models. However, physicists do not always use the terms this way. Very often they use the terms “theory” and “model” interchangeably. For example, the Standard Model of Particle Physics is a theory in the sense used in this section, but for continuity with historical usage of the term physicists have never bothered to replace the word “model” with “theory.”
In physics, the fundamental laws in the theories are equations. The equations of the laws are meant to be solved for different environments, with the environment providing different initial values for the variables within the equations. Solutions to the equations are used to provide predictions about what will happen. For example, Karl Schwarzschild found the first exact solution to Einstein’s equations of general relativity. The environment (the set of initial conditions) he chose was a large sphere of gas in an otherwise empty universe, and the solution was what is now called a black hole. At the time, Einstein said he believed this solution was not predicting the existence of anything that is physically real, though now we know Einstein was mistaken. Roger Penrose won a Nobel Prize for proving that under a variety of normal conditions and their perturbations in our spacetime, the general theory of relativity implies that there will be black holes containing singularities within an event horizon that can be passed through.
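To make the scale of such a solution concrete, here is a minimal sketch in Python, using standard textbook values for the constants (the function name and values are illustrative, not part of the theory's formalism). It computes the Schwarzschild radius r_s = 2GM/c², the radius of the event horizon of a non-rotating black hole:

```python
G = 6.674e-11      # gravitational constant, in m^3 kg^-1 s^-2
c = 2.998e8        # speed of light in a vacuum, in m/s
M_sun = 1.989e30   # mass of the Sun, in kg

def schwarzschild_radius(mass_kg):
    """Radius of the event horizon of a non-rotating mass, in meters."""
    return 2 * G * mass_kg / c**2

print(schwarzschild_radius(M_sun))  # about 2.95e3 meters, roughly 3 kilometers
```

So if the Sun's mass were compressed within a sphere of radius of about three kilometers, nothing that then crossed that radius, not even light, could escape.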
According to a great many physicists, predictions made by using the theories of physics should be as accurate as possible and not merely precise. In addition, most researchers say a theory ideally should tell us how the system being studied would behave if certain conditions were to be changed in a specified way, for example, if the density of water were doubled or more moons were orbiting the planet. Knowing how the system would behave under different conditions helps us understand the causal structure of the system.
Physicists want their theories to help make accurate and precise predictions, but when the predictions in a test are not accurate and precise, the first thought is that perhaps the test of the prediction was sloppy. If the physicists become satisfied that the test was well run, then their thoughts turn to whether the mismatch between theory and experiment might be a sign that there exists some as yet unknown particle or force at work. That is why physicists love anomalies.
Theories of physics are, among other things, a set of laws and a set of ways to link their statements to the real, physical world. A theory might link the variable “t” to time as measured with a standard clock, and link the constant “M” to the known mass of the Earth. In general, the mathematics in mathematical physics is used to create mathematical representations of real entities and their states and behaviors. That is what makes physics an empirical science, unlike pure mathematics.
Do the laws of physics actually govern us? In Medieval Christian theology, the laws of nature were considered to be God’s commands, but today saying nature “obeys” scientific laws or that nature is “governed” by laws is considered by scientists to be a harmless metaphor. Scientific laws are called “laws” because they constrain what can happen; they imply this will happen and that will not. It was Pierre Laplace who first declared that fundamental scientific laws are hard and fast rules with no exceptions.
Philosophers’ positions on laws divide into two camps, Humean and anti-Humean. Anti-Humeans consider scientific laws to bring nature forward into existence, as if the laws were causal agents. Some anti-Humeans side with Aristotle that whatever happens is because parts of the world have essences and natures, and the laws are describing these essences and natures. This position is commonly accepted in the manifest image. Humeans, on the other hand, consider scientific laws simply to be patterns of nature that very probably will hold in the future. The patterns summarize the behavior of nature. The patterns do not “lay down the law for what must be.” In response to the question of why these patterns and not other patterns, some Humeans say they are patterns described with the most useful concepts for creatures with brains like ours (and other patterns might be more useful for extraterrestrials). More physicists are Humean than anti-Humean. More philosophers are anti-Humean than Humean.
In our fundamental theories of physics, the standard philosophical presupposition is that a state of a physical system describes what there is at some time, and a law of the theory—an “evolution law” or “dynamical law”—describes how the system evolves from a state at one time into a state at another time. All evolution laws in our fundamental theories are differential equations.
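For a classical illustration of the form such laws take (Newtonian mechanics is not itself one of our fundamental theories), Newton’s second law is an evolution law: given the state (x, v) of a particle at one time, the coupled differential equations dx/dt = v and dv/dt = F(x)/m determine the state at every later time.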
All fundamental laws of relativity theory are time-reversible. Time-reversibility implies the fundamental laws do not notice the future direction from the past direction. The second law of thermodynamics does notice this; it says entropy tends to increase toward the future; so the theory of thermodynamics is not time-reversible (but it is also not a fundamental theory). And time-reversibility fails for quantum measurements (for a single universe).
Time-translation invariance is a meta-law that implies the laws of physics we have now are the same laws that held in the past and will hold in the future, and it implies that all instants are equivalent. This is not implying that if you bought an ice cream cone yesterday, you will buy one tomorrow. Unfortunately, there are difficulties with time-translation invariance. For example, a translation in time to a first moment would be to a special moment with no earlier moment, so there is at least one exception to the claim that all moments are indistinguishable. A deeper question is whether any of the laws we have now might change in the future. The default answer is “no,” but this is just an educated guess. And any evidence that a fundamental law can fail will be treated by some physicists as evidence that it was never a law to begin with, while it will be treated by others as proof that time-translation invariance fails. Hopefully a future consensus will be reached one way or the other.
Epistemologically, the laws of physics are hypotheses that are helpful to hold and that have not been refuted. However, some laws are believed less strongly than others, and so are more likely to be changed than others if future observations indicate a change is needed. The laws held most strongly in this sense are the Second Law of thermodynamics and the laws of general relativity and quantum mechanics.
Physical constants are parameters in a physical theory that cannot be explained by that theory. The laws of our fundamental theories contain many constants such as the fine-structure constant, the value for the speed of light in a vacuum, Planck’s constant, and the rest mass of a proton. For some of these constants (a.k.a. parameters), the Standard Model of Particle Physics indicates that we should be able to compute the value exactly, but the practical obstacles to solving the relevant equations have so far proved insurmountable, so we have had to make do with a good measurement. That is, we measure the constant carefully and precisely, and then select this measurement outcome as a best, specific value for the constant to be placed into the theories containing the constant. A virtue of a theory is to not have too many such constants. If there were too many, then the theory could never be disproved by data because the constants always could be adjusted to account for any new data, and so the theory would be pseudoscientific. Unfortunately, the constants in quantum field theory look remarkably arbitrary.
Regarding the divide between science and pseudoscience, the leading answer is that:
what is really essential in order for a theory to be scientific is that some future information, such as observations or measurements, could plausibly cause a reasonable person to become either more or less confident of its validity. This is similar to Popper’s criterion of falsifiability, while being less restrictive and more flexible (Dan Hooper).
a. The Core Theory
Some physical theories are fundamental, and some are not. Fundamental theories are foundational in the sense that their laws cannot be derived from the laws of other physical theories even in principle. For example, the second law of thermodynamics is not fundamental, nor are the laws of plate tectonics in geophysics despite their being critically important to their respective sciences. The following two theories are fundamental: (i) the general theory of relativity, and (ii) quantum mechanics. Their amalgamation is what Nobel Prize winner Frank Wilczek called the Core Theory, the theory of almost everything physical. It is a version of quantum field theory.
Nearly all scientists believe this Core Theory holds not just in our solar system, but all across the universe, and it held yesterday and will hold tomorrow. Wilczek claimed:
[T]he Core has such a proven record of success over an enormous range of applications that I can’t imagine people will ever want to junk it. I’ll go further: I think the Core provides a complete foundation for biology, chemistry, and stellar astrophysics that will never require modification. (Well, “never” is a long time. Let’s say for a few billion years.)
This implies one could think of biology as applied quantum theory.
The Core Theory does not include the big bang theory, which is the standard model of cosmology. The Core Theory does not use the terms time’s arrow or now. The concept of time in the Core Theory is primitive or “brute.” It is not definable, but rather it is used to define other concepts such as length.
It is believed by most physicists that the Core Theory can be used in principle to adequately explain the behavior of a leaf, a galaxy, and a brain. The hedge phrase “in principle” is important. One cannot replace it with “in practice” or “practically.” Practically there are many limitations on the use of the Core Theory. Here are some of the limitations. Leaves are too complicated. There are too many layers of emergence needed from the Core Theory to leaf behavior. Also, there is a margin of error in any measurement of anything. There is no way to acquire the leaf data precisely enough to deduce the exact path of a leaf falling from a certain tree 300 years ago. Even if this data were available, the complexity of the needed calculations would be prohibitive. Commenting on these various practical limitations for the study of galaxies, the cosmologist Andrew Ponzen said “Ultimately, galaxies are less like machines and more like animals—loosely understandable, rewarding to study, but only partially predictable.”
The Core has been tested in many extreme circumstances and with great sensitivity, so physicists have high confidence in it. There is no doubt that for the purposes of doing physics the Core Theory provides a demonstrably superior representation of reality to that provided by its alternatives. But all physicists know the Core is not strictly true and complete, and they know that some features will need revision—revision in the sense of being modified or extended. Physicists are motivated to discover how to revise it because such a discovery can lead to great praise from the rest of the physics community. Wilczek says the Core will never need modification for understanding (in principle) the special sciences of biology, chemistry, stellar astrophysics, computer science and engineering, but he would agree that the Core needs revision in order to adequately explain why 95 percent of the universe consists of dark energy and dark matter, why the universe has more matter than antimatter, why neutrinos change their identity over time, and why the energy of empty space is as small as it is. One philosophical presupposition here is that the new Core Theory will be a single, logically consistent theory.
The Core Theory presupposes that time exists, that it is a feature of spacetime, and that spacetime is more fundamental than time. Within the Core Theory, relativity theory allows space to curve, ripple, and expand; and curving, rippling, and expanding can vary over time. Quantum theory alone does not allow any of these, although a future revision of quantum theory within the Core Theory is expected to allow them.
In the Core Theory, the word time is a theoretical term, and the dimension of time is treated somewhat like a single dimension of space. Space is informally considered to be a set of all possible point-locations. Time is a set of all possible point-times. Spacetime is a set of all possible point-events. Spacetime is presumed to be at least four-dimensional and also to be a continuum of points and thus to be continuous, with time being a distinguished, one-dimensional sub-space of spacetime. Because the time dimension is so different from a space dimension, physicists very often speak of (3+1)-dimensional spacetime rather than 4-dimensional spacetime. Both relativity theory and quantum theory assume that three-dimensional space is isotropic (rotation symmetric) and homogeneous (translation symmetric) and that there is translation symmetry in time (but other considerations in cosmology cast doubt on this symmetry). Regarding all these symmetries, all the physical laws do need to obey them, but specific physical systems within space-time need not. For example, your body could become very different if you walk across the road at noon on Tuesday instead of Friday, even though the Tuesday laws are also the Friday laws.
The Core Theory also presupposes reductionism throughout science in the sense that large-scale laws are based on the small-scale laws. For example, the laws of geology are based on the fundamental laws of physics. The only exception to reductionism seems to be due to quantum coherence in which the behavior of any group of particles is not fully describable by complete knowledge of the behavior of all its individual particles. This is a very important exception to reductionism.
The Core Theory presupposes an idea Laplace had in 1800 that is now called the Laplacian Paradigm—that all dynamical laws should have the form of describing how a state of a system at one time turns into a different state at another time. This implies that a future state is entailed by a single past state; no additional information, such as the entire history of the system, is required. This latter implication is often described by saying the laws are Markovian.
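Here is a minimal sketch in Python of what the Markovian requirement amounts to for a toy system; the harmonic force law and step size are illustrative assumptions, not anything in the Core Theory. Each new state is computed from the current state alone, never from the system’s earlier history:

```python
def step(state, dt=0.01):
    """Markovian evolution: the output depends only on the current state,
    never on any earlier state in the system's history."""
    x, v = state          # a toy state: position and velocity
    a = -x                # illustrative force law: a harmonic restoring force
    return (x + v * dt, v + a * dt)

state = (1.0, 0.0)        # the state at the initial time
for _ in range(100):      # each state is entailed by the one just before it
    state = step(state)
```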
The Core Theory does not presuppose or explicitly mention consciousness. The typical physicist believes consciousness is contingent; it happens to exist but it is not a necessary feature of the universe. That is, consciousness happened to evolve because of fortuitous circumstances, but it might not have. Many philosophers throughout history have disagreed with this treatment of consciousness, especially the idealist philosophers of the 19th century.
[For the experts: More technically, the Core Theory is the renormalized, effective quantum field theory that includes both the Standard Model of Particle Physics and the weak field limit of Einstein’s General Theory of Relativity in which gravity is very weak and spacetime is almost flat, and no assumption is made about the character or even the existence of space and time below the Planck length and Planck time.]
2. Relativity Theory
Of all the theories of science, relativity theory has had the greatest impact upon our understanding of the nature of time. Relativity theory implies time is one component of four-dimensional spacetime, and time can curve and dilate.
When the term relativity theory is used, it usually refers to the general theory of relativity of 1915 with the addition of a cosmological constant, but sometimes it refers to the special theory of relativity of 1905. Both are theories of time. Both have been well-tested; and they are almost universally accepted among physicists. Today’s physicists understand them better than Einstein himself did. “Einstein’s twentieth-century laws, which—in the realm of strong gravity—began as speculation, became an educated guess when observational data started rolling in, and by 1980, with ever-improving observations, evolved into truth” (Kip Thorne).
Although the Einstein field equations in his general theory:
are exceedingly difficult to manipulate, they are conceptually fairly simple. At their heart, they relate two things: the distribution of energy in space, and the geometry of space and time. From either one of these two things, you can—at least in principle—work out what the other has to be. So, from the way that mass and other energy is distributed in space, one can use Einstein’s equations to determine the geometry of that space. And from that geometry, we can calculate how objects will move through it (Dan Hooper).
The main assumption of GR, general relativity theory, is the principle of equivalence: gravity is basically acceleration. That is, for small objects and for a short duration, gravitational forces cannot be distinguished from forces produced by acceleration.
GR has many assumptions and implications that are usually never mentioned so explicitly. One is that gravity did not turn off for three seconds during the year 1777 in Australia. A more general one is that the theory’s fundamental laws are the same regardless of what time it is. This feature is called time-translation invariance.
The relationship between the special and general theories is slightly complicated. Both theories are about the motion of objects and both approach agreement with Newton’s theory the slower the speed of those objects, and the weaker the gravitational forces involved, and the lower the energy of those objects. General relativity implies the truth of special relativity in all infinitesimal regions of spacetime, but not vice versa.
General relativity holds in all reference frames, but special relativity holds only for inertial reference frames, namely non-accelerating frames. Special relativity implies the laws of physics are the same for all inertial observers, that is, observers who are moving at a constant velocity relative to each other. ‘Observers’ in this sense are also the frames of reference themselves, or they are persons of zero mass and volume making measurements from a stationary position in a coordinate system. These observers need not be conscious beings.
Special relativity allows objects to have mass but not gravity. Also, it always requires a flat geometry—that is, a Euclidean geometry for space and a Minkowskian geometry for spacetime. General relativity does not have those restrictions. General relativity is a very specific theory of gravity, assuming the theory is supplemented by a specification of the distribution of matter-energy at some time. Both the special and general theory imply that Newton’s two main laws of F = ma and F = GmM/r² hold only approximately.
Special relativity is not a specific theory but rather is a general framework for theories, and it is not a specific version of general relativity. Nor is general relativity a generalization of special relativity. The main difference between the two is that, in general relativity, spacetime does not simply exist passively as a background arena for events. Instead, spacetime is dynamical in the sense that changes in the distribution of matter and energy in any region of spacetime are directly related to changes in the curvature of spacetime in that region (though not necessarily vice versa).
Unlike classical theories, general relativity is geometric. What this means is that when an artillery shell flies through the air and takes a curved path in space relative to the ground because of a gravitational force acting upon it, what is really going on is that the artillery shell is taking a geodesic or the straightest path of least energy in spacetime, which is a curved path as viewed from a higher space dimension. That is why gravity or gravitational attraction is not a force but rather a curvature, a curvature of spacetime.
The theory of relativity is generally considered to be a theory based on causality:
One can take general relativity, and if you ask what in that sophisticated mathematics is it really asserting about the nature of space and time, what it is asserting about space and time is that the most fundamental relationships are relationships of causality. This is the modern way of understanding Einstein’s theory of general relativity….If you write down a list of all the causal relations between all the events in the universe, you describe the geometry of spacetime almost completely. There is still a little bit of information that you have to put in, which is counting, which is how many events take place…. Causality is the fundamental aspect of time. (Lee Smolin).
(An aside for the experts: The theory of relativity requires spacetime to have at least four dimensions, not exactly four dimensions. Technically, any spacetime, no matter how many dimensions it has, is required to be a differentiable manifold with a metric tensor field defined on it that tells what geometry it has at each point. General relativistic spacetimes are manifolds built from charts involving open subsets of R4. General relativity does not consider a time to be a set of simultaneous events that do or could occur at that time; that is a Leibnizian conception. Instead, general relativity specifies a time in terms of the light cone structures at each place. A light cone at a spacetime point specifies what events could be causally related to that point, not what events are causally related to it.)
Relativity theory implies time is a continuum of instantaneous times that is free of gaps just like a mathematical line. This continuity of time was first emphasized by the philosopher John Locke in the late seventeenth century, but it is meant here in a more detailed, technical sense that was developed for calculus only toward the end of the 19th century.
According to both relativity theory and quantum theory, time is not discrete or quantized or atomistic. Instead, the structure of point-times is a linear continuum with the same structure as the mathematical line or the real numbers in their natural order. For any point of time, there is no next time because the times are packed together so tightly. Time’s being a continuum implies that there is a non-denumerably infinite number of point-times between any two non-simultaneous point-times. Some philosophers of science have objected that this number is too large, and we should use Aristotle’s notion of potential infinity and not the late 19th century notion of a completed infinity. Nevertheless, accepting the notion of an actual nondenumerable infinity is the key idea used to solve Zeno’s Paradoxes and to remove inconsistencies in calculus.
The fundamental laws of physics assume the universe is a collection of point events that form a four-dimensional continuum, and the laws tell us what happens after something else happens or because it happens. These laws describe change but do not themselves change. At least that is what laws are in the first quarter of the twenty-first century, but one cannot know a priori that this is always how laws must be. Even though the continuum assumption is not absolutely necessary to describe what we observe, so far it has proved to be too difficult to revise our theories in order to remove the assumption and retain consistency with all our experimental data. Calculus has proven its worth.
No experiment is so fine-grained that it could show times to be infinitesimally close together, although there are possible experiments that could show the assumption to be false if the graininess of time were to be large enough.
Not only is there much doubt about the correctness of relativity in the tiniest realms, there is also uncertainty about whether it works differently on cosmological scales than it does at the scale of atoms, houses, and solar systems. So far, however, no rival theory has been confirmed. A rival theory intended to incorporate what is correct about the quantum realm is often called a theory of quantum gravity.
Einstein claimed in 1916 that his general theory of relativity needed to be replaced by a theory of quantum gravity. Subsequent physicists generally agree with him, but that theory has not been found so far. A great many physicists of the 21st century believe a successful theory of quantum gravity will require quantizing time so that there are atoms of time. But this is just an educated guess.
If there is such a thing as an atom of time and thus such a thing as an actual next instant and a previous instant, then an interval of time cannot be like an interval of the real number line because no real number has a next number or a previous number. It is conjectured that, if time were discrete, then a good estimate for the duration of an atom of time is 10⁻⁴⁴ seconds, the so-called Planck time. No physicist can yet suggest a practical experiment that is sensitive to this tiny scale of phenomena. For more discussion, see (Tegmark 2017).
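As a rough numerical check on that conjecture, here is a sketch in Python using the standard definition of the Planck time, t_P = √(ℏG/c⁵), with textbook values for the constants:

```python
import math

hbar = 1.0546e-34   # reduced Planck constant, in J*s
G = 6.674e-11       # gravitational constant, in m^3 kg^-1 s^-2
c = 2.998e8         # speed of light in a vacuum, in m/s

t_planck = math.sqrt(hbar * G / c**5)
print(t_planck)     # about 5.4e-44 seconds, the conjectured atom of time
```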
The special and general theories of relativity imply that to place a reference frame upon spacetime is to make a choice about which part of spacetime is the space part and which is the time part. No choice is objectively correct, although some choices are very much more convenient for some purposes. This relativity of time, namely the dependency of time upon a choice of reference frame, is one of the most significant philosophical implications of both the special and general theories of relativity.
Since the discovery of relativity theory, scientists have come to believe that any objective description of the world can be made only with statements that are invariant under changes in the reference frame. Saying, “It is 8:00” does not have a truth value unless a specific reference frame is implied, such as one fixed to Earth with time being the time that is measured by our civilization’s standard clock. This relativity of time to reference frames is behind the remark that Einstein’s theories of relativity imply time itself is not objectively real whereas spacetime is.
Regarding relativity to frame, Newton would say that if you are seated in a vehicle moving along a road, then your speed relative to the vehicle is zero, but your speed relative to the road is not zero. Einstein would agree. However, he would surprise Newton by saying the length of your vehicle is slightly different in the two reference frames, the one in which the vehicle is stationary and the one in which the road is stationary. Equally surprising to Newton, the duration of the event of your drinking a cup of coffee while in the vehicle is slightly different in those two reference frames. These relativistic effects are called space contraction and time dilation, respectively. So, both length and duration are frame dependent and, for that reason, say physicists, they are not objectively real characteristics of objects. Speeds also are relative to reference frame, with one exception. The speed of light in a vacuum has the same value c in all frames that are allowed by relativity theory. Space contraction and time dilation change in tandem so that the speed of light in a vacuum is always the same number.
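Both effects are governed by the same Lorentz factor, γ = 1/√(1 − v²/c²): durations dilate by γ and lengths contract by 1/γ. A minimal sketch (the vehicle speed is an illustrative assumption) shows why these effects escape everyday notice:

```python
import math

c = 299_792_458.0  # speed of light in a vacuum, in m/s

def gamma(v):
    """Lorentz factor: durations dilate by gamma; lengths contract by 1/gamma."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

v_vehicle = 30.0               # about 108 km/h, an ordinary driving speed
print(gamma(v_vehicle) - 1.0)  # about 5e-15: real, but far too small to notice
```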
Relativity theory allows great latitude in selecting the classes of simultaneous events. Because there is no single objectively-correct frame to use for specifying which events are present and which are past—but only more or less convenient ones—one philosophical implication of the relativity of time is that it seems to be easier to defend McTaggart’s B-theory of time and more difficult to defend McTaggart’s A-theory, which implies the temporal properties of events such as “is happening now” or “happened in the past” are intrinsic to the events and are objective, frame-free properties of those events. In brief, the relativity to frame makes it difficult to defend absolute time.
Relativity theory challenges other ingredients of the manifest image of time. For two point-events A and B common sense says they either are simultaneous or not, but according to relativity theory, if A and B are distant from each other and occur close enough in time to be within each other’s absolute elsewhere, then event A can occur before event B in one reference frame, but after B in another frame, and simultaneously with B in yet another frame. No person before Einstein ever imagined time has such a strange feature.
The special and general theories of relativity provide accurate descriptions of the world when their assumptions are satisfied. Both have been carefully tested. One of the simplest tests of special relativity is to show that the characteristic half-life of a specific radioactive material is longer when it is moving fast.
The special theory does not mention gravity, and it assumes there is no curvature to spacetime, but the general theory requires curvature in the presence of mass and energy, and it requires the curvature to change as their distribution changes. The presence of gravity in the general theory has enabled the theory to be used to explain phenomena that cannot be explained with either special relativity or Newton’s theory of gravity or Maxwell’s theory of electromagnetism.
Because of the relationship between spacetime and gravity, the equations of general relativity are much more complicated than are those of special relativity. But general relativity assumes the equations of special relativity hold at least in all infinitesimal regions of spacetime.
To give one example of the complexity just mentioned, the special theory clearly implies there is no time travel to events in one’s own past. Experts do not agree on whether the general theory has this same implication because the equations involving the phenomena are too complex for them to solve directly. Because of the complexity of Einstein’s equations, all kinds of tricks of simplification and approximation are needed in order to use the laws of the theory on a computer for all but the simplest situations. Approximate solutions are a practical necessity.
Regarding curvature of time and of space, the presence of mass at a point implies intrinsic spacetime curvature at that point, but not all spacetime curvature implies the presence of mass. Empty spacetime can still have curvature, according to general relativity theory. This unintuitive point has been interpreted by many philosophers as a good reason to reject Leibniz’s classical relationism. The point was first mentioned by Arthur Eddington.
Two accurate, synchronized clocks do not stay synchronized if they undergo different gravitational forces. This is a second kind of time dilation, in addition to dilation due to speed. So, a correct clock’s time depends on the clock’s history of both speed and gravitational influence. Gravitational time dilation would be especially apparent if a clock were to approach a black hole. The rate of ticking of a clock approaching the black hole slows radically upon approach to the horizon of the hole as judged by the rate of a clock that remains safely back on Earth. This slowing is sometimes misleadingly described as “time slowing down.” After a clock falls through the event horizon, it can still report its values to Earth, and when it reaches the center of the hole not only does it stop ticking, but it also reaches the end of time, the end of its proper time.
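A minimal sketch of this gravitational slowing, using the standard formula for a clock hovering at radius r outside a non-rotating black hole (the ten-solar-mass black hole is an illustrative assumption):

```python
import math

G = 6.674e-11              # gravitational constant, in m^3 kg^-1 s^-2
c = 2.998e8                # speed of light in a vacuum, in m/s
M = 10 * 1.989e30          # an illustrative ten-solar-mass black hole, in kg
r_s = 2 * G * M / c**2     # Schwarzschild radius, i.e. the event horizon

def tick_rate(r):
    """Ticking rate of a clock hovering at radius r, relative to a clock far
    away (valid only outside the horizon, where r > r_s)."""
    return math.sqrt(1.0 - r_s / r)

for r in (100 * r_s, 10 * r_s, 1.001 * r_s):
    print(r / r_s, tick_rate(r))   # the rate approaches zero near the horizon
```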
The general theory of relativity has additional implications for time. It implies that spacetime can curve or warp. Whether it curves into a fifth dimension is unknown, but it definitely curves as if it were curving into a fifth dimension. In 1948-9, the logician Kurt Gödel discovered radical solutions to Einstein’s equations, solutions in which there are what are called “closed time-like curves” in graphical representations of spacetime. The unusual curvature is due to the rotation of all the matter throughout Gödel’s universe. As one progresses forward in time along one of these curves, one arrives back at one’s starting point—thus, backward time travel! There is no empirical evidence that our own universe has this rotation. Some experts in relativity theory are not convinced by Gödel’s work that time travel is possible in any universe.
Here is Einstein’s reaction to Gödel’s work on time travel:
Kurt Gödel’s essay constitutes, in my opinion, an important contribution to the general theory of relativity, especially to the analysis of the concept of time. The problem involved here disturbed me already at the time of the building of the general theory of relativity, without my having succeeded in clarifying it.
Let’s explore the microstructure of time in more detail while repeating a few points that have already been made within the article. In mathematical physics that is used in both relativity theory and quantum theory, the ordering of instants by the happens-before relation of temporal precedence is complete in the sense that there are no gaps in the sequence of instants. Any interval of time is a continuum, so the points of time form a linear continuum. Unlike physical objects, physical time and physical space are believed to be infinitely divisible—that is, divisible in the sense of the actually infinite, not merely in Aristotle’s sense of potentially infinite. Regarding the density of instants, the ordered instants are so densely packed that between any two there is a third so that no instant has a very next instant. Regarding continuity, time’s being a linear continuum implies that there is a nondenumerable infinity of instants between any two non-simultaneous instants. The rational number line does not have so many points between any pair of different points; it is not continuous the way the real number line is, but rather contains many gaps. The real numbers such as pi and the square root of two help to fill the gaps.
The actual temporal structure of events can be embedded in the real numbers, at least locally, but how about the converse? That is, to what extent is it known that the real numbers can be adequately embedded into the structure of the instants, at least locally? This question is asking for the justification of saying time is not discrete, that is, not atomistic. The problem here is that the shortest duration ever measured is about 250 zeptoseconds. A zeptosecond is 10⁻²¹ second. For times shorter than about 10⁻⁴³ second, which is the physicists’ favored candidate for the duration of an atom of time, science has no experimental grounds for the claim that between any two events there is a third. Instead, the justification of saying the reals can be embedded into the structure of the instants is that (i) the assumption of continuity is very useful because it allows the mathematical methods of calculus to be used in the physics of time; (ii) there are no known inconsistencies due to making this assumption; and (iii) there are no better theories available. The qualification earlier in this paragraph about “at least locally” is there in case there is time travel to the past. A circle is continuous, and one-dimensional, but it is like the real numbers only locally.
One can imagine two empirical tests that would reveal time’s discreteness if it were discrete—(1) being unable to measure a duration shorter than some experimental minimum despite repeated tries, yet expecting that a smaller duration should be detectable with current equipment if there really is a smaller duration, and (2) detecting a small breakdown of Lorentz invariance. But if any experimental result that purportedly shows discreteness is going to resist being treated as a mere anomaly, perhaps due to error in the measurement apparatus, then it should be backed up with a confirmed theory that implies the value for the duration of the atom of time. This situation is an instance of the kernel of truth in the physics joke that no observation is to be trusted until it is backed up by theory.
It is commonly remarked that, according to relativity theory, nothing can go faster than c, the speed of light, not even the influence of gravity. The remark needs some clarification, else it is incorrect. Here are three ways to go faster than the speed c. (1) First, the medium needs to be specified. c is the speed of light in a vacuum. The speed of light in certain crystals can be much less than c, say 40 miles per hour, and if so, then a horse outside the crystal could outrun the light beam. (2) Second, the limit c applies only to objects within space relative to other objects within space, and it requires that no object pass another object locally at faster than c. However, the general theory of relativity places no restrictions on how fast space itself can expand. So, two galaxies can fly apart from each other at faster than the speed c of light if the intervening space expands sufficiently rapidly. (3) Imagine standing still outside on the flat ground and aiming your (ideal, perfectly narrow beam) laser pointer forward and parallel to the ground. Now change the angle in order to aim the pointer down at your feet. During that process of changing the angle, the point of intersection of the pointer and the tangent plane of the ground will move toward your feet faster than the speed c. This does not violate relativity theory because the point of intersection is merely a geometrical object, not a physical object, so its speed is not restricted by relativity theory.
In 1916, Einstein claimed that his theory implies gravitational waves would be produced by any acceleration of matter. Drop a ball from the Leaning Tower of Pisa, and this will shake space-time and produce ripples that will emanate in all directions from the Tower. The existence of these ripples was confirmed in 2015 by the LIGO observatory (Laser Interferometer Gravitational-Wave Observatory) when it detected ripples caused by the merger of two black holes.
This sub-section has emphasized time and space, but according to relativity theory it is not just time and space that are relative. So are energy and mass. The energy you measure for some phenomenon differs depending on how fast you move and in what direction.
In addition to relativity theory, the other fundamental theory of physics is quantum mechanics. According to the theory, the universe is fundamentally quantum. What this means is that everything fluctuates randomly. Kip Thorne gives this helpful example:
When we use high-precision instruments to look at tiny things, we see big fluctuations. The location of an electron inside an atom fluctuates so rapidly and so randomly, that we can’t know where the electron is at any moment of time. The fluctuations are as big as the atom itself. That’s why the quantum laws of physics deal with probabilities for where the electron is….
Quantum mechanics was developed in the late 1920s. At that time, it was applied to particles and not to fields. In the 1970s, it was successfully applied to quantum fields via the new theory called “quantum field theory.” There is considerable agreement among the experts that quantum mechanics and quantum field theory have deep implications about the nature of time, but there is little agreement on what those implications are.
Time is a continuum in quantum mechanics, just as it is in all fundamental classical theories of physics, but change over time is treated in quantum mechanics very differently than in all previous theories—because of quantum discreteness and because of discontinuous wave function collapse during measurement with a consequent loss of information.
First, let’s consider the discreteness. This discreteness is not shown directly in the equations, but rather in two other ways. (1) For any wave, according to quantum mechanics, there is a smallest possible amplitude it can have, called a “quantum.” Smaller amplitudes simply do not occur. As Hawking quipped: “It is a bit like saying that you can’t buy sugar loose in the supermarket, it has to be in kilogram bags.” (2) The possible solutions to the equations of quantum mechanics form a discrete set, not a continuous set. For example, the possible values of certain variables, such as the energy states of an electron within an atom, are allowed by the equations to change only in discrete steps. A single step is sometimes called a “quantum leap.” For example, when applying the quantum equation to a world containing only a single electron in a hydrogen atom, the solutions imply the electron can have -13.6 electron volts of energy or -3.4 electron volts of energy, but no value between those two. This illustrates how energy levels are quantized. However, in the equation, the time variable can change continuously and thus have any of a continuous range of real-number values.
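The allowed hydrogen energies just mentioned follow the well-known pattern Eₙ = −13.6 eV/n². A small sketch in Python takes that formula as given (rather than deriving it from the quantum equations) and lists the first two allowed values:

```python
def energy_level(n):
    """Allowed energy of the electron in hydrogen, in electron volts."""
    return -13.6 / n**2

print(energy_level(1))  # -13.6 eV: the ground state
print(energy_level(2))  # -3.4 eV: the next allowed value
# No allowed energy lies between these two values; energy is quantized,
# while the time variable in the equation remains continuous.
```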
Quantum mechanics is our most successful theory in all of science. One success is that the theory has been used to predict the measured value of the anomalous magnetic moment of the electron extremely precisely and accurately. The predicted value, expressed in terms of a certain number g, is the real number:
g/2 = 1.001 159 652 180 73…
Experiments have confirmed this predicted value to this many decimal places. This accuracy of one part in a trillion is analogous to measuring the distance to a footprint on the moon to within a hair’s width. No similar feat of precision and accuracy can be accomplished by any other theory of science.
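The analogy can be checked with simple arithmetic, using round illustrative numbers for a hair’s width and the average Earth-Moon distance:

```python
moon_distance = 3.84e8   # average Earth-Moon distance, in meters
hair_width = 1e-4        # width of a typical human hair, in meters

# about 2.6e-13, the same order of magnitude as one part in a trillion
print(hair_width / moon_distance)
```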
The variety of phenomena that quantum mechanics can be used to successfully explain is remarkable. For four examples, it explains (i) why you can see through a glass window but not a potato, (ii) why the Sun has lived so long without burning out, (iii) why atoms are stable so that the negatively-charged electrons do not spiral into the positively-charged nucleus, and (iv) why the periodic table of elements has the structure and numerical values it has. Without quantum mechanics, these four facts (and many others) must be taken to be brute facts of nature.
Regarding the effect of quantum theory on ontology, the world’s potatoes, galaxies and brains have been considered by a number of twentieth-century philosophers to be just different mereological sums of particles, but the majority viewpoint among philosophers of physics in the twenty-first century is that potatoes, galaxies and brains are, instead, fairly stable patterns over time of interacting quantized fields. Also, the 20th century debate about whether an electron is a point object or an object with a small, finite width has been settled by quantum field theory. It is neither. An electron takes up all of space; it is a “bump” or “packet of waves” with a narrow peak that trails off to ever lower amplitude throughout the electron field. The electron field itself fills all of space. A sudden disturbance in a field will cause wave packets to form, thus permitting particle creation. In short, a particle is an epiphenomenon of fields.
Scientists sometimes say “Almost everything is made of quantum fields.” They mean everything physical is made of quantum fields except gravity. Cows and electrons are made of quantum fields. But this is not claiming that the physicists have a solution to the notorious ontological problems of philosophy such as what songs, numbers, and chess games are made of.
Quantum mechanics is well tested and very well understood mathematically, yet it is not well understood intuitively or informally or philosophically or conceptually. One of the founders of quantum field theory, Richard Feynman, said he did not really understand his own theory. Surprisingly, physicists still do not agree on the exact formulation of the theory and how it should be applied to the world.
Three, among many, popular attempts to explain quantum mechanics and to make it more precise are the Copenhagen interpretation, the hidden variables interpretation, and the many-worlds interpretation. The three are described below. They are proposed answers to the question, “What is really going on?” Because these interpretations have different physical principles and can make different experimental predictions, they are actually competing theories. That is also why there is no agreement on what the axioms of quantum mechanics would be if it were ever to be formalized and axiomatized. For much of the history of the 20th century, many physicists resisted the need to address the question “What is really going on?” Their mantra was “Shut up and calculate” and do not explore the philosophical questions involving quantum mechanics. Yet the competing interpretations of quantum mechanics are the result of a deep disagreement about what once were considered to be philosophical questions. Turning away from a head-in-the-sand approach to quantum mechanics, Andrei Linde, co-discoverer of the theory of inflationary cosmology, said, “We [theoretical physicists] need to learn how to ask correct questions, and the very fact that we are forced right now to ask…questions that were previously considered to be metaphysical, I think, is to the great benefit of all of us.”
Quantum mechanics began as a non-relativistic particle theory in the 1920s. It now includes quantum field theory, which is quantum mechanics applied to quantum fields and obeying the laws of special relativity, but not necessarily general relativity. Most physicists believe what had once been called a “particle” is really a group of vibrations in a field, with any single vibration filling all of space. The electron is called a particle, but it is really a wave packet of a wave that vibrates a million billion times every second and has a localized peak in amplitude but is nearly zero amplitude throughout the rest of space. If we use a definition that requires a fundamental particle to be an object with a precise, finite location, then quantum mechanics now implies there are no fundamental particles. For continuity with past usage, particle physicists do still call themselves “particle physicists” and say they study “particles” and “the particles’ positions,” and other “particle behavior”; but they know this is not what is really going on. These terms are not intended to be taken literally, nor used in the informal sense of ordinary language. The particle language, though, is very often a useful pretense: it is good enough for many scientific purposes, such as the Feynman diagrams of quantum field theory, which simplify what would otherwise be an enormously complex description requiring the solution of thousands of integrals.
Max Born, one of the fathers of quantum mechanics, first suggested interpreting quantum waves as being waves of probability. As Stephen Hawking explains it:
In quantum mechanics, particles don’t have well-defined positions and speeds. Instead, they are represented by what is called a wave function. This is a number at each point of space. The size of the wave function gives the probability that the particle will be found in that position. The rate at which the wave function varies from point to point gives the speed of the particle. One can have a wave function that is very strongly peaked in a small region. This will mean that the uncertainty in position is small. But the wave function will vary very rapidly near the peak, up on one side and down on the other. Thus the uncertainty in the speed will be large. Similarly, one can have wave functions where the uncertainty in the speed is small but the uncertainty in the position is large.
The wave function contains all that one can know of the particle, both its position and its speed. If you know the wave function at one time, then its values at other times are determined by what is called the Schrödinger equation. Thus one still has a kind of determinism, but it is not the sort that Laplace envisaged (Hawking 2018, 95-96).
As quantum mechanics is typically understood, if we want to describe the behavior of a system over time, then we start with its initial state, namely the wave function Ψ(x,t₀) for places x and a particular time t₀, and we insert this wave function into the Schrödinger wave equation that says how the wave function (that is, the state) changes over time. That equation is the partial differential equation:
iℏ ∂Ψ(x,t)/∂t = ĤΨ(x,t)

Here i is the square root of negative one, ℏ (h-bar) is Planck’s constant divided by 2π, and Ĥ is the Hamiltonian operator. The output of the equation can be used (with some manipulation) to show the probability p(x,t) that a certain particle will be measured to be at place x at a future time t, if a measurement were made, where
p(x,t) = Ψ*(x,t)Ψ(x,t).
The asterisk is the complex conjugate operator, but let’s not delve any more into the mathematical details.
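Still, a small numerical sketch can make the recipe concrete; the Gaussian wave packet below is an illustrative assumption, not an example from any particular experiment. One discretizes Ψ on a grid of places, normalizes it, and reads off probabilities via p = Ψ*Ψ:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)                 # a grid of places
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2.0) * np.exp(1j * 3.0 * x)   # an illustrative complex wave function
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)        # normalize: total probability is 1

p = (np.conj(psi) * psi).real                      # p(x) = Psi*(x) Psi(x)
print(np.sum(p * dx))                              # 1.0: the particle is somewhere
print(np.sum(p[x > 0.0] * dx))                     # probability of detection at x > 0
```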
On most interpretations of quantum mechanics (but not for the Bohm interpretation) fundamental particles are considered to be waves, or, to speak more accurately, they are considered to be “wavicles,” namely entities that have both a wave and a particle nature, but which are never truly either. The electron that once was conceived to be a tiny particle orbiting an atomic nucleus is now better conceived as something larger and not precisely defined spatially, a cloud that completely surrounds the nucleus, a cloud of possible places where the electron could be found if it were to be measured. The electron or any other particle is no longer well-conceived as having a sharply defined trajectory. A wave cannot have a single, sharp, well-defined trajectory. The location and density distribution of the electron cloud around an atom is the product of two opposite tendencies: the electron-qua-wave “wants” to spread out away from the nucleus just as waves in a pond spread out away from the point where the rock fell into the pond, and the electron-qua-particle is a negatively-charged particle moving at high speed around the nucleus and that “wants” to reach the positive electric charge of the nucleus because opposite charges attract.
As goes the electron, so goes the human body. Ontologically, we humans are made of wavicles in quantum fields.
Indeterminism
An especially interesting philosophical question is whether quantum theory implies indeterminism, namely, that the state of the universe at one time does not determine all future states. This is still an open question, but the majority position is that quantum theory does imply indeterminism in our universe, and that information is not conserved in our universe once measurement processes are included, because measurements lose information.
The scientific ideal since Newton’s time has been that information is always conserved. If so, then physical determinism is true. That is, prediction of both all past states and all future states from one present state is theoretically possible, at least for Laplace’s Demon, who knows all the laws and has no limits on its computational abilities. But in quantum mechanics, because it includes measurements, there could not be a Laplace’s Demon. Another way of expressing this same point is that all possible available quantum information would not be enough for the Demon.
Let’s explain this a bit more. Consider the difference between practical predictions and theoretically possible predictions. There are three kinds of reasons why physicists cannot practically predict what will happen in the future: (1) It is too tedious a job to acquire knowledge of the microstate of a system; the microstate is fixed by the locations and momenta of each of its zillions of molecules at the same time. (2) The equations to be used are just too complicated for us to solve even with the aid of computers and even if we were to completely know an initial state at a time. (3) Physical systems are often chaotic. For example, not accounting for a flap of a single butterfly’s wings at some instant in China last month can radically affect the predicted weather next month in France.
These practical obstacles are not obstacles for Laplace’s Demon who has unlimited knowledge of all that can be known, and who has unlimited computational power. Information about forces is not needed because in principle the Newtonian force equation F = ma allows the acceleration a to be computed from the information about velocity. But Laplace’s Demon has new problems. With the rise of quantum mechanics, scientists have had to revise their ideal of scientific determinism, for two reasons that set obstacles in principle and not just in practice: (1) The theory of quantum mechanics implies the wave function evolves deterministically, except during measurements. (2) Heisenberg’s Uncertainty Principle sets limits on the precise values of pairs of variables. For example, the more precise the position of a particle is fixed the less precisely is its velocity fixed.
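In symbols, the position-momentum form of the Uncertainty Principle is Δx · Δp ≥ ℏ/2, where Δx and Δp are the spreads in position and momentum; the smaller the spread in position, the larger the spread in momentum, and hence in velocity.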
According to the Copenhagen interpretation of quantum mechanics, which became the orthodox interpretation of the twentieth century, given how things are at some initial time, the Schrödinger equation describes not what will happen at later times, but only the probabilities of what will happen at later times. The probabilities imply indeterminism. The probabilities are not a product of the practical limitations on the human being’s ability to gather all the information about the initial state nor are the probabilities a product of the limits of the computers being used to help make predictions.
The presence of these irremovable probabilities indicates a characteristic randomness at the heart of nature. The probabilities rarely reveal themselves to us in our everyday, macroscopic experience because, at our scale, every value of the relevant probabilities is extremely close to one. Nevertheless, everything fluctuates randomly, even cows and moons.
According to quantum mechanics, a state of a system is described very differently from all earlier theories of physics. It is described using the Schrödinger wave function. The wave is not a wave similar to the electromagnetic wave that exists in our physical space; the wave is a mathematical tool. The state is represented as a vector in an infinite dimensional Hilbert space that is smooth and so continuous. Schrödinger’s wave function describes the state, and Schrödinger’s wave equation describes how the state changes deterministically from one time to another (except for measurements).
The theory of quantum mechanics is tied to physical reality by the Born Rule. This rule says the square of the amplitude of the wave function is proportional to the probability density function. To oversimplify a bit, what this means is that the Born Rule specifies for a time and place not what exactly will happen there then but only the probabilities of this or that happening there then, such as it being 5% probable an electron will be detected in this spatial region when a certain electron-detecting measurement is made at a certain time. So, probability is apparently at the heart of quantum mechanics and thus of Nature itself. For this reason, Max Born and then many other experts recommended thinking of the wave function as a wave of probabilities. Because of these probabilities, if you were to repeat a measurement, then the outcome the second time might be different even if both initial states are the same. So, the key principle of causal determinism, namely “same cause, same effect,” fails.
The Copenhagen Interpretation
The Copenhagen interpretation is a vague, anti-realist theory containing a collection of beliefs about what physicists are supposed to do with the mathematical formalism of quantum mechanics. This classical interpretation of quantum mechanics was created by Niels Bohr and his colleagues in the 1920s. It is called the Copenhagen interpretation because Bohr taught at the University of Copenhagen. According to many of its advocates, it has implications about time reversibility, determinism, the conservation of information, locality, the principle that causes affect the future and not the past, and the reality of the world independently of its being observed—namely, that they all fail.
Let’s consider how a simple experiment can reveal quantum mechanics’ implications for how we should understand the world in a new way. In the famous double-slit experiment, which is a modern variant on Thomas Young’s double slit experiment that convinced physicists to believe that light was a wave, electrons all having the same energy are repeatedly ‘shot’ toward two parallel slits or openings in an otherwise impenetrable metal plate. Here is a diagram of the experimental set up when the electrons are observed passing through the slits:
The target shows two fuzzy rows where the electrons collide with the optical screen. The optical screen that displays the dots behind the plate is similar to a computer monitor that displays a pixel-dot when and where an electron collides with it. The diagram is an aerial view or bird’s eye view of electrons passing through two slits in a plate (such as a piece of steel containing two narrow, parallel slits) and then hitting an optical screen that is behind the two slits. The screen is shown twice, once in an aerial view and also in a full frontal view as seen from the left. The electrons can pass through the plate by entering through the plate’s left (upper) slit or right (lower) slit and can ricochet off the edges and each other. The slits are very narrow and are closely spaced apart. Bullets, pellets, BBs, grains of sand, and other macroscopic objects would produce an analogous pattern.
What is especially interesting is that the electrons behave differently depending upon whether they are being observed going through the slits. When observed, the electrons leave a pattern of only two parallel bands (thick, fuzzy lines) on the screen behind the plate as shown in the above diagram, but they behave differently when not observed at the slits.
Here is a diagram of the experimental situation when the electrons are not observed at the slits:
When unobserved, the electrons leave a pattern of many alternating dark and bright bands on the screen as shown in the diagram above. This pattern is very similar to the pattern obtained by diffraction of classical waves such as water waves when they interfere with themselves either constructively or destructively after passing through two nearby openings in a wall at the water’s surface. When a wave’s trough meets a peak at the screen, no dot is produced. When two troughs meet at the screen, the result is a dot. Ditto for two peaks. There are multiple, parallel stripes produced along the screen, but only five are shown in the diagram. Stripes farther from the center of the screen are slightly dimmer. Waves have no problem going through two or more slits simultaneously. Because the collective electron behavior over time looks so much like optical wave diffraction, this is considered to be definitive evidence of electrons behaving as waves. The same pattern of results occurs if neutrons or photons are used in place of electrons.
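The two patterns can be illustrated with a toy calculation, a standard two-path model with made-up numbers for the wavelength, slit separation, and screen distance. When the two paths’ amplitudes are added before squaring (no observation at the slits), fringes appear; when their squared magnitudes are added (observation at the slits), the fringes vanish. The single-slit envelope that shapes the two observed bands is omitted for simplicity:

```python
import numpy as np

wavelength = 50e-12        # an electron de Broglie wavelength (illustrative)
k = 2 * np.pi / wavelength # wave number
slit_separation = 1e-6     # distance between the slits (illustrative)
screen_distance = 1.0      # plate-to-screen distance in meters (illustrative)

x = np.linspace(-5e-5, 5e-5, 1001)   # positions along the screen

# Length of the path from each slit to each point on the screen.
r_left  = np.hypot(screen_distance, x - slit_separation / 2)
r_right = np.hypot(screen_distance, x + slit_separation / 2)

# The amplitude contributed by each path (overall constants dropped).
amp_left  = np.exp(1j * k * r_left)
amp_right = np.exp(1j * k * r_right)

# Unobserved at the slits: add amplitudes, THEN square.
# The cross term produces alternating bright and dark fringes.
fringes = np.abs(amp_left + amp_right)**2

# Observed at the slits: square each amplitude, THEN add.
# No cross term, so no fringes.
no_fringes = np.abs(amp_left)**2 + np.abs(amp_right)**2

print(fringes.min(), fringes.max())        # ~0.0 and ~4.0
print(no_fringes.min(), no_fringes.max())  # 2.0 and 2.0
```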
The other remarkable feature of this experiment is that the pattern of interference is produced even when the electrons are shot one at a time at the plate over several minutes. One would not expect this result because the phenomenon seems to depend on two electrons simultaneously travelling through separate slits and interacting with each other on the other side of the plate. Can an electron interact with itself from a few seconds earlier? The Princeton University physicist John Wheeler answered this question with a “yes,” which astounded his colleagues because this answer implies the present affects the past. In his 1983 book Quantum Theory and Measurement, Wheeler declared: “Equipment operating in the here and now has an undeniable part in bringing about that which appears to have happened.”
The favored explanation of the double-slit experiment is to assume so-called “wave-particle duality,” namely that a single electron or neutron has both wave and particle properties. When an electron is unobserved, it is a wave that can be in many places at once, but when it is observed it is a particle having a specific location. This mix of two apparently incompatible properties (wave properties and particle properties) is called a “duality,” and the electron is said to behave as a “wavicle.”
Advocates of the Copenhagen Interpretation of quantum mechanics conclude that, when an electron is not observed at the moment of passing through the slits, it passes through both of the two slits, but not in the sense that a tiny bullet-like object is in two places at once but rather in the sense that the experiment’s state is a superposition of two states, one in which the electron goes through the left slit and one in which it goes through the right slit. Any measurement of which slit the electron goes through will “collapse” this superposition and force there to be a state in which the electron acts like a bullet and hits the region expected to be hit by a bullet-like object. The wave function Ψ suddenly becomes shaped like a spike. The term “collapse” means that the physical system abruptly stops being correctly described as having a deterministic evolution according to the Schrödinger equation, the equation of quantum mechanics that describes the evolution of quantum states. As David Albert describes superposition, the wave function corresponding to a particle at slit A will have a bump near A and be zero everywhere else. The wave function for a particle located at slit B will have a bump near B and be zero everywhere else. The wave function that represents a particle in a superposition of being located at A and being located at B will have a bump at point A and a bump at point B and be zero everywhere else.
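Albert’s two-bump description can be sketched numerically; the Gaussian “bumps,” their locations, and the grid are illustrative choices, not part of his account:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)   # a one-dimensional position grid
dx = x[1] - x[0]

def bump(center, width=0.5):
    """A normalized Gaussian wave packet: a bump near `center`,
    effectively zero everywhere else."""
    psi = np.exp(-(x - center)**2 / (4 * width**2))
    return psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)

psi_A = bump(-3.0)                        # particle located at slit A
psi_B = bump(+3.0)                        # particle located at slit B
psi_super = (psi_A + psi_B) / np.sqrt(2)  # a bump at A AND a bump at B

# Born Rule: probability of finding the particle on the A side.
p_A = np.sum(np.abs(psi_super[x < 0])**2) * dx
print(round(p_A, 3))    # 0.5: equally likely to be found at either slit
```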
Advocates of superposition as a means of explaining the two slit experiment assume that, if it is not known what path is taken by the electron, then it is allowed to do everything possible simultaneously.
This positing of superposition is the most popular assumption, but, as will be explained below, many physicists object to the assumption and prefer to explain the double-slit experiment by positing that during the experiment physical time will “branch” into multiple times as the universe splits into many worlds or universes. In one world, an electron goes through the left slit; but in another world it goes through the right slit. This many-worlds interpretation is described below.
Influenced by logical positivism, which was dominant in analytic philosophy in the first half of the twentieth century, some advocates of the Copenhagen interpretation say that any claim about what a physical system is doing when it is not being measured is meaningless. In other words, a fully third-person perspective on nature is impossible.
To explain the double-slit experiment, Niels Bohr proposed an instrumentalist interpretation of the world by saying there is no determinate, unfuzzy way the world is when it is not being observed. There is only a cloud of possible values for each property of the system that might be measured. Eugene Wigner, a Nobel Prize winning physicist, promoted the more extreme claim that there exists a determinate, unfuzzy reality only when a conscious being is observing it. This is an anti-realist interpretation that many have claimed was foreshadowed by the writings of Eastern mysticism. The interpretation prompted Einstein, an opponent of mysticism and anti-realism, to ask a supporter of Bohr whether he really believed that the moon exists only when it is being looked at.
Sympathetic to this attitude of Einstein’s, Erwin Schrödinger created his thought experiment about a cat in a windowless box. A vial of poison gas in the box has a 50% probability of being broken during the next minute depending on the result of a quantum event such as the decay of a radioactive atom. If the vial is broken during the next minute, the cat is poisoned and dies. Otherwise, it is not poisoned. According to Wigner’s version of the Copenhagen Interpretation, argued Schrödinger, if the box is not observed by a conscious being at the end of the minute, the cat remains in a superposition of two states, the cat being alive and the cat being dead, and this situation can continue for days until someone finally looks into the box. Schrödinger believed this implication about the cat in the box is absurd, and he concluded that the Copenhagen interpretation is false.
The double-slit experiment and the Schrödinger’s cat thought experiment have caused philosophers of physics to disagree about what an object is, what it means for an object to have a location, how an object maintains its identity over time, and whether consciousness of the measurer is required in order to make reality become determinate and “not fuzzy” or “not blurry.” There was speculation that perhaps a device that collapses the wave function could be used as a consciousness detector that would detect whether an insect or a computer has consciousness. Eugene Wigner and John von Neumann were the most influential physicists to suggest that perhaps consciousness collapses the wave function.
Einstein was unhappy with another implication of quantum theory, that one could know in principle everything there is to know about a system of particles, yet know nothing for sure about any part of the system such as the behavior of a single particle.
Reacting to the incompleteness implied by the Copenhagen interpretation, Einstein proposed that there would be a future discovery of as yet unknown “hidden” variables. These are extra variables or properties that, when taken into account by a revised Schrödinger wave function, would make quantum mechanics deterministic and thus representationally complete. Einstein believed you would not need probabilities if you had access to the precise values of all the variables affecting a system, including the variables that are currently hidden. In the 1950s, David Bohm agreed with Einstein and went some way in this direction by building a revision of quantum mechanics that has hidden variables and, unlike the Copenhagen Interpretation, has no instantaneous collapse of the wave function during measurement; but his interpretation has not succeeded in moving the needle of scientific opinion, largely because of the difficulty of extending it to quantum field theory.
The Measurement Problem
The quantum measurement problem is the problem of how to understand the process of measurement. It is quite a difficult problem, and it has occupied the very best minds among the community of physicists and philosophers of physics for many years. There has been controversy about whether it is merely a philosophical problem or also a scientific problem.
A measurement is often made by a conscious being, but a measurement in the most general sense of the term is any interaction with anything external to the system that causes the system’s wave function to collapse into a single state rather than remaining a superposition of states, each of which indicates a different value for the measurement outcome. During measurement, the system collapses into a definite state of whatever observable is being measured. The equations of quantum theory give the probability of the collapse to any particular state. During the collapse, no laws of physics are changed, but important information is lost: from knowledge of the state of the situation after the measurement (in any single universe), one cannot compute the state before the measurement.
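The collapse recipe can be sketched in two steps, Born-rule sampling followed by replacement of the superposition with a definite state; the three-outcome state below is a made-up illustration:

```python
import numpy as np

rng = np.random.default_rng()

def measure(state):
    """Collapse `state` (a normalized complex vector) in the measurement
    basis: sample an outcome with Born-rule probabilities, then replace
    the superposition with the definite state for that outcome."""
    probs = np.abs(state)**2
    outcome = rng.choice(len(state), p=probs)
    collapsed = np.zeros_like(state)
    collapsed[outcome] = 1.0       # the wave function is now a "spike"
    return outcome, collapsed

# A superposition over three possible pointer readings.
state = np.ones(3, dtype=complex) / np.sqrt(3)
outcome, post = measure(state)
print(outcome, post)
# The theory supplies only the probabilities (1/3 each here); it is
# silent about WHY this run produced this particular outcome.
```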
Wouldn’t you like to know the mechanism that produced the value of 4 when your measurement could have had the values of 1, 2, 3, 4, or 5? Quantum theory cannot give you an answer.
Classically, an ideal measurement need not disturb the system being measured. As first suggested by Werner Heisenberg, according to quantum mechanics this classical ideal of measurement is unachievable in principle; experimenters always disturb the system they are measuring, and the measurement causes loss of information. This disturbance happens locally and instantaneously. Also, because of the information loss, there is a fundamental time asymmetry in the measurement process so reversing the process in time need not take you back to the situation before the measurement began.
However, different interpretations of quantum mechanics solve the measurement problem differently. According to the Copenhagen interpretation and many other interpretations, any measurement triggers the instantaneous collapse of the system’s quantum state from being a superposition of possible measurement outcomes to a single state with a definite measured outcome. This notion of instantaneous collapse conflicts with the theory of relativity, which requires effects to move no faster than the speed of light in a vacuum. Unfortunately, any experiment designed to confirm a claim about the speed of the collapse of the wave function faces the obstacle that no practical measurement can detect so short an interval of time:
Yet what we do already know from experiments is that the apparent speed at which the collapse process sweeps through space, cleaning the fuzz away, is faster than light. This cuts against the grain of relativity in which light sets an absolute limit for speed (Andrew Pontzen).
The Copenhagen interpretation implies that, during the measurement process, the continuous evolution of the wave function halts abruptly, and the wave function “collapses” from a superposition of multiple possible states of the system under investigation to a single state with a single, definite value for whatever is being measured. Using a detector to measure which slit the electron went through in the double-slit experiment is the paradigm example. Before the measurement, the system’s state is a superposition of two states: the electron going through the left slit and the same electron going through the right slit simultaneously. But during the measurement the superposition collapses. If an observer determines which slit the electron goes through, then this interferes with what is being measured, and the interference pattern beyond the slits collapses or disappears and the electrons act like small, discrete particles.
Here is a simple, crude analogy that has been pedagogically helpful. Think of electrons as if they are spinning coins on a table top. They are neither heads up nor tails up until your hand pushes down on the coin, forcing it to have just one of the two possibilities. Your hand activity is the measurement process.
The Copenhagen interpretation implies that a measurement apparatus itself cannot be in a superposition, nor can an observer, nor can a universe. Quantum theory on the Copenhagen interpretation cannot apply to everything because it necessarily splits the universe into a measured part and an unmeasured part, and it can describe the measured part but not the process of measurement itself, nor what is happening when there is no measurement. So, in that sense, this quantum theory is an incomplete theory of nature, as Einstein so often emphasized. Einstein was very dissatisfied with the Copenhagen interpretation’s requirement that, during any measurement, the usual principles of quantum mechanics stop applying. He wanted a quantum theory that describes the world without mentioning measuring instruments or the terms “measurement” and “collapse.” He wanted what is called “completeness.”
In response to this understanding of quantum measurement, the influential physicist John Bell said, “Bohr was inconsistent, unclear, willfully obscure, and right. Einstein was consistent, clear, down-to-earth, and wrong.” Bohr’s style of writing was very Hegelian. He is noted for saying, “Clarity crowds out depth!”
According to the Copenhagen Interpretation, during any measurement the wave function expressing the possible values of the measurement collapses to one with the actual value. The other possibilities are deleted. And quantum information is quickly lost. The measurement process is irreversible. So, during any measurement, from full knowledge of the new state after the measurement, the prior state cannot be deduced, even by Laplace’s Demon. Different initial states may transition into the same final state. So, the following classical principles fail: time reversibility, conservation of information, and determinism.
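The information loss can be seen in a two-state example with made-up amplitudes: distinct pre-measurement states can end in the very same post-measurement state, so the collapse map cannot be inverted, even by Laplace’s Demon:

```python
import numpy as np

# Two DIFFERENT pre-measurement states of a two-outcome system.
psi_1 = np.array([np.sqrt(0.9), np.sqrt(0.1)])   # heavily favors outcome 0
psi_2 = np.array([np.sqrt(0.5), np.sqrt(0.5)])   # even odds

# Suppose each run happens to yield outcome 0. The collapse rule
# replaces both superpositions with the same definite state.
post_1 = np.array([1.0, 0.0])
post_2 = np.array([1.0, 0.0])

print(np.array_equal(post_1, post_2))   # True
# From the post-measurement state alone, psi_1 and psi_2 cannot be
# distinguished, so time reversibility, conservation of information,
# and determinism all fail across the collapse.
```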
When a measurement occurs, it is almost correct to explain this as follows: At the beginning of the measurement, the system “could be in any one of various possibilities, we’re not sure which.” Strictly speaking, this is not quite correct. Before the measurement is made, the system is actually in a superposition of multiple states, one for each possible outcome of the measurement, with each outcome having a fixed probability of occurring as determined by the formalism of quantum mechanics; and the measurement itself is a procedure that removes the superposition and realizes just one of those states. Informally, this is sometimes summarized in the remark that measurement turns the situation from fuzzy to definite.
For an instant, a measurement can say the electron is there at a specific place, but immediately afterward, due to some new interaction, the electron becomes fuzzy again, and once again there is no single truth about precisely where the electron is, but only a single truth about the probabilities of finding the electron in various places if certain kinds of measurements were to be made.
The measurement problem is really an unsolved scientific problem, not merely a problem of interpretation. Following the lead of Einstein’s complaints in the 1930s, there has been growing dissatisfaction with the Copenhagen interpretation’s requirement that, during a measurement of quantum properties, quantum mechanics fails to apply to the measurement situation because of a collapse. Many opponents of the Copenhagen Interpretation have reacted in this way:
In the wake of the Solvay Conference (in 1927), popular opinion within the physics community swung Bohr’s way, and the Copenhagen approach to quantum mechanics settled in as entrenched dogma. It’s proven to be an amazingly successful tool at making predictions for experiments and designing new technologies. But as a fundamental theory of the world, it falls woefully short (Sean Carroll).
George Ellis, co-author with Stephen Hawking of the influential book The Large-Scale Structure of Space-Time, identifies what he believes is a key difficulty with our understanding of collapse during measurement: “Usually, it is assumed that the measurement apparatus does not obey the rules of quantum mechanics, but this [assumption] contradicts the presupposition that all matter is at its foundation quantum mechanical in nature.”
Those who want to avoid having to bring the consciousness of the measurer into quantum physics, and who want to restore time-reversibility, determinism, and the conservation of quantum information, typically recommend adopting a different interpretation of quantum mechanics that changes how measurement is treated. Einstein had a proposal for an original theory of quantum mechanics, the Hidden Variables interpretation. He hoped that by adding new laws specifying the behavior of the so-called “hidden” variables affecting the system, determinism, time-reversibility, and information conservation would be restored, and there would be no need to speak of a discontinuous collapse of the wave function during measurement. Also, quantum probabilities would be epistemological; they would be caused by our lack of knowledge of the extra variables. Nature herself would have no imprecision. Challenging Einstein’s proposal, John Bell showed that any local hidden variable theory of the kind Einstein hoped for would have to satisfy the so-called “Bell inequalities.” Later experiments, culminating in the loophole-free tests of the twenty-first century, showed that the inequalities fail. So, Einstein’s proposal never gathered much support.
During the twentieth century, the Copenhagen interpretation continued to be accepted as the principal way to understand quantum mechanics, but it has been in decline in the 21st century among experts, if not in all the graduate school textbooks. The main problem is to give a better explanation of the Copenhagen “collapse.” Nevertheless, either the wave function actually does collapse, or else something is happening that makes it look very much as if the wave function collapses. What is this “something”?
The philosophical background of the measurement problem reaches back to the eighteenth-century dispute between rationalists and empiricists. Speaking very loosely, the empiricist wanted to get the observer out of the system being measured, while the rationalist said that the observer is inextricably bound to the system being observed. Quantum mechanics, according to the Copenhagen Interpretation, thus involves a novel notion of what it is to control an experiment.
The Many-Worlds Interpretation and Branching Time
According to the Many-Worlds interpretation of quantum mechanics, there are no collapses, and anything that can happen according to quantum mechanics in our universe does happen in some universe or other. If at noon you could go to lunch or stay working in your office, then at noon the universe branches into two universes, one in which you go to lunch at noon, and one in which you stay working in your office at noon. It introduces many universes, that is, many worlds, and so it requires a revision in the meaning of the terms “universe,” “world” and “you.”
That is the maximalist version of the Many-Worlds interpretation and the one that has attracted the most attention from philosophers of physics. A more minimalist version does not talk about lunches at noon and restricts itself to changes in quantum states of the microworld, such as whether a particle travelling in a certain direction is measured to be in a state of spin up or instead spin down.
The universe according to the Many-Worlds interpretation is deterministic, and information is always conserved; but a single world is not deterministic nor is information conserved there. The Many-Worlds interpretation produces the same probabilities for a prediction in our actual world as does the Copenhagen interpretation. But if Laplace’s Demon were to have access to all the quantum information in the multiverse at one instant, it could tell you the outcome of any measurement made in our world or any other world. Quantum information is conserved in the sum of all worlds, though not in our actual world.
Saying the universe splits into many worlds is a higher-level emergent claim that helps humans understand what happens during certain changes, but actually the universe’s wave function evolves smoothly and continuously over time. Loosely, over time there is more branching, more entanglement, and more entropy.
The Many-Worlds proposal is attractive to many philosophers of physics because it has the virtue that it removes the radical distinction between the measurer and what is measured and replaces it with a continuously evolving wave function for the combined system of measurement instruments plus measurer (for the entire collection of worlds). During a measurement, it will appear to a person within a single world as if there is a collapse of the world’s wave function, but the wave function for the totality of worlds does not actually collapse. The laws of the Many-Worlds interpretation are time-reversal symmetric and deterministic in the sense described above, they conserve quantum information, and there is no need for the anti-realist stance taken by Bohr.
The principle of the conservation of quantum information must confront the fact that black holes eventually evaporate. There has been much controversy about whether quantum information that falls into the black hole either (i) never gets out or (ii) gets out before and during the evaporation of the black hole. The Copenhagen Interpretation implies the information is lost. The Many Worlds Interpretation implies it is not lost (because it is conserved somewhere among the many universes). Many physicists believe the information gets out by being encoded within the cloud of escaping Hawking radiation that is emitted during the evaporation process.
It is an open question for the Many-Worlds interpretation whether the same fundamental scientific laws hold in all universes. And it is an open question whether to take the universes literally or, instead, to say they are helpful “bookkeeping devices.”
Are all these other worlds far away from our world or close by? This is not a good question. Space exists within a single world, not across worlds.
The Many-Worlds interpretation is frequently called the Everettian interpretation for its founder Hugh Everett III. It implies that, during any measurement (or possible measurement) having some integer number n of possible outcomes, our world splits instantaneously into n copies of itself, each with a different outcome for the measurement. If a measurement can produce any value from 0 to 10, and we measure the value to be “8,” then the counterparts of us who live in the other worlds and who have the same memories as us see an outcome other than “8”. Clearly, the weirdness of the Copenhagen theory has been traded for a new kind of weirdness.
In the Many-Worlds interpretation, there is no access from one world to another. They exist “in parallel” and not within the same physical space, so any two are neither far from nor close to each other. Instead, space exists within a world, each world having its own space. Information is conserved across the worlds, but not within any single world. If we had access to all information about all the many worlds (the collective wave function) and had unlimited computational capacity like Laplace’s Demon, then we could see that the many worlds evolve deterministically and time-reversibly and see that the collective wave function never collapses discontinuously. Unfortunately, nobody can know the exact wave function for the entire multiplicity of worlds. In a single world, the ideally best available information implies only the probability of a measurement outcome, not an exact value of the measurement. So, in this sense, probability and randomness remain at the heart of our own world.
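One way to picture this bookkeeping is as a branching tree whose weights (squared amplitudes) evolve deterministically while any single branch sees random-looking outcomes. This toy model is only an illustration of the idea, not Everett’s actual formalism:

```python
# Each world is a pair: (history of outcomes seen there, branch weight).
# Weights play the role of squared amplitudes and always sum to 1.
worlds = [("", 1.0)]

def branch(worlds, p_left=0.5):
    """Every measurement splits every world in two, deterministically.
    No branch is ever deleted: the global evolution has no collapse."""
    new = []
    for history, weight in worlds:
        new.append((history + "L", weight * p_left))
        new.append((history + "R", weight * (1 - p_left)))
    return new

for _ in range(3):          # three successive two-slit measurements
    worlds = branch(worlds)

print(len(worlds))                  # 8 worlds
print(sum(w for _, w in worlds))    # 1.0: conserved across all worlds
# An observer inside one world sees a single unpredictable history,
# such as "LRL", even though the whole tree evolved deterministically.
```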
As we have seen above, the Many-Worlds theory does not accept the Copenhagen version of measurement collapse. Instead, it implies that, when a system is measured, all that is required is that the system interact with and become entangled with its environment during the measurement, thereby producing a single value for the measurement. This interaction process is called “decoherence.” So, in the two-slit experiment, the electron does not go into a superposition state but rather the universe splits into two universes, one in which the electron goes through the left slit and a completely separate universe in which the electron goes through the right slit. The many-worlds theory implies measurement is a reversible process, as are all other processes.
Most interactions are strong enough to produce decoherence; so, it takes very careful work to create the kind of interaction that preserves coherence. Preserving coherence is the most difficult goal to achieve in the construction of a quantum computer, and cooling is one of the main techniques used to achieve the goal. Interactions that cause decoherence are called “noise” in a quantum computer.
According to the Many-Worlds theory, the moon is there when it is not being looked at because the moon is always interacting with some particle or other and thereby decohering and, in that sense, getting measured. Decoherence is also why the moon’s quantum properties are not visible to us at our macroscale. Nevertheless, the moon is a quantum object (an object obeying the rules of quantum theory), as are all other objects.
The multiverse of the Many-Worlds theory is a different multiverse from the multiverse of chaotic inflation that is described below in the section about extending the big bang theory. Those universes produced by inflation exist within a single background physical space, unlike in the multiverse of the Many-Worlds theory of quantum mechanics where space exists only within a single world. However, in both kinds of multiverse time is better envisioned, not as linear, but rather as increasingly branching into the times of the new universes. It is extremely likely that there will be no un-branching nor branch removal nor branch fusing. At any time in any universe, that universe had relatively fewer branches in the past, and this feature will continue forever. The future is “a garden of forking paths,” said Jorge Luis Borges. If Leibniz were alive, he might say we live in the best of all possible branches.
Even though every expert agrees on what the wave function is doing mathematically and that it gets new parts when there is an interaction, including a measurement, not every expert wants to say a new part is literally describing a new world; some experts consider this to be ontological overreach. But Carroll’s book Something Deeply Hidden defends the claim that the multiverse theory satisfies Occam’s Razor better than all competitors.
What the Copenhagen theory calls quantum fuzziness or a superposition of states, Everett calls a superposition of many alternate, unfuzzy universes. The reason that there is no problem with energy conservation is that, if a world splits into seven new worlds, then each new world has one-seventh the energy of its parent world.
A principal problem for the many-worlds interpretation is the difficulty of explaining how the concept of a probability measure works across many worlds. For example, it is unclear what it means to say the electron went through the left slit in 50% of the worlds.
Some researchers have suggested there is a problem with showing that the Many-Worlds interpretation is logically consistent with what else we know. Other problems exist such as the fact that experts do not agree on whether the quantum wave function is a representation of reality, or only of our possible knowledge of reality. And there is no consensus on whether we currently possess the fundamental laws of quantum theory, as Everett believed, or instead only an incomplete version of the fundamental laws, as Einstein believed.
Heisenberg’s Uncertainty Principle
In quantum mechanics, various Heisenberg Uncertainty Principles restrict the simultaneous values of pairs of variables, for example, a particle’s position and momentum. The uncertainties in the two values cannot both be zero at the same time; more precisely, the product of the two uncertainties has a fixed, positive lower bound. Another Heisenberg uncertainty principle places an analogous restriction on time and energy, such as during particle emission or absorption.
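For position and momentum, the textbook form of the principle is Δx·Δp ≥ ħ/2. Here is that bound with rough numbers for an electron; the confinement size is an illustrative choice:

```python
# Position-momentum uncertainty: dx * dp >= hbar / 2.
hbar = 1.054_571_817e-34    # reduced Planck constant, J*s
m_e = 9.109_383_7015e-31    # electron mass, kg

dx = 1e-10                  # electron confined to about 0.1 nanometer
dp_min = hbar / (2 * dx)    # smallest momentum spread the principle allows
dv_min = dp_min / m_e       # the corresponding spread in velocity

print(dp_min)               # ~5.3e-25 kg*m/s
print(dv_min)               # ~5.8e5 m/s, an enormous velocity spread
# On Carroll's reading, no state with smaller spreads exists; the bound
# is a fact about nature, not about clumsy measuring devices.
```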
The Copenhagen Interpretation presented the uncertainty principle as being about measurement restrictions and about disturbing a system by measuring it. As Sean Carroll emphasizes, this misunderstands the principle. He argues that the Uncertainty Principle is not about measurements, although it has implications about measurements. It is about what states exist, and it says there exists no state in which a particle has a precise position and momentum at the same time. It is not a limitation on our knowledge of the state; it does not imply that there is a precise position and momentum about which we measurers are limited in what we can know. The Heisenberg Uncertainty Principle has nothing to do with whether a measurement does or does not disturb a system (by, for example, bombarding it with a photon). The principle is a claim about nature independent of whether a measurement is ever made, says Carroll, and it describes the inherent fuzziness of nature.
Epistemological uncertainty differs from ontological uncertainty. We are referring to epistemological uncertainty when we say, “I am uncertain. I just don’t know.” We are referring to ontological uncertainty when we say, “Things are inherently fuzzy. They are not determinate.” Most theoretical physicists believe the Heisenberg Uncertainty Principle of quantum mechanics is about ontological uncertainty.
Those who prefer epistemological uncertainty often recommend thinking about having a very sharp photograph of a moving ball, such as a tennis ball taken during a tennis match. The photograph provides precise information about where the ball is, but not where it is going or how fast it is moving. On the other hand, think about having a time-lapse photograph showing the ball as a blurry streak. This photograph gives you considerable information about where the ball is going and how fast it is moving, but provides little information about where the ball is at a specific time. On its epistemological interpretation, Heisenberg’s Uncertainty Principle is a constraint saying you can have one of the two photographs but not both. Nature herself “has” both photographs, but your knowledge is restricted to at best one photograph.
Experts are still unsure how well they understand quantum measurement, and they worry they may have to alter the story if quantum measurement becomes better understood.
Quantum uncertainties of measurement do not appear in a single measurement. They are detected over a collection of measurements because any single measurement has (in principle and not counting practical measurement error) a precise value and is not “fuzzy” or uncertain or indeterminate. Repeated measurements necessarily produce a spread in values that reveal the fuzzy, wavelike characteristics of the phenomenon being measured, and these measurements collectively obey the Heisenberg inequality. Heisenberg himself thought of his uncertainty principle as being about how the measurer necessarily disturbs the measurement and not about how nature itself does not have definite values.
The Heisenberg Uncertainty Principle about energy is commonly said to be a loan agreement with nature in which borrowed energy must be paid back. There can be temporary violations of the classical law of the conservation of energy as the borrowing takes place. The classical law of conservation says the total energy of a closed and isolated system is always conserved and can only change its form but not disappear or increase. For example, a falling rock has kinetic energy of motion during its fall to the ground, but when it collides with the ground, the kinetic energy changes its form to extra heat in the ground, extra heat in the rock, and the sound energy of the collision. No energy is lost in the process. This classical law can be violated in two ways: (1) if the universe (or the isolated system being studied) expands in volume, and (2) by an amount of energy ΔE for a time Δt, as described by Heisenberg’s Uncertainty Principle. The classical law is often violated for very short time intervals and is less likely to be violated as the time interval increases. Some philosophers of physics have described this violation as something coming from nothing and as something disappearing into nothing, which is misleading to people who use these terms in their informal sense instead of the sense intended by quantum field theory. The quantum “nothing” or quantum vacuum, however, is not really what many philosophers call “nothing.” Quantum field theory (rather than quantum mechanics) does contain a more sophisticated law of conservation of energy that has no violations and that accounts for the deviations from the classical law.
Virtual Particles, Wormholes, and Quantum Foam
Quantum theory and relativity theory treat the vacuum radically differently from each other. Quantum theory’s vacuum contains virtual particles and probably a “foam” of them. Quantum theory requires virtual particles to be created out of the quantum vacuum via spontaneous, random quantum fluctuations—due to Heisenberg’s Uncertainty Principles. Because of this behavior, no quantum field can have a zero value at any place for very long.
Despite their name, virtual particles are real, but they are unusual, because they borrow energy from the vacuum and pay it back very quickly, so quickly that they cannot be detected with our instruments. What happens is that, when a pair of energetic virtual particles—say, an electron and anti-electron—form from “borrowed” energy in the vacuum, the two exist for a short time before being annihilated or reabsorbed, thereby giving back their borrowed energy. The greater the energy of the virtual pair, the shorter the duration of their existence before being reabsorbed. The more energy that is borrowed, the quicker it is paid back.
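The loan can be given rough numbers using the standard order-of-magnitude estimate ΔE·Δt ≈ ħ/2; the choice of an electron-positron pair is illustrative:

```python
# Energy-time uncertainty as a "loan": dE * dt ~ hbar / 2.
hbar = 1.054_571_817e-34     # reduced Planck constant, J*s
eV = 1.602_176_634e-19       # joules per electron-volt

# Borrow the rest energy of an electron-positron pair:
# about 2 * 0.511 MeV.
dE = 2 * 0.511e6 * eV        # ~1.6e-13 J

dt = hbar / (2 * dE)         # roughly how long the loan may last
print(dt)                    # ~3e-22 seconds
# The more energy that is borrowed, the quicker it must be paid back,
# which is why more energetic virtual pairs are reabsorbed sooner.
```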
There are never any isolated particles because elementary particles are surrounded by a cloud of virtual particles. Many precise experiments can be explained only by assuming there is this cloud. Without assuming the existence of virtual particles, quantum theory would not be able to predict this precise value of the electron’s magnetic moment
g/2 = 1.001 159 652 180 73…
that agrees to this many significant digits with our most careful measurements. So, physicists are confident in the existence of virtual particles.
An electron is continually surrounded by virtual photons of temporarily borrowed energy. Some virtual photons exist long enough to produce electron-positron pairs, and these buffet the electron they came from. This buffeting produces the so-called “Lamb shift” of energy levels within an atom.
Virtual particles are not exactly particles like the other particles of the quantum fields. Both are excitations of these fields, and both have gravitational effects and thus effects on time, but virtual particles are not equivalent to ordinary quantum particles, although the longer-lived ones are more like ordinary particle excitations than the short-lived ones.
Virtual particles are just a way to calculate the behavior of quantum fields, by pretending that ordinary particles are changing into weird particles with impossible energies, and tossing such particles back and forth between themselves. A real photon has exactly zero mass, but the mass of a virtual photon can be absolutely anything. What we mean by “virtual particles” are subtle distortions in the wave function of a collection of quantum fields…but everyone calls them particles [in order to keep their names simple] (Carroll 2019, p. 316).
The physicist John Wheeler suggested that the ultramicroscopic structure of spacetime for periods on the order of the Planck time (about 5.4 × 10⁻⁴⁴ seconds) or less in regions about the size of the Planck length (about 1.6 × 10⁻³⁵ meters) is a quantum foam of rapidly changing curvature of spacetime, with micro-black-holes and virtual particle-pairs and perhaps wormholes rapidly forming and dissolving. There is chaos in the tiniest realms if Wheeler is correct.
The Planck time is the time it takes light to travel a Planck length in a vacuum. The terms Planck length and Planck time were inventions of Max Planck in the early twentieth century during his quest to find basic units of length and time that could be expressed in terms only of universal constants. He defined the Planck unit of time algebraically as √(ħG/c⁵). √ is the square root symbol; ħ is Planck’s constant in quantum theory divided by 2π; G is the gravitational constant in Newtonian mechanics; c is the speed of light in a vacuum in relativity theory. Three different theories of physics are tied together in this one expression. The Planck time is a theoretically interesting unit of time, but not a practical one. No known experimental procedure can detect events that are this brief.
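Planck’s definition can be checked directly from the three constants; the numerical values below are the standard CODATA figures:

```python
import math

hbar = 1.054_571_817e-34   # reduced Planck constant, J*s (quantum theory)
G = 6.674_30e-11           # gravitational constant, m^3/(kg*s^2) (gravity)
c = 2.997_924_58e8         # speed of light in a vacuum, m/s (relativity)

planck_time = math.sqrt(hbar * G / c**5)
planck_length = c * planck_time   # light travels one Planck length per Planck time

print(planck_time)     # ~5.39e-44 seconds
print(planck_length)   # ~1.62e-35 meters
```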
Positive but indirect evidence for the existence of quantum foam comes from careful measurements of the Casimir Effect, in which two mirrors or conducting plates brought very near each other experience a new force that pushes them even closer together. But Kip Thorne warned us in 2014: “Back in the 1950s John Wheeler gave persuasive arguments for quantum foam, but there is now evidence that the laws of quantum gravity might suppress the foam and might even prevent it from arising.”
Another remarkable, but speculative, implication involving virtual particles is that many physicists believe it is possible in principle to connect two black holes into a wormhole and then use the hole for time travel to the past. “Vacuum fluctuations can create negative mass and negative energy and a network of wormholes that is continually fluctuating in and out of existence…. The foam is probabilistic in the sense that, at any moment, there is a certain probability the foam has one form and also a probability that it has another form, and these probabilities are continually changing” (Kip Thorne). The foam process can create a negative energy density and thus create exotic matter whose gravity repels rather than attracts, which is the key ingredient needed to widen a wormhole and turn it into a time machine for backward time travel that would be usable by human beings. A wormhole is a tunnel through space and time from one place to another; travel through the hole could allow you to reach a place before anyone moving at the speed of light or less, but not through the hole, had time to get there. Without sufficient negative gravitational force in the neck connecting its two openings, a wormhole has a natural tendency to “pinch off” to a width with zero diameter. For a popular-level discussion of how to create this real time machine as opposed to a science fiction time machine, see the book The Warped Side of Our Universe: An Odyssey Through Black Holes, Wormholes, Time Travel, and Gravitational Waves by Kip Thorne and Lia Halloran, 2023. Thorne says: “One way to make a wormhole, where previously there was none, is to extract it from the quantum foam…, enlarge it to human size or larger, and thread it with exotic matter to hold it open.” Later in the present article, there is more explanation of the negative gravitational energy of this exotic matter.
Another controversial implication about virtual particles is that there is a finite but vanishingly small probability that a short-lived potato or brain will spontaneously fluctuate out of the vacuum in your closet tomorrow. If such an improbable event were to happen, many persons would be apt to say that a miracle had happened, and God had temporarily intervened and suspended the laws of science.
Entanglement and Non-Locality
Schrödinger introduced the term “entanglement” in 1935 to describe what is perhaps the strangest feature of quantum mechanics. Entanglement is strange, yet it is an experimentally well-confirmed feature of reality.
When two particles become entangled, they can no longer be treated independently of each other because their properties are tied together even if they move a great distance away from each other. Normally we can fully describe an object without referring to objects elsewhere, but this feature, called “locality,” fails in the case of entanglement. Locality is the feature that implies an object is influenced directly only by its immediate surroundings. The distant sun influences our skin on Earth, but not directly; it does so only indirectly, by sending photons that eventually collide with our skin. Most versions of quantum theory imply that locality fails to be a universal feature of our universe. Einstein was bothered more by quantum mechanics’ non-local entanglement than by its indeterminism, its uncertainty principles, or its implication that one can know everything there is to know about a whole system yet know nothing for sure about any of its parts. He was the first person to see clearly that quantum mechanics is either local but incomplete, or else complete but non-local. He hoped for the incompleteness.
Failure of locality of a system arises only during measurement. When measurement is not involved, quantum theory is a local theory.
If two particles somehow become entangled, this does not mean that, if you move one of them, then the other one moves, too. It is not that kind of entanglement. If you act on one member of an entangled pair, nothing happens to the other member. So, entanglement cannot be used as a means of message transfer. Entanglement is only a correlation. Normally, when an entangled pair is created in a laboratory, the two are very close in space to each other, but they can stay entangled as they move far from each other. A quantum measurement by Alice of a certain property of one member of an entangled pair of particles will instantaneously or nearly instantaneously determine the value of that property that would be found by Bob if he were to make a similar measurement on the other member of the pair, no matter how far away the two particles are from each other and no matter the duration between the two acts of measuring. So, Alice and Bob’s measurement processes can be space-like separated from each other.
In a letter to Max Born in 1947, Einstein referred to non-locality as “spooky action at a distance.” Most physicists still use the term, but actually it is not a causal action. It is a spooky correlation over a distance. It is a way of propagating definiteness. Non-locality is very unintuitive, but the most favored explanation of the experimental data is that neither particle has a definite value for the property to be measured until after the first particle is measured, after which the second one’s value is fixed instantaneously. This implies that quantum theory does not allow Alice to know the specific value of her measurement before it is made, so she cannot know in advance what value Bob will measure.
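A toy simulation can display both halves of the puzzle, perfect correlation together with no usable signal. It models only the outcomes for a maximally anticorrelated pair measured along the same axis; it is bookkeeping, not a mechanism:

```python
import numpy as np

rng = np.random.default_rng()

def measure_entangled_pair():
    """Simulate measuring both members of a maximally anticorrelated
    pair (such as a spin singlet) along the same axis. Neither outcome
    is definite before the first measurement; once Alice's result is
    fixed, Bob's is determined, however far away he is."""
    alice = rng.choice(["up", "down"])        # 50/50, fixed only at measurement
    bob = "down" if alice == "up" else "up"   # perfect anticorrelation
    return alice, bob

results = [measure_entangled_pair() for _ in range(10_000)]

# The pair is perfectly anticorrelated on every run:
print(all(a != b for a, b in results))                      # True

# But Bob's outcomes alone are 50/50 noise, so Alice cannot encode a
# message in them: no faster-than-light signaling.
print(sum(b == "up" for _, b in results) / len(results))    # ~0.5
```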
In 1935, Erwin Schrödinger said:
Measurements on (spatially) separated systems cannot directly influence each other—that would be magic.
Einstein agreed. Yet the magic seems to exist. “I think we’re stuck with non-locality,” said John Bell.
Entanglement comes in degrees. Ontologically, the key idea about quantum entanglement is that if a particle becomes entangled with one or more other particles within the system, then it loses some of its individuality. The whole system becomes more than the sum of its parts. The state of an entangled group of particles is not determined by the sum of the states of each separate particle. In that sense, quantum mechanics has led to the downfall of classical reductionism.
Many physicists believe entanglement is linked to the emergence of space in the sense that if we were to know the degree of entanglement between two quantum particles, then we could derive the distance between them. Some of them speculate that time also is a product of quantum entanglement. To settle that issue, entanglement needs to be better understood.
The philosopher David Albert has commented that “In order to make sense of this ‘instantaneity’ of the quantum correlation, it looks as if there is a danger that one may require an absolute notion of simultaneity of exactly the kind that the special theory of relativity denied.” The philosopher Huw Price speculated (Price 1996) that nonlocal processes are really backwards causal processes with effects occurring before their causes. Juan Maldacena has conjectured that entanglement of two objects is really a wormhole connecting the two. Leonard Susskind has emphasized that it is not just particles that can become entangled. Parts of space can be entangled with each other, and he conjectures that it is this entanglement that “holds space together.”
Quantum Tunneling
Quantum mechanics allows tunneling in the sense that a particle can penetrate a potential energy barrier that is too high in energy for the particle to surmount according to classical physics. For example, according to quantum mechanics, there is a chance that, if a rock is sitting quietly in a valley next to Mt. Everest, it will leave the valley spontaneously and pass through the mountain and appear intact on the other side. The probability is insignificant but not zero. It is an open question in physics how long it takes an object to do the tunneling. Some argue that the speed of the tunneling is faster than light speed. The existence of quantum tunneling is accepted because it seems to be needed to explain some radioactive decays, some chemical bonds, and how sunlight is produced by protons in our sun overcoming their mutual repulsion and fusing.
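For a simple rectangular barrier, the standard opaque-barrier approximation gives a tunneling probability of T ≈ e^(−2κL), where κ = √(2m(V−E))/ħ. Here is that estimate for an electron meeting a barrier; the energy, height, and width are illustrative numbers:

```python
import math

hbar = 1.054_571_817e-34    # reduced Planck constant, J*s
m_e = 9.109_383_7015e-31    # electron mass, kg
eV = 1.602_176_634e-19      # joules per electron-volt

E = 1.0 * eV                # the electron's energy
V = 5.0 * eV                # barrier height: classically impenetrable
L = 1e-9                    # barrier width: one nanometer

kappa = math.sqrt(2 * m_e * (V - E)) / hbar
T = math.exp(-2 * kappa * L)    # opaque-barrier approximation
print(T)                        # ~1e-9: tiny but not zero

# For a rock and a mountain, the mass and width are so much larger
# that T is unimaginably small, yet still not exactly zero.
```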
Approximate Solutions
Like the equations of the theory of relativity, the equations of quantum mechanics are very difficult to solve and use except in very simple situations. The equations cannot be used directly in digital computers. There have been many Nobel-Prize winning advances in chemistry by finding methods of approximating quantum theory in order to simulate the results of chemical activity within a computer. For one example, Martin Karplus won the Nobel Prize for chemistry in 2013 for creating approximation methods for computer programs that describe the behavior of the retinal molecule in our eye’s retina. The molecule has almost 160 electrons, but he showed that, for describing how light strikes the molecule and begins the chain reaction that produces the electrical signals that our brain interprets during vision, chemists can successfully use an approximation; they need to pay attention only to the molecule’s outer electrons.
Emergent Time and Quantum Gravity
There has been much speculation about the role of time in a theory of quantum gravity, a theory that would reconcile quantum mechanics with general relativity. Perhaps the new theory will need to make use of special solutions to the Schrödinger equation that normally are not discussed: solutions describing universes that do not evolve at all. For these solutions, there is no time, and the quantum state is a superposition of many different classical possibilities:
In any one part of the state, it looks like one moment of time in a universe that is evolving. Every element in the quantum superposition looks like a classical universe that came from somewhere, and is going somewhere else. If there were people in that universe, at every part of the superposition they would all think that time was passing, exactly as we actually do think. That’s the sense in which time can be emergent in quantum mechanics…. This kind of scenario is exactly what was contemplated by physicists Stephen Hawking and James Hartle back in the early 1980s (Carroll 2016, 197-9).
a. Standard Model
The Standard Model of particle physics was proposed in the 1970s, and it has subsequently been revised and well tested. The Model is designed to describe elementary particles and the physical laws that govern them. The Standard Model is really a loose collection of theories describing seventeen different particle fields; it does not describe gravitational fields. It is our civilization’s most precise and powerful theory of physics. It originally was called a model, but it now has the status of a confirmed theory, and it probably should not be called a “model” because it does not contain the simplifications that other models do; nevertheless, its name has not changed over time.
The theory sets severe limits on what exists and what can possibly happen. The Standard Model implies that a particle can be affected by some forces but not others. It implies that a photon cannot decay into two photons. It implies that protons attract electrons and never repel them. It also implies that every proton consists in part of two up quarks and one down quark that interact with each other by exchanging gluons. The gluons “glue” the quarks together via the strong nuclear force just as photons glue electrons to protons via the electromagnetic force. Unlike how Isaac Newton envisioned forces, all forces are transmitted by particles. That is, all forces have carrier particles that “carry” the force from one place to another.
For example, consider how the proton is treated in the Standard Model. The exchange of gluons within the proton “glues” its constituent quarks together and keeps them from escaping. More than 90% of the mass of the proton is not due to the mass of its quarks; it is due to a combination of virtual quarks, virtual antiquarks, and virtual gluons. Because the virtual particles exist over only very short time scales, they are too difficult to detect by any practical experiment, and so they are called “virtual particles.” However, this word “virtual” does not imply “not real.”
The properties that serve to distinguish what exists at one spacetime point from what exists at another are the point’s values for mass, spin, and charge. Nothing else. There are no other differences among what is at a point, according to the Standard Model, so in that sense fundamental physics is very simple. If we are talking about a point inside a pineapple, what about the value of its pineapple-ness? In principle, according to the Standard Model, the pineapple’s characteristics depend only on these other, more fundamental characteristics. Charge, though, is not simply electromagnetic charge. There are three kinds of color charge for the strong nuclear force, and two kinds of charge for the weak nuclear force. In the atom’s nucleus, the strong force holds two protons together tightly enough that their positive electric charges do not push them away from each other. The strong force also holds the three quarks together inside a proton. The weak force turns neutrons into protons and spits out electrons. It is the strangest of all the forces because it allows some rare exceptions to symmetry under T, the operation of time reversal. (T is the transformation that reverses all processes.)
Except for gravity, the Standard Model describes all the universe’s forces. Strictly speaking however, these theories are about interactions rather than forces. A force is just one kind of interaction. Another kind of interaction does not involve forces but rather changes one kind of particle into another kind. The neutron, for example, changes its appearance depending on how it is probed. The weak interaction can transform a neutron into a proton. It is because of transformations like this that the concepts of something being made of something else and of one thing being a part of a whole become imprecise for very short durations and short distances. So, classical mereology—the formal study of parts and the wholes they form—fails at this scale.
The concept of interaction is very exotic. When a particle interacts with another particle, the two particles exchange other particles, the so-called carriers of the interactions. So, when milk is spilled onto the floor, what is going on is that the particles of the milk and the particles in the floor and the particles in the surrounding air exchange a great many carrier particles with each other, and the exchange is what is called “spilling milk onto the floor.” Yet all these varied particles are just tiny fluctuations of fields. This scenario indicates one important way in which the scientific image has moved very far away from the manifest image.
According to the Standard Model, but not according to general relativity theory, all particles must move at light speed c unless they interact with other fields. All the particles in your body such as its protons and electrons would move at the speed c if they were not continually interacting with the Higgs Field. The Higgs Field can be thought of as being like a “sea of molasses” that slows down all protons and electrons and gives them the mass and inertia they have. “All mass is interaction,” said Richard Feynman. Neutrinos are not affected by the Higgs Field, but they move slightly less than c because they are slightly affected by the field of the weak interaction. Of all the particles described by the Standard Model of Particle Physics, the Higgs boson is the strangest.
The Standard Model helps explain what is happening in an atomic clock when an electron in a cesium atom changes energy levels and radiates some light indicating the clock is properly tuned. The Standard Model implies that the electron, being a localized vibration in the electron field, suddenly vibrates less and thereby loses energy, and the lost energy is transferred to the electromagnetic field, creating a localized vibration there that is a new photon.
As of the first quarter of the twenty-first century, the Standard Model is incomplete because it cannot account for gravity or dark matter or dark energy or the fact that there is more matter than anti-matter. When a new version of the Standard Model does all this, then it will perhaps become the long-sought “theory of everything.”
4. Big Bang
The big bang theory in some form or other (with or without inflation) is accepted by nearly all cosmologists, astronomers, astrophysicists, and philosophers of physics, but it is not as firmly accepted as is the theory of relativity and is not part of the Core Theory of Physics.
The big bang theory is our civilization’s standard model for cosmology. The classical version of the big bang theory implies that the universe once was extremely small, extremely dense, extremely hot, nearly uniform, at minimal entropy, expanding; and it had extremely high energy density and severe curvature of its spacetime at all scales. Now the universe has lost all these properties except one: it is still expanding.
Some cosmologists believe time began with the big bang 13.8 billion years ago, at the famous cosmic time of t = 0, but the classical big bang theory itself does not imply anything about when time began, nor whether anything was happening before the big bang, although those features could be added into a revised theory of the big bang.
As far as is known, the big bang explosion was a rapid expansion of space itself, not an expansion of something into a pre-existing void. Think of the expansion as being due to the creation of new space everywhere very quickly. Space has no center around which it expanded. As the universe expanded, it diluted. It probably expanded in all directions almost evenly, and it probably did not produce any destruction. As it expanded, some of the energy was converted into matter (via E = mc²) until finally the first electron was created; and later, the first atom.
The big bang theory is only a theory of the observable universe, not of the whole universe. The observable universe is the part of the universe that in principle could be observed by creatures on Earth or that could have interacted with us observers via actions that move at the speed of light.
The unobservable universe may have no edge, but the observable universe definitely does. Its diameter is about 93 billion light years, and it grows larger every day. The observable universe is a sphere containing from 350 billion to one trillion large galaxies; it is also called “our Hubble Bubble” and “our pocket universe.” It is still producing new stars, but the production rate is ebbing. 95% of the stars that will ever exist have already been born. The very first stars came into existence about 200-400 million years after the big bang. A large galaxy such as the Milky Way contains a few hundred billion stars.
Scientists have no well-confirmed idea about the universe as a whole; the universe might or might not be very similar to the observable universe, but the default assumption is that the unobservable universe is like the observable universe. It is unknown whether the unobservable universe’s volume is infinite, but many cosmologists believe the actual universe is not infinite and is at least 250 times the volume of our observable universe.
Each day, more stars become inaccessible to us here on Earth. Because of their high speed of recession from Earth, we could never send observers or signals to affect those stars. “Of the 2 trillion galaxies contained within our observable Universe, only 3% of them are presently reachable, even at the speed of light” (Ethan Siegel). That percentage will slowly reduce to zero.
The big bang explosion began approximately 13.8 billion years ago. At that time, the observable universe would have had an ultramicroscopic volume. The explosion created new space, and this explosive process of particles flying away from each other continues to create new space today. Four and a half billion years ago, our solar system was formed from products of this big bang.
The classical theory of the big bang was revised in 1998 to say the expansion rate has been accelerating slightly for the last five billion years due to the pervasive presence of a “dark” energy, and this rate will continue to increase forever. Dark energy is whatever it is that speeds up the expansion of the universe at the cosmic level. It has its name because so little is known about it. It is also sometimes called “the energy of the vacuum,” but many physicists believe this is a bad name because, if it were the energy of the vacuum, then the universe would have pulled itself apart very soon after the big bang. Those who suspect that this energy density of empty space cannot dilute and so stays constant or only very slightly decreases as space expands also refer to it as “Einstein’s cosmological constant.” Because of this energy, the term “empty space” does not mean to physicists what it means in ordinary language such as when we say the space in his closet is now empty of all his belongings, so it is ready for re-painting. Nor does it mean the same thing to philosophers who believe empty space contains absolutely nothing. A physicist who uses the term “empty space” usually means a space with no significant curvature.
The discovery of dark energy helped resolve the problem that some stars seemed to be slightly older than the predicted age of the universe. The presence of dark energy implies that the universe is older than that earlier predicted age, so the apparent conflict dissolved.
Currently, space is expanding as time increases because most clusters of galaxies are flying farther away from each other, even though galaxies, planets, and molecules themselves are not now expanding. Eventually though, according to the most popular version of the big bang theory, in the very distant future, even these objects will expand away from each other and all structures of particles eventually will be annihilated as will all non-elementary particles themselves, leaving only an expanding soup of elementary particles as the universe chills and asymptotically approaches thermodynamic equilibrium. This is the universe’s so-called heat death or big chill.
The big bang theory presupposes that the ultramicroscopic-sized observable universe at a very early time had an extremely large curvature, but most cosmologists believe that the universe has flattened out and now no longer has any significant spatial curvature on the largest scale of billions of light years. Also, astronomical observations reveal that the current distribution of matter in the universe tends towards uniformity as the scale increases, so its initial curvature is fading away. At these very large scales, the material in our space is homogeneous and isotropic. That is, no matter where in the observable universe you are located and what direction you are looking, you will see at large distances about the same overall temperature, the same overall density, and the same lumpy structure of dense super-clustered galaxies separated by hollow voids.
[Figure: The evolution of the observable universe since the big bang, with time increasing to the right and two of the three spatial dimensions displayed, space increasing both up and down and in and out of the picture. Attribution: NASA/WMAP Science Team.]
The term big bang does not have a precise definition. It does not always refer to a single, first event; rather, it more often refers to a brief duration of early events as the universe underwent a rapid expansion. In fact, the idea of a first event is primarily a product of accepting the theory of relativity, which is known to fail in the limit as the universe’s volume approaches zero. Actually, the big bang theory itself is not a specific theory, but rather a framework for more specific big bang theories.
Astronomers on Earth detect microwave radiation arriving in all directions. It is a fossil record of the cooled-down heat from the big bang. More specifically, it is electromagnetic radiation produced about 380,000 years after the big bang when the universe suddenly turned transparent for the first time. At the time of first transparency, the universe was roughly one 36,000th of its current age and roughly one 1,100th of its present linear size, and it had cooled down to 3,000 degrees Kelvin, which was finally cool enough to form atoms and to allow photons for the first time to move freely without being immediately reabsorbed by neighboring particles. This primordial electromagnetic radiation has now reached Earth as the universe’s most ancient light. Because of space’s expansion during the light’s travel to Earth, the ancient light has cooled and dimmed, and its wavelength has increased and become microwave radiation with a corresponding temperature of only 2.73 degrees above absolute zero. The microwave’s wavelength is about two millimeters, which is small compared to the roughly 120-millimeter wavelength of the microwaves in kitchen ovens. Measuring this incoming Cosmic Microwave Background (CMB) radiation reveals it to be extremely uniform in all directions in the sky (provided you are not moving relative to it).
Extremely uniform, but not perfectly uniform. CMB radiation varies very slightly with the angle from which it is viewed. Any two directions differ by about one part in 100,000, which is a few hundred-thousandths of a degree of temperature. These small temperature fluctuations of the currently arriving microwave radiation were caused by fluctuations in the density of the matter of the early plasma, and so they are probably the origin of what later became today’s galaxies with the dark voids between them: the high-density regions contract under the pull of gravity and can eventually collapse into stars, galaxies, and clusters of galaxies, while the low-density regions thereby become even less dense. The theory of cosmic inflation, discussed below, is one way to explain this pattern of galaxies and clusters of galaxies that we see today.
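The cooling figures given above can be checked with a line or two of arithmetic. Radiation temperature falls in inverse proportion to the stretching of space, so comparing the temperature at first transparency with today’s CMB temperature gives the factor by which wavelengths have stretched since then. Here is a minimal sketch (the scaling rule is standard cosmology; the rounding is mine):

```python
# Radiation temperature scales as T ∝ 1/(1 + z), where (1 + z) is the factor
# by which space has stretched since the light was emitted.
T_emission = 3000.0    # kelvin, at first transparency (figure from the text)
T_today = 2.73         # kelvin, measured CMB temperature today

stretch = T_emission / T_today       # the wavelength-stretch factor, 1 + z
print(f"Wavelengths have stretched by a factor of about {stretch:.0f}")   # about 1100
```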
After the early rapid expansion ended, the universe’s expansion rate became constant and comparatively low for billions of years. This rate is now accelerating slightly because there is another source of expansion—the repulsion of dark energy. The influence of dark energy was insignificant for billions of years, but its key feature is that it does not dilute as space undergoes expansion. So, finally, after about seven or eight billion years of expansion following the big bang, the dark energy became an influential factor and started to significantly accelerate the expansion. Its influence is becoming more and more significant. For example, the diameter of today’s observable universe will double in about 10 billion years; see the sketch below. This influence from dark energy is shown in the above diagram by the presence of the curvature that occurs just below and before the abbreviation “etc.” Future curvature will be much greater. Most cosmologists believe this dark energy is the energy of space itself, and they call it “vacuum energy.”
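The ten-billion-year doubling figure can be roughly verified. If dark energy holds the late-time expansion rate nearly constant, distances grow exponentially and double every ln(2)/H. The sketch below assumes commonly cited parameter values (a Hubble constant of 70 km/s/Mpc and a dark-energy fraction of 0.7), which are not taken from this article:

```python
import math

H0 = 70.0 / 3.086e19        # Hubble constant, 70 km/s/Mpc converted to 1/s
Omega_L = 0.7               # dark energy's assumed fraction of the total density

H_late = H0 * math.sqrt(Omega_L)     # late-time, dark-energy-dominated rate, 1/s
t_double = math.log(2) / H_late      # doubling time for distances, in seconds

print(f"Doubling time: about {t_double / 3.156e16:.0f} billion years")   # about 12
```

The result, roughly 12 billion years, is consistent with the article’s round figure of 10.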
The initial evidence for dark energy came from observations in 1998 of Doppler shifts of supernovas. These observations are called “redshifts,” and they are best explained by the assumption that the average distance between supernovas is increasing at an accelerating rate. Because of this rate increase, any receding galaxy cluster will eventually recede from us faster than the speed of light and thus become causally disconnected from us. In 100 billion years, the Milky Way will be the only galaxy left in the observable universe.
Seen from a distance, the collection of galaxy clusters looks somewhat like a spider web. But the voids are eating the spider web. Observations by astronomers indicate the dark voids are pushing the nearby normal matter away and are growing and now are beginning to rip apart the filaments in the web.
The universe is currently expanding, so every galaxy cluster is, on average, moving a bit away from the others. The influence of the expansion is not currently significant except at the level of galaxy clusters, but the influence is accelerating, and in a few billion years it will rip apart all galaxy superclusters, then later the individual clusters, then galaxies, and someday all solar systems, and ultimately even all configurations of elementary particles, as the universe approaches its “heat death” or “big chill.”
The term “our observable universe” and the synonymous term “our Hubble bubble” refer to everything that some person on Earth could in principle observe. Cosmologists presume that observers located elsewhere could see objects beyond those observable from here on Earth. For that reason, physicists agree that there exist objects that are in the universe but not in our observable universe. Because those unobservable objects are also the product of our big bang, cosmologists assume that they are similar to the objects we on Earth can observe—that those objects form atoms and galaxies, and that time behaves there as it does here. But there is no guarantee that this convenient assumption is correct. Occam’s Razor suggests it is correct, but that is the sole basis for such a claim. So, it is more accurate to say the classical big bang theory implies that the observable universe once was extremely small, dense, hot, and so forth, and not that the entire universe was this way.
Occasionally, someone remarks that the big bang is like a time-reversed black hole. The big bang is not like this because the entropy in a black hole is extremely high, but the entropy of the big bang was extremely low.
Because the big bang happened about 13.8 billion years ago, you might think that no observable object can be more than 13.8 billion light-years from Earth, but this would be a mistake that does not take into account the fact that the universe has been expanding all that time. The relative distance between galaxies is increasing over time. That is why astronomers can see about 45 billion light-years in any direction and not merely 13.8 billion light-years.
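The 45-billion-light-year figure can be reproduced numerically. The distance now to the most distant visible matter is the comoving distance D = (c/H0) ∫₀¹ da / (a² E(a)), with E(a) = √(Ωm/a³ + ΩΛ) in a universe of matter plus dark energy. The following sketch assumes commonly published parameter values rather than anything stated in this article:

```python
import math

Om, OL = 0.3, 0.7            # assumed matter and dark-energy density fractions
hubble_length_gly = 13.97    # c/H0 in billions of light-years, for H0 = 70 km/s/Mpc

def integrand(a):
    # da / (a^2 * E(a)), with E(a) = sqrt(Om/a^3 + OL)
    return 1.0 / (a * a * math.sqrt(Om / a**3 + OL))

# Midpoint-rule integration from a near 0 (the big bang) to a = 1 (today).
n = 100_000
total = sum(integrand((i - 0.5) / n) for i in range(1, n + 1)) / n

print(f"Comoving horizon: about {hubble_length_gly * total:.0f} billion light-years")
```

The result, about 46 billion light-years, matches the figure above to within the precision of the assumed parameters (this simple model also ignores the early radiation-dominated era, which changes the answer only slightly).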
When contemporary physicists speak of the age of our universe and of the time since our big bang, they are implicitly referring to cosmic time measured in the cosmological rest frame. This is time measured in a unique reference frame in which the average motion of all the galaxies is stationary and the Cosmic Microwave Background radiation is as close as possible to being the same in all directions. This frame is not one in which the Earth is stationary. Cosmic time is time measured by a clock that would be sitting as still as possible while the universe expands around it. In cosmic time, t = 0 years is when the big bang began, and t = 13.8 billion years is our present. If you were at rest at the spatial origin in this frame, then the Cosmic Microwave Background radiation on a very large scale would have about the same average temperature in any direction.
The cosmic rest frame is a unique, privileged reference frame for astronomical convenience, but there is no reason to suppose it is otherwise privileged. It is not the frame sought by the A-theorist who believes in a unique present, nor by Isaac Newton who believed in absolute rest, nor by James Clerk Maxwell who believed in an aether at rest that waves whenever a light wave passes through it.
The cosmic frame’s spatial origin point is described as follows:
In fact, it isn’t quite true that the cosmic background heat radiation is completely uniform across the sky. It is very slightly hotter (i.e., more intense) in the direction of the constellation of Leo than at right angles to it…. Although the view from Earth is of a slightly skewed cosmic heat bath, there must exist a motion, a frame of reference, which would make the bath appear exactly the same in every direction. It would in fact seem perfectly uniform from an imaginary spacecraft traveling at 350 km per second in a direction away from Leo (towards Pisces, as it happens)…. We can use this special clock to define a cosmic time…. Fortunately, the Earth is moving at only 350 km per second relative to this hypothetical special clock. This is about 0.1 percent of the speed of light, and the time-dilation factor is only about one part in a million. Thus to an excellent approximation, Earth’s historical time coincides with cosmic time, so we can recount the history of the universe contemporaneously with the history of the Earth, in spite of the relativity of time.
Similar hypothetical clocks could be located everywhere in the universe, in each case in a reference frame where the cosmic background heat radiation looks uniform. Notice I say “hypothetical”; we can imagine the clocks out there, and legions of sentient beings dutifully inspecting them. This set of imaginary observers will agree on a common time scale and a common set of dates for major events in the universe, even though they are moving relative to each other as a result of the general expansion of the universe…. So, cosmic time as measured by this special set of observers constitutes a type of universal time… (Davies 1995, pp. 128-9).
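Davies’ “one part in a million” can be verified directly from the special-relativistic time-dilation factor for a speed of 350 km per second. A minimal check:

```python
v = 350e3       # Earth's speed relative to the cosmic rest frame, m/s
c = 2.998e8     # speed of light, m/s

gamma = 1.0 / (1.0 - (v / c) ** 2) ** 0.5    # time-dilation factor
print(f"Clock rates differ by about {gamma - 1:.1e}")   # ~7e-7, one part in a million
```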
It is a convention that cosmologists agree to use the cosmic time of this special reference frame, but it is an interesting fact and not a convention that our universe is so organized that there is such a useful cosmic time available to be adopted by the cosmologists. Not all physically possible spacetimes obeying the laws of general relativity can have this sort of cosmic time.
History of the Theory
The big bang theory originated with several people, although Edwin Hubble’s very careful observations in 1929 of galaxy recession from Earth were the most influential pieces of evidence in its favor. Noticing that the more distant galaxies are redder than nearby ones, he showed that on average the farther a galaxy is from Earth, the faster it recedes from Earth. (But neither he nor anyone else noticed until the end of the twentieth century that far away galaxies were actually accelerating away from nearby galaxies.) In 1922, the Russian physicist Alexander Friedmann discovered that the general theory of relativity allows an expanding universe. Unfortunately, Einstein reacted to this discovery by saying it was a mere physical possibility and not a feature of the actual universe. He later retracted this claim, thanks in large part to the influence of Hubble’s data. The Belgian physicist Georges Lemaître is another father of the big bang theory. He suggested in 1927 that there is some evidence the universe is expanding, and he defended his claim using previously published measurements of galaxy speeds. He calculated these speeds from the Doppler shifts in their light frequency, as did Hubble.
The big bang theory was very controversial when it was created in the 1920s. At the time and until the 1960s, physicists were unsure whether proposals about cosmic origins were pseudoscientific and thus unfit for discussion in a well-respected physics journal. In the late 1960s, Stephen Hawking and Roger Penrose convinced the professional cosmologists that there must have been a big bang. The theory’s primary competitor during the preceding time was the steady state theory. That theory allows space to expand in volume, but only if this expansion is compensated for by the spontaneous creation of matter, keeping the universe’s overall density constant over time.
In the 2020s, the standard model of the big bang is known as the lambda-CDM model. Lambda is the cosmological constant representing the dark energy that accelerates the expansion, and CDM is cold dark matter.
a. Cosmic Inflation
According to one popular revision of the classical big bang theory, the cosmic inflation theory, the universe was created from quantum fluctuations in an inflaton field; then the field underwent a cosmological phase transition for some unknown reason, causing an exponentially accelerating expansion of space (thereby putting the “bang” in the big bang); and then, for some unknown reason, it stopped inflating very soon after it began. When the inflation ended, the universe continued expanding at a slower, and almost constant, rate. In the earliest period of the inflation, the universe’s temperature was zero and it was empty of particles, but at the end it was extremely hot and flooded with particles that were created from the potential energy of the inflaton field.
By the time that inflation was over, every particle was left in isolation, surrounded by a vast expanse of empty space extending in every direction. And then—only a fraction of a fraction of an instant later—space was once again filled with matter and energy. Our universe got a new start and a second beginning. After a trillionth of a second, all four of the known forces were in place, and behaving much as they do in our world today. And although the temperature and density of our universe were both dropping rapidly during this era, they remained mind-bogglingly high—all of space was at a temperature of 10¹⁵ degrees. Exotic particles like Higgs bosons and top quarks were as common as electrons and photons. Every last corner of space teemed with a dense plasma of quarks and gluons, alongside many other forms of matter and energy. After expanding for another millionth of a second, our universe had cooled down enough to enable quarks and gluons to bind together forming the first protons and neutrons (Dan Hooper, At the Edge of Time, p. 2).
Epistemologically, cosmic inflation is an informed guess. About half the cosmologists do not believe in cosmic inflation. They hope there is another explanation of the phenomena that inflation theory explains.
The virtue of the inflation theory is that it provides an explanation for (i) why there is currently so little curvature of space on large scales (the flatness problem), (ii) why the microwave radiation that arrives on Earth from all directions is so uniform (the cosmic horizon problem), (iii) why there are not point-like magnetic monopoles almost everywhere (called the monopole problem), and (iv) why we have been unable to detect the proton decay that has been predicted (the proton decay problem). It is difficult to solve these problems in any other way than by assuming inflation.
According to the theory of inflation, assuming the big bang began at time t = 0, the epoch of inflation (the epoch of radically repulsive gravity) began at about t = 10⁻³⁶ seconds and lasted until about t = 10⁻³³ seconds, during which time distances increased by a factor of about 10²⁶, and any initial unevenness in the distribution of energy was almost all smoothed out, that is, smoothed out from the large-scale perspective, somewhat in analogy to how blowing up a balloon removes its initial folds and creases so that it looks flat when a small section of it is viewed close up.
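Cosmologists usually express this expansion factor as a number of “e-folds”: the number N such that e^N equals the expansion factor. A one-line check (the 10²⁶ figure is the one given above; the e-fold convention is standard):

```python
import math

expansion_factor = 1e26                  # the distance-expansion factor cited above
N = math.log(expansion_factor)           # number of e-folds, since e^N = 10^26
print(f"Roughly {N:.0f} e-folds of inflation")   # about 60, the commonly quoted figure
```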
Although the universe at the beginning of the inflation was actually much smaller than the size of a proton, to help with understanding the rate of inflation you can think of the universe instead as having been the size of a marble. Then during the inflation period this marble-sized object expanded abruptly to a gigantic sphere whose radius is the distance that now would reach from Earth to the nearest supercluster of galaxies. This would be a spectacular change in something marble-sized.
The speed of this inflationary expansion was much faster than light speed. However, this fast expansion speed does not violate Einstein’s general theory of relativity because that theory places no limits on the speed of expansion of space itself.
At the end of that inflationary epoch at about t = 10⁻³³ seconds or so, the inflation stopped. In more detail, what this means is that the explosive material decayed for some unknown reason and left only normal matter with attractive gravity. Meanwhile, our universe continued to expand, although now at a slow, nearly constant, rate. It went into its “coasting” phase. Regardless of any previous curvature in our universe, by the time the inflationary period ended, the overall structure of space on the largest scales was nearly flat in the sense that it had very little spatial curvature, and its space was extremely homogeneous. Today, we see evidence from the Cosmic Microwave Background that the universe is homogeneous on its largest scale.
But at the very beginning of the inflationary period, there surely were some very tiny imperfections due to the earliest quantum fluctuations in the inflaton field. These quantum imperfections inflated into small perturbations or slightly bumpy regions at the end of the inflationary period. Subsequently, the densest regions attracted more material than the less dense regions, and these dense regions would eventually turn into future galaxies. The less dense regions would eventually evolve into the current dark voids between the galaxies. Those early quantum fluctuations have now left their traces in the very slight hundred-thousandth of a degree differences in the temperature of the cosmic microwave background radiation at different angles as one now looks out into space from Earth with microwave telescopes. In this way, the inflation theory predicts the CMB values that astronomers on Earth see with their microwave telescopes.
Let’s re-describe the process of inflation. Before inflation began, for some as yet unknown reason the universe contained an unstable inflaton field or false vacuum field. For some other, as yet unknown reason, this energetic field expanded and cooled and underwent a spontaneous phase transition (somewhat analogous to what happens when cooling water spontaneously freezes into ice). That phase transition caused the highly repulsive primordial material to hyper-inflate exponentially in volume for a very short time. To re-describe this yet again, during the primeval inflationary epoch, the gravitational field’s stored, negative, repulsive, gravitational energy was rapidly released, and all space wildly expanded. At the end of this early inflationary epoch at about t = 10⁻³³ seconds, the highly repulsive material decayed for some as yet unknown reason into ordinary matter and energy, and the universe’s expansion rate stopped increasing exponentially, and the expansion rate dropped precipitously and became nearly constant. During the inflationary epoch, the entropy continually increased, so the second law of thermodynamics was not violated.
Alan Guth described the inflationary period this way:
There was a period of inflation driven by the repulsive gravity of a peculiar kind of material that filled the early universe. Sometimes I call this material a “false vacuum,” but, in any case, it was a material which in fact had a negative pressure, which is what allows it to behave this way. Negative pressure causes repulsive gravity. Our particle physics tells us that we expect states of negative pressure to exist at very high energies, so we hypothesize that at least a small patch of the early universe contained this peculiar repulsive gravity material which then drove exponential expansion. Eventually, at least locally where we live, that expansion stopped because this peculiar repulsive gravity material is unstable; and it decayed, becoming normal matter with normal attractive gravity. At that time, the dark energy was there, the experts think. It has always been there, but it’s not dominant. It’s a tiny, tiny fraction of the total energy density, so at that stage at the end of inflation the universe just starts coasting outward. It has a tremendous outward thrust from the inflation, which carries it on. So, the expansion continues, and as the expansion happens the ordinary matter thins out. The dark energy, we think, remains approximately constant. If it’s vacuum energy, it remains exactly constant. So, there comes a time later where the energy density of everything else drops to the level of the dark energy, and we think that happened about five or six billion years ago. After that, as the energy density of normal matter continues to thin out, the dark energy [density] remains constant [and] the dark energy starts to dominate; and that’s the phase we are in now. We think about seventy percent or so of the total energy of our universe is dark energy, and that number will continue to increase with time as the normal matter continues to thin out. (World Science U Live Session: Alan Guth, published November 30, 2016 at https://www.youtube.com/watch?v=IWL-sd6PVtM.)
Before about t = 10⁻⁴⁶ seconds, there was a single basic force rather than the four we have now. The four basic forces (or basic interactions) are: the force of gravity, the strong nuclear force, the weak force, and the electromagnetic force. At about t = 10⁻⁴⁶ seconds, the energy density of the primordial field was down to about 10¹⁵ GeV, which allowed spontaneous symmetry breaking (analogous to the spontaneous phase change in which water cools enough to spontaneously change to ice); this phase change created the gravitational force as a separate basic force. The other three forces had not yet appeared as separate forces.
Later, at t = 10⁻¹² seconds, there was even more spontaneous symmetry breaking. First the strong nuclear force, then the weak nuclear force, and finally the electromagnetic force became separate forces. For the first time, the universe now had exactly four separate forces. At t = 10⁻¹⁰ seconds, the Higgs field turned on. This slowed down many kinds of particles by giving them mass so they no longer moved at light speed.
Much of the considerable energy left over at the end of the inflationary period was converted into matter, antimatter, and radiation, such as quarks, antiquarks, and photons. The universe’s temperature escalated with this new radiation; this period is called the period of cosmic reheating. Matter-antimatter pairs of particles combined and annihilated, removing from the universe all the antimatter and almost all the matter. At t = 10⁻⁶ seconds, this matter and radiation had cooled enough that quarks combined together and created protons and neutrons. After t = 3 minutes, the universe had cooled sufficiently to allow these protons and neutrons to start combining strongly to produce hydrogen, deuterium, and helium nuclei. At about t = 379,000 years, the temperature was low enough (around 2,700 degrees C) for these nuclei to capture electrons and to form the initial hydrogen, deuterium, and helium atoms of the universe. With these first atoms coming into existence, the universe became transparent in the sense that short wavelength light (about a millionth of a meter) was now able to travel freely without always being absorbed very soon by surrounding particles. Due to the expansion of the universe since then, this early light’s wavelength expanded and is today invisible on Earth because it is at much longer wavelength than it was 379,000 years ago. That radiation is now detected on Earth as having a wavelength of 1.9 millimeters, and it is called the cosmic microwave background radiation or CMB. That energy is continually arriving at the Earth’s surface from all directions. It is almost homogeneous and almost isotropic.
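The 1.9-millimeter figure corresponds to the peak of a blackbody spectrum at 2.73 K when the peak is expressed in frequency. A minimal check using Wien’s displacement law in its frequency form (about 58.79 GHz per kelvin, a standard constant, not a figure from this article):

```python
T = 2.73                      # CMB temperature, kelvin
nu_peak = 58.79e9 * T         # peak frequency in hertz (Wien's law, frequency form)
wavelength_mm = 2.998e8 / nu_peak * 1000

print(f"Spectrum peaks near {wavelength_mm:.1f} mm")   # about 1.9 mm
```

(Wien’s law in its wavelength form gives a different conventional peak, near 1.06 mm; both conventions describe the same spectrum.)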
As the universe expands, the CMB radiation loses energy; but this energy is not lost from the universe, nor is the law of conservation of energy violated. There is conservation because the same amount of energy goes into expanding the space.
In the literature in both physics and philosophy, descriptions of the big bang often speak of it as if it were the first event, but the big bang theory does not require there to be a first event, an event that had no prior event. Any description mentioning the first event is a philosophical position, not something demanded by the scientific evidence. Physicists James Hartle and Stephen Hawking once suggested that looking back to the big bang is just like following the positive real numbers back to ever-smaller positive numbers without ever reaching the smallest positive one. There isn’t a smallest positive number. If Hartle and Hawking are correct that time is strictly analogous to this, then the big bang had no beginning point event, no initial time.
The classical big bang theory is based on the assumption that the universal expansion of clusters of galaxies can be projected all the way back to a singularity, to a zero volume at t = 0. The assumption is faulty. Physicists now agree that the projection to a smaller volume must become untrustworthy for any times less than the Planck time. If a theory of quantum gravity ever gets confirmed, it is expected to provide more reliable information about the Planck epoch from t = 0 to the Planck time, and it may even allow physicists to answer the questions, “What caused the big bang?” and “Did anything happen before then?”
History of the Theory
The original theory of inflationary expansion (without eternal inflation and many universes) was created by Alan Guth, along with Andrei Linde, Paul Steinhardt, Alexei Starobinsky, and others in the period 1979-1982. It saved the big bang theory from refutation because it explained many observed facts with which the classical big bang theory otherwise conflicts.
The theory of primordial cosmic strings has been the major competitor to the theory of cosmic inflation, but the above problems labeled (i), (ii), (iii), and (iv) are more difficult to solve with strings and without inflation, and the anisotropies of the Cosmic Microwave Background (CMB) radiation are very difficult to make consistent with primordial cosmic strings but not with cosmic inflation. The theory of inflation is accepted by a great many members of the community of professional cosmologists, but it is not as firmly accepted as is the big bang theory. Princeton cosmologist Paul Steinhardt and Neil Turok of the Perimeter Institute are two of inflation’s noteworthy opponents, although Steinhardt once made important contributions to the creation of inflation theory. One of their major complaints is that at the time of the big bang, there should have been a great many long-wavelength gravitational waves created, and today we have the technology that should have detected these waves, but we find no evidence for them. Steinhardt recommends replacing inflation theory with a revised big bounce theory.
For a short lecture by Guth on these topics that is designed for students, see https://www.youtube.com/watch?v=ANCN7vr9FVk.
b. Eternal Inflation and Many Universes
Although there is no consensus among physicists about whether there is more than one universe, many of the big bang inflationary theories are theories of eternal inflation, of the eternal creation of more big bangs and thus more universes. The theory is called the theory of chaotic inflation, the theory of the inflationary multiverse, the Multiverse Theory, and the Many-Worlds theory (although these worlds are different from the worlds of Hugh Everett’s theory). The key idea is that once inflation gets started it cannot easily be turned off.
The inflaton field is the fuel of our big bang and of all of the other big bangs. Advocates of eternal inflation say that not all the inflaton fuel is used up in producing just one big bang, so the remaining fuel is available to create other big bangs, at an exponentially increasing rate because the inflaton fuel increases exponentially faster than it gets used. Presumably, there is no reason why this process should ever end, so there will be a potentially infinite number of universes in the multiverse. Also, there is no good reason to suppose our actual universe was the first one, although technically whether one big bang occurred before or after another is not well defined.
A helpful mental image here is to think of the multiverse as a large, expanding space filled with bubbles of all sizes, all of which are growing. Each bubble is its own universe, and each might have its own physical constants, its own number of dimensions, even some laws of physics different from ours. In some of these universes, there may be no time at all. Regardless of whether a single bubble universe is inflating or no longer inflating, the space between the bubbles is inflating and more bubbles are being born at an exponentially increasing rate. Because the space between bubbles is inflating, nearby bubbles are quickly hurled apart. That implies there is a low probability that our bubble universe contains any empirical evidence of having interacted with a nearby bubble.
After any single big bang, eventually the hyper-inflation ends within that universe. We say its bit of inflaton fuel has been used up. However, after the hyper-inflation ends, the expansion within that universe does not. Our own expanding bubble was produced by our big bang 13.8 billion years ago. It is called the Hubble Bubble.
Even if our Hubble Bubble has a finite volume, unobservable space in our universe might be infinite, and if so then there probably are an infinite number of universes among all the bubbles.
The inflationary multiverse is not the quantum multiverse predicted by the many-worlds theory. The many-worlds theory says every possible outcome of a quantum measurement persists in a newly created world, a parallel universe. If you turn left when you could have turned right, then two universes are instantly created, one in which you turned left, and a different one in which you turned right. A key feature of both the inflationary multiverse and the quantum multiverse is that the wave function does not collapse when a measurement occurs. Unfortunately both theories are called the multiverse theory as well as the many-worlds theory, so a reader needs to be alert to the use of the term. The Everettian Theory is the theory of the quantum multiverse but not of the inflationary multiverse.
The theory of eternal inflation with a multiverse was created by Linde in 1983 by building on some influential work by Gott and Vilenkin. The multiplicity of universes of the inflationary multiverse also is called parallel worlds, many worlds, alternative universes, alternate worlds, and branching universes—many names denoting the same thing. Each universe of the multiverse normally is required to use some of the same physics (there is no agreement on how much) and all the same mathematics. This restriction is not required by a logically possible universe of the sort proposed by the philosopher David Lewis.
Normally, philosophers of science say that what makes a theory scientific is not that it can be falsified (as the philosopher Karl Popper proposed), but rather is that there can be experimental evidence for it or against it. Because it is so difficult to design experiments that would provide evidence for or against the multiverse theories, many physicists complain that their fellow physicists who are developing these theories are doing technical metaphysical conjecture, not physics. However, the response from defenders of multiverse research is usually that they can imagine someday, perhaps in future centuries, running crucial experiments, and, besides, the term physics is best defined as being whatever physicists do professionally.
5. Infinite Time
Is time infinitely divisible? Yes, because general relativity theory and quantum theory require time to be a continuum. But this answer will change to “no” if these theories are eventually replaced by a Core Theory that quantizes time. “Although there have been suggestions by some of the best physicists that spacetime may have a discrete structure,” Stephen Hawking said in 1996, “I see no reason to abandon the continuum theories that have been so successful.” Twenty-five years later, the physics community had become much less sure that Hawking was correct.
Did time begin at the big bang, or was there a finite or infinite time period before our big bang? The answer is unknown. There are many theories that imply an answer to the question, but the major obstacle in choosing among them is that the theories cannot be tested practically.
Will time exist infinitely many years from now? The most popular answer is “yes,” but physicists are not sure. Stephen Hawking and James Hartle said the difficulty of knowing whether the past and future are infinite in duration turns on our ignorance of whether the universe’s positive energy is exactly canceled out by its negative energy. All the energy of gravitation and spacetime curvature is negative energy. Hawking said in 2018:
When the Big Bang produced a massive amount of positive energy, it simultaneously produced the same amount of negative energy. In this way, the positive and the negative add up to zero, always. It’s another law of nature. So, where is all this negative energy today? It’s … in space. This may sound odd, …space itself is a vast store of negative energy. Enough to ensure that everything adds up to zero.
A short answer to the question “Why is the energy of gravitation negative and not positive?” is that this negative energy is needed if the law of conservation of energy is going to be true or approximately true. The long answer says to consider a universe containing only a ball above the surface of Earth. It has gravitational potential energy because of its position in the Earth’s gravitational field—the higher, the more energy. The quantitative value of this gravitational potential energy depends on where you set your zero point in the coordinate system you choose, that is, the point where the potential energy is zero. Customarily this is chosen to be at an infinite distance away from Earth (and away from any other objects if they were to be added into our toy universe). Let go of the ball, and it will fall toward the Earth. As gravitational potential energy of position is converted to kinetic energy of motion during the fall of the ball toward Earth, the sum of the two energies remains constant. Because the potential energy is zero infinitely far away and decreases as the ball approaches Earth, the ball’s potential energy is negative throughout the fall, and it becomes even more negative as the ball nears Earth. An analogous but more complicated argument applies to a large system, such as all the objects of the universe. We would not want to make the zero point for potential energy have anything to do with the Earth if we are making the calculations for all the universe, thus the choice of zero at an infinite distance away from Earth. One assumption in this argument is that what is physically real is not the numerical value of energy but the value of energy differences.
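The convention can be made concrete with a toy calculation. With the zero point at infinite distance, the gravitational potential energy -GMm/r is negative everywhere closer in, and it becomes more negative as the ball falls. A minimal sketch, assuming standard values for Earth:

```python
G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24      # Earth's mass, kg
m = 1.0           # a 1 kg ball
R = 6.371e6       # Earth's radius, m

def potential_energy(r):
    # Zero at r = infinity; negative for any finite distance r from Earth's center.
    return -G * M * m / r

print(potential_energy(R + 100.0))   # ball 100 m up: about -6.256e7 joules
print(potential_energy(R))           # at the surface: slightly more negative
```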
If the total of the universe’s energy is either negative or positive and never zero (and if quantum mechanics is to be trusted, including its law of conservation of energy), then time is infinite in the past and future. Here is the argument for this conclusion. The law of conservation of energy implies energy can change forms, but if the total were ever to be non-zero, then the total energy could never become exactly zero (nor ever have been exactly zero) because that would violate the law of conservation of energy. So, if the total of the universe’s energy is non-zero, then there always have been states whose total energy is non-zero, and there always will be states of non-zero energy. That implies there can be no first instant or last instant and thus that time is eternal.
There is no solid evidence that the total energy of the universe is non-zero, but a slim majority of the experts favor a non-zero total, although their confidence in this is not strong. Assuming there is a non-zero total, there is no favored theory of the universe’s past, but there is a favored theory of the future—the big chill theory. The big chill theory implies the universe just keeps getting chillier forever as space expands and gets more dilute, and so there always will be changes and thus new events produced from old events and time is potentially infinite in the future.
Here are more details of the big chill theory. 95% of all stars that ever will be born have already been born. The last star will burn out in 10¹⁵ years. Then all the stars and dust within each galaxy will fall into black holes. Then the material between galaxies will fall into black holes as well, and finally in about 10¹⁰⁰ years all the black holes will evaporate, leaving only a soup of elementary particles that gets less dense and therefore “chillier” as the universe’s expansion continues. The microwave background radiation will continue to red shift more and more into longer wavelength radio waves. Future space will expand toward thermodynamic equilibrium. But because of vacuum energy, the temperature will only approach, but never quite reach, zero on the Kelvin scale. Thus the universe descends into a “big chill,” having the same amount of total energy it always has had.
Here is some final commentary about the end of time:
In classical general relativity, the big bang is the beginning of spacetime; in quantum general relativity—whatever that may be, since nobody has a complete formulation of such a theory as yet—we don’t know whether the universe has a beginning or not.
There are two possibilities: one where the universe is eternal, one where it had a beginning. That’s because the Schrödinger equation of quantum mechanics turns out to have two very different kinds of solutions, corresponding to two different kinds of universe.
One possibility is that time is fundamental, and the universe changes as time passes. In that case, the Schrödinger equation is unequivocal: time is infinite. If the universe truly evolves, it always has been evolving and always will evolve. There is no starting and stopping. There may have been a moment that looks like our big bang, but it would have only been a temporary phase, and there would be more universe that was there even before the event.
The other possibility is that time is not truly fundamental, but rather emergent. Then, the universe can have a beginning. …And if that’s true, then there’s no problem at all with there being a first moment in time. The whole idea of “time” is just an approximation anyway (Carroll 2016, 197-8).
Back to the main “Time” article for references and citations.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
Epistemic Modality
Epistemic modality is the kind of necessity and possibility that is determined by epistemic constraints. A modal claim is a claim about how things could be or must be given some constraints, such as the rules of logic (logical modality), moral obligations (deontic modality), or the laws of nature (nomic modality). A modal claim is epistemic when these constraints are epistemic in nature, meaning roughly that they are related to knowledge, justification, or rationality. An epistemic possibility is something that may be true, given the relevant epistemic constraints (for example, “Given what we know about the weather, it might rain tomorrow”), while an epistemic necessity is something that must be true given the relevant epistemic constraints (for example, “I don’t see Julie’s car in the parking lot, so she must have gone home”).
The epistemic modal status of a proposition is determined by some body of information, such as an individual or group’s knowledge, a set of data, or the available evidence. A proposition that is not ruled out or eliminated by the information is epistemically possible, whereas a proposition that is in some sense guaranteed by the information is epistemically necessary. As an analogy, consider a detective investigating a crime. Initially, there is little evidence, and so there are many suspects. As more evidence is acquired, suspects are gradually ruled out—it could not have been the butler, since he was in the gazebo at the time of the crime—until only one remains, who must be guilty. Similarly, an epistemic agent may start with limited evidence that leaves open many epistemic possibilities. As the agent acquires more evidence, various possibilities are ruled out until some propositions are epistemically necessary and so must be true.
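The detective analogy can be put in minimal formal terms. Treat a body of information as the set of possibilities (“worlds”) it has not yet ruled out; a proposition is then epistemically possible if it holds in at least one surviving world and epistemically necessary if it holds in all of them. The following sketch is only an illustration of that idea, not a piece of the literature’s formal semantics:

```python
# Worlds are labels; a proposition is the set of worlds in which it is true.
worlds = {"butler", "gardener", "maid"}            # initial suspects
uneliminated = worlds - {"butler"}                 # the butler was in the gazebo

def possible(prop):
    # Epistemically possible: true in some world the information leaves open.
    return bool(prop & uneliminated)

def necessary(prop):
    # Epistemically necessary: true in every world the information leaves open.
    return uneliminated <= prop

print(possible({"gardener"}))              # True: not ruled out
print(necessary({"gardener", "maid"}))     # True: guaranteed by the evidence
print(possible({"butler"}))                # False: eliminated
```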
This article presents the distinctive features of epistemic modality and surveys different answers to the following questions about epistemic modality:
(1) Whose information determines the modal status of a proposition?
(2) How does information determine the modal status of a proposition?
(3) How is epistemic modality related to knowledge?
It concludes with a discussion of alternatives to the standard semantics for epistemic modal language.
An epistemic modal is an epistemic use of a modal term, such as “might”, “necessarily”, or “possible”. On the standard view of epistemic modals, sentences in which these modals are the main operator are used to make epistemic modal claims that attribute an epistemic modal status, either possibility or necessity, to a proposition. For example, (1)-(8) can all be used to make epistemic modal claims:
(1) Maybe it will rain tomorrow.
(2) Terry may not do well on the test.
(3) Perhaps my grandmother is in Venezuela.
(4) The special theory of relativity might be true, and it might be false.
(5) Aristotle might not have been a philosopher.
(6) Given the angle of the blow, the killer must have been over six feet tall.
(7) Sam must be on her way by now.
(8) For all I know, there is no solution.
On this standard view, (1)-(5) can be used to attribute epistemic possibility using the different epistemic modals “maybe”, “may”, “perhaps”, and “might”. (1), for example, attributes epistemic possibility to the proposition that it will rain tomorrow, while (4) attributes epistemic possibility both to the proposition that the special theory of relativity is true and to its negation. (6) and (7), on the other hand, use the epistemic modal “must” to attribute epistemic necessity to the propositions that the killer was over six feet tall and that Sam is on her way, respectively. (8) is also naturally read as expressing an epistemic modal claim attributing epistemic possibility to the proposition that there is no solution, even though no modal term is explicitly used.
The distinguishing characteristic of epistemic modal claims is that their truth is determined by epistemic factors. The epistemic modal status of a proposition is determined by some body of information, and not by logical, metaphysical, or scientific laws. An epistemic possibility is not, for example, some way the world could have been, given the actual laws of physics. Instead, it is a way the world might yet be, given some body of information, such as what we currently know. So, (4), if read as a claim about epistemic possibility, does not assert that the truth and falsehood of the special theory of relativity are both compatible with the laws of physics. It says only that both the truth and falsehood of the theory are individually compatible with some information, such as what the speaker knows. Similarly, an epistemic necessity is not some way the world had to be, given the constraints of logic. Instead, it is a way the world must in fact be, given, for example, what we have discovered about it. So, an utterance of (7) does not assert that some logical contradiction or metaphysical impossibility follows from the assumption that Sam is not on her way. It says only that, given some information, such as what we know about Sam’s schedule, she must in fact be on her way.
As a result, epistemic modal claims are about the actual world in a way that some other modal claims are not. An epistemic possibility is not an alternative way the world might have been had things gone differently, but a way the world might yet turn out to be given the relevant information. (5), if read as expressing metaphysical possibility, is true just in case there is some metaphysically possible world in which Aristotle is not a philosopher. So, it is about the various alternative ways that the world could have been, asserting that at least one of them includes Aristotle not being a philosopher. But (5) is ambiguous and could also be used to make a claim about epistemic possibility: that Aristotle not being a philosopher in this world is left open by the relevant information. If, for example, there were not enough information to determine whether Aristotle had ever done any philosophy in the actual world, it would be epistemically possible that Aristotle was not a philosopher. Unlike the metaphysical possibility claim, this claim is not about an alternative way that the world could have been, but instead about how the past might turn out to have actually been.
Similarly, an epistemic necessity is a way the world must in fact be, but not a way the world had to be—that is, an epistemic necessity might very well not be a metaphysical or logical necessity (and vice versa). The claim that it is metaphysically necessary that 2+2=4, for example, is true just in case there are no metaphysically possible worlds in which the sum of 2 and 2 is something other than 4. So, this claim asserts that, in all possible ways the world could have been, 2+2=4. On the other hand, an epistemic necessity claim made using (6) is true just in case the killer being over six feet tall is in some sense guaranteed by the angle of the blow. This claim is therefore not about how things had to be in all of the various ways the world could have been, but merely about how things must be given our information about how the world in fact is.
Another feature that distinguishes epistemic modality from other kinds of modality is that, because the epistemic modal status of a proposition is determined by epistemic constraints, it can vary over time and between subjects. So, I may truly utter (1) today, but having seen no rain by the end of the day tomorrow, I would have different information and could no longer truly say that rain on that day is possible. Similarly, not knowing my grandmother’s travel itinerary, I could truly say (3), but my cousin who has just found out that our grandmother’s trip to Venezuela was cancelled could not. As a result, it is common to say that a proposition is epistemically possible or necessary for some person or group (for example, “it was possible for Aristotle that the Earth was at the center of our solar system”), meaning it is possible or necessary on that person or group’s information. In contrast, logical, metaphysical, and nomic modalities do not vary across time and between subjects.
a. Modal Puzzles
Distinguishing epistemic modality from other modalities is a key step in solving some philosophical puzzles. Consider Goldbach’s conjecture:
(GC) Every even integer greater than 2 is the sum of two primes.
There is a sense in which (GCT) and (GCF) both seem true:
(GCT) It is possible that Goldbach’s conjecture is true.
(GCF) It is possible that Goldbach’s conjecture is false.
But the truth value of Goldbach’s conjecture, like other mathematical truths, is often considered a matter of necessity: it is either necessarily true or necessarily false. So, if (GCT) and (GCF) are expressions of mathematical possibility, they generate a contradiction. If Goldbach’s conjecture is possibly true, then it is necessarily true, and if it is possibly false, then it is necessarily false. From (GCT) and (GCF), then, it would follow that Goldbach’s conjecture is both necessarily true and necessarily false.
This result can be avoided by distinguishing mathematical possibility from epistemic possibility. Although (GCT) and (GCF) cannot both be true if they are read as claims about mathematical possibility, they can both be true when read as claims about epistemic possibility. According to one view of epistemic possibility, for example, because we do not yet know whether Goldbach’s conjecture is true or false, both options are epistemically possible for us. However, to avoid contradiction, this must not entail that they are both mathematically possible.
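The epistemic situation with Goldbach’s conjecture can be made vivid computationally: the conjecture is easily verified for small cases, yet no finite check settles it for every even number, which is why both (GCT) and (GCF) remain epistemically open for us. A small illustrative check:

```python
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def has_goldbach_pair(n):
    # True if the even number n is a sum of two primes.
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

# Verifying every even number from 4 to 1000 leaves the general claim unsettled.
print(all(has_goldbach_pair(n) for n in range(4, 1001, 2)))   # True
```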
A puzzle involving co-referring names can similarly be solved by appeal to epistemic possibility. According to some views of names (for example, Kripke (1972)), the proposition “Hesperus is identical to Phosphorus” expresses a metaphysically necessary truth, since “Hesperus” and “Phosphorus” are both names for the planet Venus. Nevertheless, a person who has no reason to think that “Hesperus” and “Phosphorus” refer to the same thing could truly say “it is possible that Hesperus is not identical to Phosphorus”. But if it is (metaphysically) necessarily true that Hesperus is identical to Phosphorus, then it cannot be (metaphysically) possible that Hesperus is not identical to Phosphorus. The contradiction can be avoided by understanding “it is possible Hesperus is not identical to Phosphorus” as a statement of epistemic, rather than metaphysical, possibility. It is epistemically possible, relative to the information of someone who does not know what these names refer to, that Hesperus is not identical to Phosphorus, even though it is not metaphysically possible. (See also Modal Illusions.)
These solutions demonstrate two ways in which epistemic possibility is distinct from other kinds of possibility. It is broader in the sense that a logical, mathematical or metaphysical impossibility may be epistemically possible, as in the case of Goldbach’s conjecture or its negation (whichever is false). Similarly, a proposition can be epistemically necessary (for some subject), but not metaphysically, mathematically, or logically necessary. It is not metaphysically necessary that Descartes existed—he could have failed to exist. Nevertheless, it was epistemically necessary for Descartes that he existed—given his information, he must have existed. Epistemic possibility is also narrower in the sense that many logical, mathematical and metaphysical possibilities, such as that no human beings ever existed, are not epistemic possibilities. Epistemic necessity, too, is narrower, in that many logical, mathematical and metaphysical necessities are not epistemically necessary, such as yet-unproven theorems of logic. Because of these differences, epistemic modality “cuts across” logical, mathematical and metaphysical modalities, with the result that facts about epistemic modality cannot be inferred from facts about these other modalities, and vice versa.
2. Whose Information Determines the Epistemic Modal Status of a Proposition?
The epistemic modal status of a proposition is determined by some body of information, but it is not always specified which information is relevant. Phrases like “given the evidence presented today…”, “in view of the information we have…”, and “for all I know…” often indicate the relevant information for a particular epistemic modal claim. However, many sentences used to make epistemic modal claims lack this kind of clear indicator. Claim (1), for example, does not specify for whom or on which body of information it is possible that it will rain tomorrow. Since people have different information about the weather, the proposition that it will rain tomorrow may be possible on the information possessed by some people, but not on the information possessed by others. As a result, a complete theory of epistemic modal claims must have some mechanism for determining whose information is relevant in determining their truth.
a. Context-Dependence
According to some theories, the truth of an epistemic modal claim varies with features of the context in which it is made. Call these “context-dependent theories” of epistemic modal claims. On these views, facts about the context of assertion—that is, the situation in which the epistemic modal claim is spoken, written, or otherwise conveyed—affect the truth of the claim. The simplest kind of context-dependent theory is one in which the relevant information is the information possessed by the speaker:
(Speaker)
“It might be that p” is true as tokened by the speaker S at time t if and only if p is epistemically possible on the information possessed by S at t.
According to (Speaker), whether an epistemic possibility claim expressed by “it might be that p” is true is determined by whether p is epistemically possible on the information possessed by the person who asserts that claim. Thus, the feature of the context that is relevant to determining the claim’s truth value is the information possessed by the speaker. If Paul says, for example, “it might be that God exists”, his claim is true just in case it is epistemically possible for Paul (that is, on the information that he possesses), at the time of his speaking, that God exists.
However, (Speaker) gives counterintuitive results in dialogues about epistemic modality. Suppose, for example, that Katie and Julia are discussing Laura’s whereabouts. Julia knows that Laura left on a flight for Hungary this morning, but Katie does not know that Laura has left. They then have the following discussion:
Katie: Laura might be in the living room.
Julia: No. She can’t be in the living room, because she left for Hungary this morning.
Katie: Oops. I guess I was wrong.
The problem for (Speaker) is that it is epistemically possible on Katie’s information at the beginning of the dialogue that Laura is in the living room. So, according to (Speaker), she speaks truly when she says “Laura might be in the living room”. To some, this seems false—since Julia knows that Laura is on her way to Hungary, Katie’s claim that she might be in the living room cannot be true. Furthermore, Katie seems right to correct herself at the end of the dialogue, but if (Speaker) is true, then this is a mistake. Even though Laura being in the living room is not possible on the information Katie has after talking to Julia, it was possible on her original information at the time that she spoke, which is all that is necessary to make her claim true, according to (Speaker).
This problem can be avoided by expanding the relevant information to include information possessed by people other than the speaker, as in:
(Audience)
“It might be that p” is true as tokened by the subject S at time t if and only if p is epistemically possible on the combined information possessed by S and S’s audience at t.
On this view, because Julia is Katie’s audience, her information is also used in determining whether Katie’s claim is true. Since Julia knows that Laura is on her way to Hungary, it is not possible on her information that Laura is in the living room, making Katie’s initial epistemic possibility claim false and her later self-correction warranted.
In some cases, though, the epistemic modal status of a proposition is evaluated relative to the information of some party other than the speaker or their audience. One appropriate response to the question “Is it true that the universe might continue expanding forever?”, for example, is “I don’t know; only a scientist would know if that’s possible”. But if it is just the information of the speaker and their audience that determines the truth of epistemic possibility claims, this response is bizarre. All it would take to know whether the universe might continue expanding forever is to check the information possessed by those two parties. Here, though, it seems that the possibility of the universe’s continued expansion is being evaluated relative to the information possessed by the scientific community, which the speaker does not have access to. This suggests that, in some contexts, the relevant information is not the information of the speaker or their audience, but the information possessed by the members of some other community. Incorporating this idea gives the following kind of principle:
(Community)
“It might be that p” is true as tokened by the subject S at time t if and only if p is epistemically possible on the information possessed by the members of the relevant community at t.
This view is still context-dependent, as which community is relevant is determined by contextual factors, such as the topic and purpose of the conversation. Note that the relevant community may often include just the speaker, or just the speaker and their audience (as in Katie and Julia’s case). (Community) simply allows that in some contexts the information of the scientific community, for example, determines the truth of the epistemic modal claim.
Other examples suggest that even (Community) is insufficiently flexible to account for all epistemic modal claims, as they are sometimes evaluated relative to information that no one actually possesses. The classic example of this comes from Hacking (1967):
Imagine a salvage crew searching for a ship that sank a long time ago. The mate of the salvage ship works from an old log, makes some mistakes in his calculations, and concludes that the wreck may be in a certain bay. It is possible, he says, that the hulk is in these waters. No one knows anything to the contrary. But in fact, as it turns out later, it simply was not possible for the vessel to be in that bay; more careful examination of the log shows that the boat must have gone down at least thirty miles further south. The mate said something false… but the falsehood did not arise from what anyone actually knew at the time.
This kind of case leads some to include not only the information possessed by the relevant community, but also information that members of that community could acquire through investigation. No one in any community has information that rules out that the hulk is in the bay, so that cannot be why the mate’s claim is false. There is, however, an investigation the mate (or any member of the community) could make, a more careful examination of the log, that would yield information that rules out this possibility. If this is why the mate’s claim is false, then the truth of epistemic modal claims must be determined not only by the information possessed by the relevant community, but also by information that is in some way available to that community. Not just any way of gaining information can count, though, as for almost any false proposition, there will be some possible way of acquiring information that rules it out. Since many false propositions are epistemically possible, then, there must be some restriction on which kinds of investigations matter. There are many options for formulating this restriction, but one option is that it is also determined by context:
(Investigation)
“It might be that p” is true as tokened by the subject S at time t if and only if:
(i) p is epistemically possible on the information possessed by the members of the relevant community at t, and
(ii) there is no relevant way for the members of that community to acquire information on which p is not epistemically possible.
According to (Investigation), which ways of acquiring information can affect the truth of epistemic modal claims is determined by features of the context in just the same way that the community is. Depending on the speaker’s background information, motivations, and so forth, different ways of acquiring information will be relevant. Since the mate has just checked the log and knows that he is basing his judgment on the data in the log, checking the log is a relevant way of acquiring information, and so his claim is false. By contrast, a student could truly say during an exam “the answer might be ‘Gettier’, but I’m not sure”. Although there are ways, such as reading through the textbook, for the student to acquire information that rules out that answer, none of those ways is relevant in the context of an exam. (Investigation) is thus in principle able to account for cases where information that no one has seems to determine the truth of epistemic modal claims.
b. Relativism
According to relativist theories of epistemic modal claims, these claims are true relative to the context in which they are evaluated, rather than to the context in which they are asserted. Whereas context-dependent theories allow for different tokens of the same type of epistemic claim to have different truth values, these relativist views allow for the same token of some epistemic claim to have different truth values when evaluated in different contexts. The primary motivation for this kind of view is that a single token of an epistemic modal claim can be judged true in one context but false in another, and both judgments can seem correct. This happens in eavesdropping cases like the following:
Mara, Ian, and Eliza are playing a game of hide-and-seek. Mara is hiding in the closet while Ian and Eliza are searching for her. Ian and Eliza are discussing where Mara might be, and Ian says “She might be in the kitchen, since we haven’t checked there yet”. Listening from the closet, though, Mara knows that Ian is wrong—she is most definitely not in the kitchen.
The puzzle is that Ian’s reasoning seems perfectly good; any room which they have not checked yet is a room in which Mara might be hiding. Assuming the relevant community does not include Mara, no one in the relevant community has information that rules out that Mara is in the kitchen, and so it is epistemically possible on the community’s information that she is in the kitchen. However, Mara’s assessment that Ian is wrong also seems correct. She could not truthfully say “I know I am not in the kitchen, but Ian is right when he says that I might be in the kitchen”. So, the very same token of “She might be in the kitchen” seems true when evaluated by Ian and Eliza, but false when evaluated by Mara.
To accommodate this sort of intuition, relativists propose that epistemic modal claims are true only relative to the context in which they are assessed, resulting in a view like the following:
(Relativism)
“It might be that p” is true as tokened by the subject S at time t1 and assessed by the agent A at time t2 if and only if p is epistemically possible on the information possessed by A at t2.
According to (Relativism), Ian’s claim that Mara might be in the kitchen is true when assessed by Ian and Eliza, since it is epistemically possible on their information that she is hiding in the kitchen. However, it is not true when assessed by Mara, since it is not epistemically possible on her information that she is in the kitchen; she knows full well that she is in the closet. On this kind of view, the token has no fixed truth value based on the context in which it is asserted; it has a truth value only relative to the context in which it is being assessed. On (Relativism), the feature of the context of assessment that determines this truth value is the information possessed by the person doing the assessing, but other relativist views may include other features such as the assessor’s intentions, the purpose of the assessment, information the assessor could easily obtain, and so forth.
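The structural difference between context-dependent and relativist views can be put schematically. The sketch below borrows the double-bracket notation of formal semantics purely for illustration; $c_u$ is the context of use, $c_a$ is the context of assessment, and $I(c)$ is the body of information determined by a context $c$:

\[ \text{Context-dependence:}\quad \llbracket \text{it might be that } p \rrbracket^{c_u} = \text{True iff } p \text{ is epistemically possible on } I(c_u) \]

\[ \text{Relativism:}\quad \llbracket \text{it might be that } p \rrbracket^{c_u,\,c_a} = \text{True iff } p \text{ is epistemically possible on } I(c_a) \]

On the first schema, a token receives a single truth value, fixed by its context of use; on the second, one and the same token can be true relative to Ian and Eliza’s context of assessment but false relative to Mara’s.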
One way of addressing this puzzle within the confines of a context-dependent theory is to include Mara in the relevant community, such that her knowledge that she is not in the kitchen makes Ian’s claim false, regardless of what Ian and Eliza know. This would allow the proponent of a context-dependent theory to say that Ian’s claim is strictly false, but still conversationally appropriate, as he does not know about the information that makes it false. Generalizing this strategy gives an implausible result, however, as there would be equally good reasons to include in the relevant community anyone who will ever consider a claim of epistemic possibility. As a result, any claim of the form “it might be that p”, where p will at some point be discovered by someone to be false, is false. If we know this, then it is almost always inappropriate to assert that p might be true, since we know that if p is ever discovered to be false, then our assertion will have been false. So, if epistemic possibility claims are commonly appropriate to assert, as they seem to be, there is a reason to doubt the context-dependent account of this case.
3. How Does Information Determine the Epistemic Modal Status of a Proposition?
Theories of epistemic modality also differ in how a proposition must be related to the relevant information in order to have a given epistemic modal status. Even if it is agreed that a proposition is epistemically possible for a subject S just in case it is not ruled out by what S knows, for example, there remains the question of what it takes for S’s knowledge to rule out a proposition.
a. Negation
The simplest view of this relation is that a proposition is possible on a body of information just in case the information does not include the negation of that proposition. If the relevant information is a subject’s knowledge, for example, this yields:
(Negation)
p is epistemically possible for a subject S if and only if S does not know that not-p.
So, if Bozo knows that he is at the circus, then it is not epistemically possible for him that he is not at the circus, whereas if he does not know that the square root of 289 is 17, then it is possible for him that it is not.
A difficulty for this sort of view involves a proposition that is intuitively ruled out by what someone knows, even though that person does not explicitly know the proposition’s negation. Suppose, for example, that Holmes knows that Adler has stolen his pipe. Holmes is perfectly capable of deducing from this that someone stole his pipe, but he has not bothered to do so. So, Holmes has not formed the belief that someone stole his pipe. As a result, he does not know that someone stole the pipe. According to (Negation), then, it is still epistemically possible for Holmes that no one stole the pipe (that is, that it is not the case that someone stole the pipe), even though it is not epistemically possible for Holmes that Adler did not steal the pipe. This is problematic, as knowing that Adler stole the pipe seems sufficient to rule out that no one stole the pipe, as the former obviously entails the falsehood of the latter. So, S’s not knowing not-p is not sufficient for p to be epistemically possible for S.
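The structure of the Holmes case can be set out in a few lines. In this sketch, $K$ abbreviates “Holmes knows that”, $a$ is the proposition that Adler stole the pipe, and $s$ is the proposition that someone stole the pipe:

\[ Ka, \qquad a \vDash s, \qquad \neg Ks \]

\[ \text{By (Negation), with } p = \neg s\text{:}\quad \neg K \neg\neg s \;\Rightarrow\; \Diamond_e\, \neg s \]

Since $\neg\neg s$ is equivalent to $s$, and Holmes does not know $s$, (Negation) counts $\neg s$ (that no one stole the pipe) as epistemically possible for Holmes, even though something he knows, $a$, entails $s$.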
b. Entailment
To accommodate this kind of case, some theories require only that the information include something that entails not-p, such as:
(Entailment)
p is epistemically possible for a subject S if and only if nothing that S knows entails not-p.
This resolves the problem in the Holmes case, as Holmes knows something (that Adler stole the pipe) which entails that someone stole the pipe. So, it is not epistemically possible for Holmes that no one stole the pipe, regardless of whether Holmes has formed the belief that someone stole the pipe.
However, views like (Entailment) face problems involving logically and metaphysically necessary propositions. On the assumption that logically and metaphysically necessary propositions are entailed by any body of information, their negations will be epistemically impossible for any subject on this kind of view. If Goldbach’s conjecture is false, for example, then any subject’s knowledge entails the negation of Goldbach’s conjecture. Nevertheless, it is epistemically possible for many subjects that Goldbach’s conjecture is true. So, S not knowing anything that entails not-p cannot be necessary for p to be epistemically possible for S.
Another potential problem is that requiring the entailment of not-p to rule out p seems to result in too many epistemic possibilities. For example, if the detective knows that fingerprints matching the butler’s were found on the gun that killed the victim, that powder burns were found on the butler’s hands, that reliable witnesses testified that the butler had the only key to the room where the body was found, and that there is surveillance footage that shows the butler committing the murder, this would still be insufficient to rule out the butler’s innocence according to (Entailment), since none of these facts strictly entails that the butler is guilty. Similarly, if the relevant information is not a subject’s knowledge but instead her foundational beliefs and/or experiences, then very few propositions will not be epistemically possible for a given subject. With the exception of necessary truths and propositions about my own mental states, none of the propositions I believe is entailed by my experiences or foundational beliefs (for more, see Fallibilism). As a result, if this information must entail not-p in order to rule out p as an epistemic possibility, nearly all contingent propositions will be epistemically possible for every subject. Because of this, any subject could truly assert “given my evidence, I might be standing on the moon right now”, which is a prima facie problem for this kind of view.
c. Probability
One way of weakening the conditions necessary to rule out a proposition is to analyze epistemic possibility in terms of probability:
(Probability)
p is epistemically possible for a subject S if and only if the probability of p given what S knows is greater than or equal to x, where x is some threshold of probability between 0 and 1.
As long as x is greater than 0, this kind of view allows for p to be ruled out even when S’s knowledge does not entail not-p. If the probability of p given S’s knowledge is greater than 0 but less than x, there is still some chance (given what S knows) that p is true, but it does not follow that p is epistemically possible for S. Note, however, that, for any understanding of probability that obeys the Kolmogorov Axioms, (Probability) will face the same problem with necessary falsehoods that (Entailment) does. On any such understanding, the probability of a logically necessary falsehood is 0, and so no logically and metaphysically necessary falsehoods can be epistemically possible.
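(Probability) can be stated compactly. In the following sketch, $K_S$ stands for the conjunction of what $S$ knows, and the conditional probability is assumed to be defined:

\[ p \text{ is epistemically possible for } S \iff \Pr(p \mid K_S) \geq x, \qquad 0 < x \leq 1 \]

The difficulty with necessary falsehoods then follows directly: any probability function satisfying the Kolmogorov axioms assigns every logical contradiction probability 0 conditional on any body of knowledge, so for such a $p$, $\Pr(p \mid K_S) = 0 < x$, and $p$ can never be epistemically possible.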
d. Dismissing
More complex theories about the “ruling-out relation” include Huemer’s (2007) proposal, which requires, among other things, that S have justification adequate for dismissing p in order for S to rule out p. “Dismissing” is meant to capture particularly strong disbelief in p or “disbelieving [p] and regarding the question as settled” (p. 132). Since, according to Huemer, the degree of justification adequate for dismissing a proposition varies with context, an epistemic possibility claim can be true in one context, but false in another. So, on this view, epistemic modal claims are context-sensitive even when the relevant information is specified, in a way that parallels the context-sensitivity of “know” argued for by epistemic contextualists. For example, on this kind of contextualist view, standards for dismissing may be low in ordinary contexts, but high when confronted with a skeptic. If so, then a subject can truly assert “it is not the case that I might be the victim of an evil demon” in an ordinary context, and also truly assert “I might be the victim of an evil demon” when confronted with a skeptic.
4. How is Epistemic Modality Related to Knowledge?
Since the modality under discussion is epistemic, it is natural to suppose that it is closely related to knowledge. This would account for the common use of “for all I know” and “for all anyone knows” to attribute epistemic possibility, as well as providing a straightforward explanation of what is epistemic about epistemic modality. It would also account for the apparent relevance of what might and must be true to what we know. There are, however, several different accounts of this relation.
a. Knowledge as the Relevant Information
One proposal is that knowledge is the relevant type of information that determines the epistemic modal status of a proposition. Whatever the correct theory of the ruling out relation, then, the following would be true:
p is epistemically possible for a subject S if and only if p is not ruled out by what S knows.
However, there are at least two problems with this sort of view. First, a subject may fail to know something for reasons that are intuitively irrelevant to the modal status of the proposition in question. Let q be a proposition such that if S knew q, then S’s knowledge would rule out p, and suppose that S satisfies every condition for knowledge of q except that S does not believe q. This may be because S lacks a concept required to believe q, has a psychological flaw that prevents her from believing q, or simply has not gotten around to forming the belief that q. These kinds of reasons for not believing q do not seem to affect what is epistemically possible for S, and yet if epistemic possibility is understood in terms of knowledge, they do.
The second problem is that epistemic modal claims are sometimes assessed relative to a body of information that no one actually knows. A computer hard drive, for example, may contain a tremendous amount of data, more than anyone could possibly know. Referencing such a drive, a person could assert “given the information on this drive, system X might contain a planet that would support life”. This is an epistemic modal claim that may be true or false, but its truth value cannot be determined by what anyone knows, since no one knows all of the data on the drive. As a result, the epistemic modal status of a proposition must at least sometimes be determined by a type of information other than knowledge.
An alternative proposal is that epistemic modality is determined by evidence. What this view amounts to depends on one’s theory of evidence. Evidence may include publicly available evidence, such as the data on a drive, the records in a log, or the results of an experiment. Evidence may also be understood as a subject’s personal evidence, consisting of experiences and other mental states (see Evidentialism).
b. Epistemic Modality and Knowledge
Whether or not knowledge is the information that determines epistemic modality, many epistemological views connect knowledge to epistemic modality. In particular, ruling out the epistemic possibility that not-p is claimed by many to be a necessary condition for knowing that p:
(K1)
S knows that p only if not-p is not epistemically possible for S.
Others also accept the potentially stronger claim that knowledge requires ruling out the epistemic possibility of any proposition incompatible with p:
(K2)
S knows that p only if there is no q such that:
(i) q entails not-p, and
(ii) q is epistemically possible for S.
One motivation for this kind of connection between epistemic possibility and knowledge is the idea that epistemic necessity just is knowledge, such that p is epistemically necessary for S just in case S knows that p. On the assumption that epistemic necessity and possibility are duals, in the sense that a proposition is epistemically necessary just in case its negation is not epistemically possible, and vice versa (see Modal Logic), this would entail (K1).
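The derivation of (K1) from these two assumptions is short. In this sketch, $\square_e$ and $\Diamond_e$ are epistemic necessity and possibility relative to $S$’s information, and $K$ abbreviates “$S$ knows that”:

\[ \text{Identity:}\quad \square_e\, p \leftrightarrow Kp \]

\[ \text{Duality:}\quad \square_e\, p \leftrightarrow \neg\Diamond_e \neg p \]

\[ \text{Hence:}\quad Kp \rightarrow \neg\Diamond_e \neg p \quad \text{(which is (K1))} \]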
A second motivation is the intuitive idea that knowledge requires the exhaustion of alternative possibilities, that in order to know that p, one must perform a thorough enough investigation to exclude any alternatives to p. For a detective to know that the butler is guilty, for example, she must rule out all of the other suspects. In doing so, she would rule out every possibility in which the butler was not guilty, thereby satisfying the consequent of (K2). If knowledge requires this kind of elimination of alternatives, then, there is good reason to accept (K2).
The main reason to doubt principles like (K1) and (K2) is their apparent inconsistency with fallibilism about knowledge, the view that some or all of our knowledge has inconclusive justification. If our justification for p is inconclusive, then there is some chance, given that justification, that not-p is true. But this seems to commit us to saying that not-p might be true and is therefore epistemically possible. So, if (K1) is true, then we do not know p after all. Since conclusive justification for our beliefs is very rare, applying this reasoning generally has the implausibly skeptical consequence that we have very little knowledge of the world.
c. Concessive Knowledge Attributions
A related issue is the apparent incoherence of Concessive Knowledge Attributions (“CKAs”) in which a subject claims to know something while admitting the (epistemic) possibility of error. (9), for example, sounds odd:
(9) I know that I own a cat, but I might not own a cat.
Furthermore, (9) seems to be in some way self-defeating—admitting the epistemic possibility that the speaker does not own a cat seems like an admission that she does not in fact know that she owns a cat. (10) has similar problems:
(10) I know that I own a cat, but I might not own any animals.
As long as the speaker knows that all cats are animals, asserting (10) seems problematic in roughly the same way as asserting (9) does. The second conjunct seems to commit the speaker to denying the first. An account of the relationship between epistemic possibility and knowledge must therefore give some explanation of the apparent tension in CKAs like (9) and (10).
The most straightforward account of the oddness of CKAs is that they are self-contradictory and therefore false. If (K1) is true, then sentences of the form “S knows that p” and “not-p is epistemically possible for S” are mutually inconsistent. On this kind of view, (9) seems odd and self-defeating because its conjuncts are inconsistent with each other—if the speaker knows that he owns a cat, then it is not epistemically possible for him that he does not own a cat. If (K2) is true, then sentences of the form “S knows that p” and “q is epistemically possible for S” (where q entails not-p) are also mutually inconsistent. So, on this kind of view (10) seems odd and self-defeating for just the same reason. If the speaker knows that she owns a cat, then according to (K2) no proposition that entails that she does not own a cat is epistemically possible for her. Since not owning an animal entails not owning a cat, then, it cannot be epistemically possible for the speaker that she does not own any animals.
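Letting $c$ be the proposition that the speaker owns a cat and $q$ the proposition that she owns no animals, the two CKAs have the following forms on this account (a schematic rendering):

\[ (9)\colon\ Kc \wedge \Diamond_e \neg c, \qquad \text{inconsistent given (K1)} \]

\[ (10)\colon\ Kc \wedge \Diamond_e\, q, \text{ where } q \vDash \neg c, \qquad \text{inconsistent given (K2)} \]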
On Lewis’s (1996) account, CKAs are not strictly self-contradictory, but they can never be truly asserted. For Lewis, “S knows that p” is true just in case S’s evidence rules out all not-p possibilities that are not properly ignored. Which possibilities are properly ignored varies with the conversational context, such that “S knows that p” may be true in one context, but false in another in which fewer propositions are properly ignored. As a result, there may be contexts in which “S knows that p” and “it might be that q” would be true, even though q entails not-p, so long as q is one of the not-p possibilities that is properly ignored in that context. So, strictly speaking, these sentences are not mutually inconsistent.
However, there are rules that govern when a possibility is properly ignored in a context, one of which is the Rule of Attention. This rule entails that any not-p possibility that is explicitly mentioned is not properly ignored, since it is not ignored at all. Because of this, conjunctions of the form “S knows that p, but it might be that q”, where q is a not-p possibility, cannot be truly asserted. Mentioning the not-p possibility q prevents it from being properly ignored. So, if q is not ruled out by S’s evidence, then “S knows that p” is false. This accounts for the tension in (9) and (10), as mentioning the epistemic possibilities that the speaker does not own a cat or does not own any animals is sufficient to prevent them from being properly ignored. This makes the speaker’s claim that she knows she owns a cat false, which is why these CKAs seem self-defeating, even though they are not strictly inconsistent.
Other views, such as that of Dougherty & Rysiew (2009), hold that CKAs are often true, but conversationally inappropriate to assert. On their view, p is epistemically possible for a subject just in case the subject’s evidence (consisting of her mental states) does not entail not-p. So, nearly all contingent propositions are epistemically possible. Because of this, mentioning that a contingent proposition is epistemically possible would be a strange thing to do in most conversations, akin to noting the obvious truth that there is a metaphysical possibility that one’s beliefs are false. As a result, on this view, asserting that a proposition is epistemically possible pragmatically implicates something more, such as that one has some compelling reason for taking seriously the possibility that not-p, and so one is not confident that p. As a result, CKAs like (9) and (10) are often true, as they assert that the speaker knows that p and does not have entailing evidence for p. However, they are conversationally inappropriate to assert. Unless the speaker has some good reason to suppose that she does not own a cat, it is inappropriate to assert that she might not. However, if she does have such a reason, then she should not say that she knows that she owns a cat, because doing so implicates confidence in and adequate evidence for the proposition that she owns a cat which are incompatible with having that kind of reason.
5. Alternatives to the Standard View of Epistemic Modals
The standard view of epistemic modals introduced in section 1 holds that epistemic modals like “might” and “must” are used to express epistemic modal claims attributing either epistemic possibility or epistemic necessity to propositions. This standard view is committed to two important theses, each of which has been challenged.
First, on the standard view, epistemic modals affect the semantic content of sentences. So, a sentence of the form “it might be that p” differs in its semantic content from simply “p”. For example, an utterance of sentence (2), “Terry may not do well on the test”, does not simply express the proposition that Terry will not do well on the test. Instead, it expresses the epistemic modal claim that the proposition that Terry will not do well on the test is epistemically possible on the relevant information. This difference in meaning yields a difference in truth conditions, such that it is possible for someone to truly assert (2) even if Terry will in fact do well on the test (if, for example, the speaker does not know whether Terry will do well).
Second, sentences containing epistemic modals typically serve to describe the world, in the sense that they describe some proposition as having some particular modal status. Thus, whatever effect epistemic modals have on the meaning of sentences, they typically result in the expression of a descriptive claim about the world that can be evaluated for truth.
a. Embedded Epistemic Modals
The most significant challenge to the standard view is that epistemic modal sentences behave strangely when embedded in other sentences, which the standard view does not predict. As Yalcin (2007) first pointed out, conjunctions including epistemic modals yield unusual results when embedded in other kinds of sentences. Consider a case in which it is raining outside, but you have not looked out of the window. For you, then, it is epistemically possible that it is not raining, even though it is in fact raining. Nevertheless, the conjunction of “it is raining” and “it might not be raining” sounds odd when embedded in certain kinds of sentences. For example, the imperative sentence (11) sounds odd:
(11) Suppose that it is raining and it might not be raining.
(11) seems to give a command that is in some way defective; the conjunction in question cannot be coherently supposed. However, this seemingly cannot be because the conjuncts “it is raining” and “it might not be raining” are logically inconsistent. If they were, then “it might not be raining” would entail “it is not raining”, but the mere epistemic possibility that it is not raining seemingly cannot entail that it is in fact not raining. So, on the standard view, there is no obvious reason that (11) should be defective, as it simply asks you to suppose that two compatible claims are both true.
Similarly, (12) sounds odd:
(12) If it is raining and it might not be raining, then it is raining.
This oddness is unexpected, since, given the usual semantics for conditionals and the standard view of epistemic modals, (12) should be trivially true. Any material conditional of the form “If A and B, then A” should be obviously true, and yet the truth value of (12) is not obvious. This is not because (12) seems false, but because there seems to be something wrong with the antecedent of (12). On the standard view, though, there is no obvious reason that this should be the case. Each conjunct expresses a claim about the world, and whatever claim is expressed by the second conjunct, the consequent clearly follows from the first conjunct alone. In response to these puzzles, Yalcin (2007) develops a semantics according to which sentences like “it is raining and it might not be raining” really are strictly contradictory, but this is not the only way to account for the oddness of sentences like (11) and (12).
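The shape of the puzzle can be put schematically. On the standard view, the embedded conjunction in (11) and (12) expresses something of the form (with $r$ for the proposition that it is raining):

\[ r \wedge \Diamond_e\, \neg r \]

This conjunction is consistent on the standard view: the relevant information may fail to rule out $\neg r$ even when $r$ is in fact true. The standard view therefore predicts that (11) issues a coherent command and that (12) is trivially true, contrary to how both sentences strike competent speakers. Yalcin’s alternative semantics is designed precisely so that $r \wedge \Diamond_e \neg r$ can never be coherently accepted or supposed.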
b. Hedging
An alternative to the standard view is that the modals of epistemic possibility (“may”, “might”, “perhaps”, and so forth) are used to “hedge” or express reduced confidence about the expressed proposition, rather than to attribute any modal status. As Coates (1983) describes this kind of view: “MAY and MIGHT are the modals of Epistemic Possibility, expressing the speaker’s lack of confidence in the proposition expressed” (p. 131). On this kind of view, epistemic modals do not affect the semantic content of a sentence but are instead used to indicate the speaker’s uncertainty about the truth of that content. For example, on Schnieder’s (2010) view, (2) and (2′) have the same semantic content:
(2) Terry may not do well on the test.
(2′) Terry will not do well on the test.
However, while a speaker who utters (2′) thereby asserts that Terry will not do well on the test, a speaker who utters (2) makes no assertion at all but instead engages in a different kind of speech act that presents that speaker as being uncertain about the proposition that Terry will not do well on the test. Thus, though the two sentences have the same semantic content, the epistemic modal in (2) results in an expression of the speaker’s uncertainty, rather than in an epistemic modal claim that describes a proposition as having some modal status. Views of this kind are therefore incompatible with both theses of the standard view, since epistemic modals do not affect the semantic content of a sentence, and sentences like (2) are not used to describe the world.
Hedging views can offer some account of the oddness of embedded epistemic modals, since on these views the speaker is using “may” or “might” to express uncertainty in situations in which it is inappropriate to do so. If, for example, it is only appropriate to suppose something that could be asserted, then (11) should sound odd, since “it might not be raining” cannot be used to make an assertion. Thus, they seem to have some advantage over the standard view.
However, a significant objection to the underlying idea that epistemic modals do not affect semantic content, raised by Papafragou (2006), is that adding an epistemic modal to a sentence seems to change its truth conditions in many ordinary cases. Suppose, for example, that my grandmother is on vacation in South America, and I cannot recall her exact itinerary. On a hedging view, (3) and (3′) have the same semantic content, and thus the same truth conditions:
(3) My grandmother might be in Venezuela.
(3′) My grandmother is in Venezuela.
If my grandmother is in fact in Brazil, then (3′) is false. So, if epistemic modals do not affect truth conditions, then (3) must also be false. But since I cannot remember her itinerary, I seem to speak truly when I utter (3). This difference in truth values requires a difference in semantic content, contrary to what hedging views predict. Similarly, if epistemic modals do not affect semantic content, then the proposition expressed by claim (4) (that is, “the special theory of relativity might be true, and it might be false”) would be a contradiction.
This raises two problems. First, intuitively, a speaker could use (4) to assert something true (if, for example, she did not know whether the special theory of relativity was true or false). Second, if (4) is not used to assert anything but instead used to express the speaker’s uncertainty about the semantic content of the sentence, then an utterance of (4) would express uncertainty about the truth value of a contradiction. But, at least in ordinary circumstances, that would be a strange epistemic state for a speaker to express.
Defenders of hedging views have options for responding to these objections, however. Perhaps, for example, my utterance of (3) seems appropriate when my grandmother is in Brazil not because it is true, but because it is sincere—I am presenting myself as being uncertain that my grandmother is not in Venezuela, and in fact I am uncertain of that proposition. This would explain why an utterance of (3) can be intuitively correct in some sense, even though the only proposition expressed in that utterance is false. And perhaps (4) is not an expression of uncertainty about a contradiction but instead a combination of two different speech acts: one expressing uncertainty that the special theory of relativity is true and another expressing uncertainty that it is false.
c. Other Views of Epistemic Modals
Another alternative to the standard view is to accept the first thesis that epistemic modals affect semantic content but deny the second thesis that they are used descriptively to attribute some epistemic modal status to a proposition. For example, a view of this kind is considered in Yalcin (2011): “To say that a proposition is possible, or that it might be the case, is to express the compatibility of the proposition with one’s state of mind, with the intention of engendering coordination on this property with one’s interlocutor” (p. 312). On this view, (3) and (3′) do not have the same truth conditions, because (3) does not have truth conditions at all—it does not describe the world as being any particular way and so does not attribute any modal status to any proposition. Similarly, Willer’s (2013) dynamic semantics for epistemic modals does not assign truth conditions to epistemic modal claims but instead assigns them relations between information states, such that uttering (3) aims to change mere possibilities that are compatible with an agent’s evidence into “live” possibilities that the agent takes seriously in inquiry. On Swanson’s (2016) view, the content of an epistemic modal sentence is not a proposition but a constraint on credences, such that a speaker uttering (3) thereby advises their audience to adopt a set of credences that does not rule out or overlook the possibility that the speaker’s grandmother is in Venezuela. On all of these views, epistemic modals affect the semantic content of the sentences in which they occur, but the resulting contents are not propositions with truth values. Thus, though they each handle embedded epistemic modals differently, none of these views is committed to the same seemingly implausible verdicts about sentences like (11) and (12) as the standard view.
6. References and Further Reading
Barnett, D. 2009. Yalcin on ‘Might’. Mind 118: 771-75.
A proposed solution to the embedding problem for epistemic modals.
Coates, J. 1983. The Semantics of Modal Auxiliaries. London: Croom Helm.
An account of the semantics of modals in English, including a discussion of hedging with epistemic modals.
DeRose, K. 1991. Epistemic Possibilities. The Philosophical Review 100: 581-605.
An overview of context-dependent accounts of epistemic modals, and a defense of (Investigation).
DeRose, K. 1998. Simple ‘Might’s, Indicative Possibilities and the Open Future. The Philosophical Quarterly 48: 67-82.
An argument that simple “might” and “possible” sentences are used to make epistemic modal claims.
Dougherty, T. and P. Rysiew. 2009. Fallibilism, Epistemic Possibility, and Concessive Knowledge Attributions. Philosophy and Phenomenological Research 78: 123-32.
Arguments for evidence being the relevant type of information, entailment being the relevant relation, and concessive knowledge attributions being typically true, but pragmatically inappropriate.
Egan, A. 2007. Epistemic Modals, Relativism, and Assertion. Philosophical Studies 133: 1-22.
A defense of relativism about epistemic modals, as well as a discussion of an objection to relativism based on the role of assertions.
Egan, A., J. Hawthorne, and B. Weatherson. 2005. Epistemic Modals in Context. In Contextualism in Philosophy, eds. G. Peter and P. Preyer. Oxford: Oxford University Press, 131-70.
An extended discussion of contextualism and a defense of relativism.
Hacking, I. 1967. Possibility. The Philosophical Review 76: 143-68.
On the salvage case as motivation for a view like (Investigation).
Hawthorne, J. 2004. Knowledge and Lotteries. Oxford: Oxford University Press.
On epistemic possibility and its relation to knowledge.
Hintikka, J. 1962. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Ithaca: Cornell University Press.
Development of a logic of epistemic modality, using a knowledge-based account of epistemic possibility.
Huemer, M. 2007. Epistemic Possibility. Synthese 156: 119-42.
Overview of problems for several different accounts of epistemic possibility, concluding in a defense of the dismissing view (section 3d).
Kripke, S. 1972. Naming and Necessity. Cambridge, MA: Harvard University Press.
Distinguishes epistemic from metaphysical modality using the Hesperus/Phosphorous example.
Lewis, D. 1996. Elusive Knowledge. Australasian Journal of Philosophy 74: 549-67.
Motivation for (K2), as well as Lewis’s account of CKAs.
MacFarlane, J. 2011. Epistemic Modals Are Assessment-Sensitive. In Epistemic Modality, eds. A. Egan and B. Weatherson. New York: Oxford University Press, 144-79.
Problems for context-dependent theories and a defense of relativism.
Moore, G. E. 1962. Commonplace Book, 1919-1953. New York: Macmillan.
Notes on knowledge, epistemic possibility, and the standard view of epistemic modals.
Papafragou, A. 2006. Epistemic Modality and Truth Conditions. Lingua 116: 1688-702.
An explanation and critique of hedging views of epistemic modals.
Schnieder, B. 2010. Expressivism Concerning Epistemic Modals. The Philosophical Quarterly 60: 601-15.
An explanation and defense of a hedging view of epistemic modals.
Stanley, J. 2005. Fallibilism and Concessive Knowledge Attributions. Analysis 65: 126-31.
An argument that CKAs are self-contradictory and that something like (K1) holds.
Swanson, E. 2016. The Application of Constraint Semantics to the Language of Subjective Uncertainty. Journal of Philosophical Logic 45: 121-46.
An alternative to the standard view of epistemic modals formulated in terms of constraints on an agent’s credences.
Teller, P. 1972. Epistemic Possibility. Philosophia 2: 303-20.
Overview of some accounts of epistemic possibility, including the problem of necessary truths for entailment and probability views.
von Fintel, K. and A. Gillies. 2008. CIA Leaks. The Philosophical Review 117: 77-98.
A detailed overview of the motivations for context-dependence and relativism about epistemic modal claims.
Willer, M. 2013. Dynamics of Epistemic Modality. The Philosophical Review 122: 45-92.
A dynamic semantics for epistemic modals that rejects the standard view.
Wright, C. 2007. New Age Relativism and Epistemic Possibility: The Question of Evidence. Philosophical Issues 17: 262-83.
A series of objections to relativism and concerns about the motivations for it.
Yalcin, S. 2007. Epistemic Modals. Mind 116: 983-1026.
A presentation of the embedding problem for epistemic modals and a semantics designed to solve it.
Yalcin, S. 2011. Nonfactualism about Epistemic Modality. In Epistemic Modality, eds. A. Egan and B. Weatherson. New York: Oxford University Press, 295-333.
Arguments against the standard view of epistemic modals and development of a nonfactualist account.
Author Information
Brandon Carey
Email: brandon.carey@csus.edu
California State University, Sacramento
U. S. A.
Locke: Epistemology
John Locke (1632-1704), one of the founders of British Empiricism, is famous for insisting that all our ideas come from experience and for emphasizing the need for empirical evidence. He develops his empiricist epistemology in An Essay Concerning Human Understanding, which greatly influenced later empiricists such as George Berkeley and David Hume. In this article, Locke’s Essay is used to explain his criticism of innate knowledge and to explain his empiricist epistemology.
The great divide in Early Modern epistemology is rationalism versus empiricism. The Continental Rationalists believe that we are born with innate ideas or innate knowledge, and they emphasize what we can know through reasoning. By contrast, Locke and other British Empiricists believe that all of our ideas come from experience, and they are more skeptical about what reason can tell us about the world; instead, they think we must rely on experience and empirical observation.
Locke’s empiricism can be seen as a step forward in the development of the modern scientific worldview. Modern science bases its conclusions on empirical observation and always remains open to rejecting or revising a scientific theory based on further observations. Locke would have us do the same. He argues that the only way of learning about the natural world is to rely on experience and, further, that any general conclusions we draw from our limited observations will be uncertain. Although this is commonly understood now, it was not obvious to Locke’s contemporaries. Locke was an enthusiastic supporter of the scientific revolution, and his empiricist epistemology can be seen as part of the same broader movement toward relying on empirical evidence.
Locke’s religious epistemology is also paradigmatic of the ideals of the Enlightenment. The Enlightenment is known as the Age of Reason because of the emphasis on reason and evidence. Locke insists that even religious beliefs should be based on evidence, and he tries to show how religious belief can be supported by evidence. In this way, Locke defends an Enlightenment ideal of rational religion.
The overriding theme of Locke’s epistemology is the need for evidence, and particularly empirical evidence. This article explains Locke’s criticism of innate knowledge and shows how he thinks we can acquire all our knowledge from reasoning and experience.
Many philosophers, including the Continental Rationalists, have thought that we are born with innate ideas and innate knowledge. Locke criticizes the arguments for innate ideas and knowledge, arguing that any innate ideas or knowledge would be universal, but it is obvious from experience that not everyone has these ideas or knowledge. He also offers an alternative explanation, consistent with his empiricism, for how we come to have all our ideas and knowledge. So, he thinks, rationalists fail to prove that we have innate ideas or knowledge.
a. No Innate Ideas
Although Locke holds that all ideas come from experience, many of his contemporaries did not agree.
For example, in the Third Meditation, Descartes argues that the idea of an infinite and perfect God is innate. He argues that we cannot get the idea of an infinite God from our limited experience, and the only possible explanation for how we came to have this idea is that God created us so that we have the innate idea of God already in our minds. Other rationalists make similar arguments for other ideas. Arguments of this form are sometimes called Poverty of Stimulus Arguments, following Noam Chomsky.
Locke has two responses to the Poverty of Stimulus Arguments for innate ideas. First, Locke argues that some people do not even have the ideas that the rationalists claim are innate. For example, some cultures have never heard of the theistic conception of God and so have never formed this kind of idea of God (1.4.8). In reply, some might claim that the idea of God is in the mind even if we are not conscious of that idea. For example, Plato suggests we are born with the idea of equality but become conscious of this idea only after seeing equal things and thus “recollect” the idea; Leibniz suggests innate ideas are “petites perceptions” that are present even though we do not notice them. However, Locke argues that saying an idea is “in the mind” when we are not aware of it is unintelligible. An idea is whatever we are aware of, and so if we are not aware of an idea, then it is not “in the mind” at all (1.2.5).
Second, the Poverty of Stimulus Argument claims that certain ideas cannot come from experience, but Locke explains how a wide variety of our ideas do come from experience. For example, in response to Descartes’ claim that the idea of God cannot come from experience, Locke explains how the idea of God can be derived from experience. First, we get the ideas of knowledge, power, and so forth, by reflecting on ourselves (2.1.4). Second, we can take the idea of having some power and imagine a being that has all power, and we can take the idea of some knowledge and imagine a being that has all knowledge (2.23.33). In this way, we can use our ideas from experience to form an idea of an infinitely powerful and omniscient God. Since he can explain how we got the idea of God from experience, there is little reason to believe Descartes’ claim that the idea of God is innate.
b. Empiricist Theory of Ideas
Locke’s criticism of innate ideas would be incomplete without an alternative explanation for how we get the ideas we have, including the ideas that the rationalists claim are innate. This section, then, describes how Locke thinks we form ideas.
Locke famously says the mind is like a blank piece of paper and that it has ideas only by experience (Essay 2.1.2). There are two kinds of experience: sensation and reflection. Sensation is sense perception of the qualities of external objects. It is by sensation that we receive ideas such as red, cold, hot, sweet, and other “sensible qualities” (2.1.3). Reflection is the perception of “the internal operations of our minds” (2.1.2). Consider, for example, the experience of making a decision. We weigh the pros and cons and then decide to do x instead of y. Making the decision is an act of the mind. But notice that there is something it feels like to deliberate about the options and then decide what to do. That is what Locke means by reflection: it is the experience we have when we notice what is going on in our own minds. By reflection we come to have the ideas of “perception, thinking, doubting, believing, reasoning, knowing, willing, and all the different actings of our own minds” (2.1.4).
The central tenet of Locke’s empiricism is that all of our ideas come from one of these two sources. For many of our ideas, it is obvious how we got them from experience: we got the idea of red from seeing something red, the idea of sweet by tasting something sweet, and so on. But it is less obvious how other ideas come from experience. We can form some new ideas, such as the idea of a unicorn or the idea of a gold mountain, without ever seeing such an object in sense perception. Other abstract concepts, such as the idea of justice, seem like something we cannot observe. For Locke’s empiricism to be plausible, then, he needs to explain how we derived these kinds of ideas from experience.
Locke divides ideas into simple ideas and complex ideas. A simple idea has “one uniform appearance” and “enter[s] by the senses simple and unmixed” (2.2.1), whereas a complex idea is made up of several simple ideas combined together (2.12.1). For example, snow is both white and cold. The color of the snow has one uniform appearance (that is, appearing white), and so white is one simple idea that is included in the complex idea of snow. The coldness of snow is another simple idea included in the idea of snow. The idea of snow is a complex idea, then, because it includes several simple ideas.
Locke claims that all simple ideas come from experience (2.2.2), but we can combine simple ideas in new ways. Having already gained from experience the idea of gold and the idea of a mountain, we can combine these together to form the idea of a gold mountain. This idea depends on our past experience, but we can form the idea of a gold mountain without ever seeing one. According to Locke’s empiricist theory of ideas, then, all of our complex ideas are combinations of simple ideas we gained from experience (2.12.2).
Abstract ideas also depend on experience. Consider first the abstract idea of white. We form the idea of white by observing several white things: we see that milk, chalk, and snow all have the same sensible quality and call that “white” (2.11.9). We form the idea of white by separating the ideas specific to milk (for example, being a liquid, having a certain flavor), or specific to snow (for example, being cold and fluffy), and attending only to what is the same: namely, being white. We can do the same process of abstraction for complex ideas. For example, we can see a blue triangle, a red triangle, and so on. By focusing on what is the same (having three straight sides) and setting aside what is different (for example, the color, the angles) we can form an abstract idea of triangle.
The Poverty of Stimulus argument for innate ideas claims that some of our ideas cannot come from experience and hence they are innate. However, Locke tries to explain how all of our ideas are derived, directly or indirectly, from experience. All simple ideas come from sensation or reflection, and we can then form new complex ideas by combining simple ideas in new ways. We can form abstract ideas by separating out what is specific to the idea of particular objects and retaining what is the same between several different ideas. Although these complex ideas are not always the objects of experience, they still are derived from experience because they depend on simple ideas that we receive from experience. If these explanations are successful, then we have little reason to believe our ideas are innate; instead, Locke concludes, all our ideas depend on experience.
c. No Innate Knowledge
Locke is also an empiricist about knowledge. Yet many philosophers at the time argued that some knowledge is innate. Locke responds to two such arguments: the Argument from Universal Consent and an argument from the Priority Thesis. He argues that neither argument is successful.
i. The Argument from Universal Consent
Many philosophers at the time disagreed with Locke’s empiricism. Some asserted that knowledge of God and knowledge of morality are innate, and others claimed that knowledge of a few basic axioms such as the Law of Identity and the Law of Non-Contradiction is innate. Let p be any proposition of the kind supposed to be innate. One argument, which Locke criticizes, uses universal consent to try to prove that some knowledge is innate:
Argument from Universal Consent:
1. Every person believes that p.
2. If every person believes that p, then knowledge of p is innate.
3. So, knowledge of p is innate.
Locke argues that both premises of the Argument from Universal Consent are false because there is no proposition which every person consents to, and, even if there were universal consent about a proposition, this does not prove that it is innate.
Locke says Premise 1 is false because no proposition is universally believed (1.2.5). For example, if p = “God exists,” then we know not everyone believes in God. And although philosophers sometimes assert that basic logical principles such as the Law of Non-Contradiction are innate, most people have not ever even thought about that principle, much less believed it to be true. We can know from experience, then, that Premise 1 as stated is false. Perhaps Premise 1 can be saved from Locke’s criticism by insisting that everyone who rationally thinks about it will believe the proposition (1.2.6) (Leibniz defends this view). Even if Premise 1 can be revised so that it is not obviously false, Locke still thinks the argument fails because Premise 2 is false.
Premise 2 is false for two reasons. First, Premise 2 would prove too much. Usually, proponents of innate knowledge think there are only a few basic principles that are innately known (1.2.9-10). Locke argues, though, that every rational person who thinks about it would consent to all of the theorems of geometry, and “a million of other such propositions” (1.2.18), and thus Premise 2 would count far too many truths as “innate” knowledge (1.2.13-23). Second, some things are obviously true, and so the fact that everyone believes them does not prove that they are innate (4.7.9). In Book 4, Locke sets out to explain how all of our knowledge comes from reason and experience and, if he is successful, then universal consent by rational adults would not imply that the knowledge is innate (1.2.1). Hence, Premise 2 is false.
Sometimes innate instincts are mistaken for an example of innate knowledge. For example, we are born with a natural desire to eat and drink (2.21.34), and this might be misconstrued as innate knowledge that we should eat and drink. But this natural desire should not be confused with knowledge that we should eat and drink. Traditionally, knowledge has been defined as justified true belief, whereas Locke describes knowledge as a kind of perception of the truth. On either conception, knowledge requires us to be aware of a reason for believing the truth. Locke can grant that a newborn infant has an innate desire for food while denying that the infant knows that it is good to eat food. Innate instincts and other natural capacities, then, are not the same as innate knowledge.
Locke’s criticism of innate knowledge can be put in the form of a dilemma (1.2.5). Either innate knowledge is something which we are aware of at birth or it is something we become aware of only after thinking about it. Locke objects that if we are unaware of innate knowledge, then we can hardly be said to know it. But if we become aware of the “innate” knowledge only after thinking about it, then “innate” knowledge just means that we have the capacity to know it. In that case, though, all knowledge would be innate, which is not typically what the rationalist wants to claim.
ii. The Priority Thesis
Some assert that we have innate knowledge of a few basic logical principles, or “maxims”, and that this is how we are able to come to know other things. Call this the Priority Thesis. Locke criticizes the Priority Thesis and explains how we can attain certain knowledge without it.
Locke and the advocates of the Priority Thesis disagree both about (i) what is known first and (ii) what is known on the basis of other things. According to the Priority Thesis, we first have innate knowledge of general maxims and then we have knowledge of particular things on the basis of these general maxims. Locke disagrees on both counts. He thinks we first have knowledge of particular things, and he denies that this knowledge rests on general maxims.
Some rationalists, such as Plato and Leibniz, hold that knowledge of particulars requires prior knowledge of abstract logical concepts or principles. For example, knowing that “white is white” and “white is not black” is thought to depend on prior knowledge of the general Law of Identity. On this view, we can know that “white is white” only because we recognize it as an instance of the general maxim that every object is identical to itself.
Locke rejects the Priority Thesis. First, he uses children as empirical evidence that people have knowledge of particulars before knowledge of general maxims (1.2.23; 4.7.9). He therefore denies that we need knowledge of maxims before we can have knowledge of particulars, as the Priority Thesis asserts. Second, he thinks he can explain how we get knowledge of particulars without the help of general maxims. For example, “white is white” and “white is not black” are self-evident claims: we need no information beyond the relevant ideas to know that they are true, and once we have those ideas it is immediately obvious that the propositions are true. Locke argues that we cannot be more certain of any general maxim than we are of these obvious truths about particulars, nor would knowing the general maxims “add anything” to our knowledge about them (4.7.9). In short, Locke thinks that knowledge of particulars does not depend on knowledge of general maxims.
Locke argues that a priori knowledge should not be confused with innate knowledge (4.7.9); innate knowledge is knowledge we are born with, whereas a priori knowledge is knowledge we acquire by reflecting on our ideas. For example, we can have a priori knowledge that “white is white.” The idea of white must come from experience. But once we have that idea, we do not need to look at many white things in order to confirm that all the white things are white. Instead, we know by thinking about it that whatever is white must be white. Rationalists, such as Leibniz, sometimes argue that we become fully conscious of innate knowledge only by using a priori reasoning. Again, though, Locke argues that if it is not conscious then it is not something we really know. If we only come to know it when we engage in a priori reasoning, then we should say that we learned it by reasoning rather than positing some unconscious knowledge that was there all along. However, consistent with his empiricism, Locke denies that we can know that objects exist and what their properties are just by a priori reasoning.
Locke rejects innate knowledge. Instead, he thinks we must acquire knowledge from reasoning and experience.
2. Empiricist Account of Knowledge
In Book 4 of the Essay, Locke develops his empiricist account of knowledge. Empiricism emphasizes knowledge from empirical observation, but some knowledge depends only on reflection on ideas received from experience. This section explains the role of reason and empirical observation in Locke’s theory of knowledge.
a. Types of Knowledge
Locke categorizes knowledge in two ways: by what we know and by how we know it. As for what we can know, he says there are “four sorts” of things we can know (4.1.3):
identity or diversity
relation
coexistence or necessary connection
real existence
Knowledge of identity is knowing that something is the same, such as “white is white,” and knowledge of diversity is knowing that something is different, such as “white is not black” (4.1.4). Locke thinks this is the most obvious kind of knowledge and all other knowledge depends on this.
Knowledge of relation seems to be knowledge of necessary relations. He denies we can have universal knowledge of contingent relations (4.3.29). Technically, the categories of identity and necessary connections are both necessary relations, but they are important and distinctive enough to merit their own categories (4.1.7). Knowledge of relation, then, is a catch-all category that includes knowledge of any necessary relation.
Knowledge of coexistence or necessary connection concerns the properties of objects (4.1.6). If we perceive a necessary connection between properties A and B, then we would know that “all A are B.” However, Locke thinks this a priori knowledge of necessary connection is extremely limited: we can know that figure requires extension, and that causing motion by impulse requires solidity, but little else (4.3.14). In general, Locke denies that we can have a priori knowledge of the properties of objects (4.3.14; 4.3.25-26; 4.12.9). Alternatively, if we observe that a particular object x has properties A and B, then we know that A and B “coexist” in x (4.3.14). For example, we can observe that the same piece of gold is yellow and heavy.
Finally, knowledge of real existence is knowledge that an object exists (4.1.7 and 4.11.1). Knowledge of existence includes knowledge of the existence of the self, of God, and of material objects.
Locke also divides knowledge by how we know things (4.2.14):
intuitive knowledge
demonstrative knowledge
sensitive knowledge
Intuitive knowledge comes from an immediate a priori perception of a necessary connection (4.2.1). Demonstrative knowledge is based on a demonstration, the perception of an a priori connection reached by going through multiple steps (4.2.2). For example, the intuitions that “A is B” and “B is C” can be combined into a demonstration to prove that “A is C.” Finally, the sensation of objects provides “sensitive” knowledge (or knowledge from sensation) that those objects exist and have certain properties (4.2.14).
Locke describes intuitive, demonstrative, and sensitive knowledge as “three degrees of knowledge” (4.2.14). Intuitive knowledge is the most certain. It includes only things that the mind immediately sees are true without relying on any other information (4.2.1). The next degree of certainty is demonstrative knowledge, which consists in a chain of intuitively known propositions (4.2.2-6). This is less certain than intuitive knowledge because the truth is not as immediately obvious. Finally, sensitive knowledge is the third degree of knowledge.
There is considerable scholarly disagreement about Locke’s account of sensitive knowledge, or whether it even is knowledge. According to the standard interpretation, Locke thinks that sensitive knowledge is certain, though less certain than demonstrative knowledge. Alternatively, Samuel Rickless argues that, for Locke, sensitive knowledge is not, strictly speaking, knowledge at all. For knowledge requires certainty and Locke makes it clear that sensitive knowledge is less certain than intuitive and demonstrative knowledge. Locke also introduces sensitive knowledge by saying it is less certain and only “passes under the name knowledge” (4.2.14), perhaps implying that it is called “knowledge” even though it is not technically knowledge. However, in favor of the standard interpretation, he does call sensitive knowledge “knowledge” and describes sensitive knowledge as a kind of certainty (4.2.14). This encyclopedia article follows the standard interpretation.
Putting together what we know with how we know it: we can have intuitive knowledge of identity, of some necessary relations, and of our own existence (4.9.1); we can have demonstrative knowledge of some necessary relations (for example, in geometry) and of God’s existence (4.10.1-6); and we can have sensitive knowledge of the existence of material objects and of the coexistence of the properties of those objects (for example, this gold I see is yellow).
b. A Priori Knowledge
Both early modern rationalist and empiricist philosophers accept a priori knowledge. For example, they agree that we can have a priori knowledge of mathematics. Rationalists and empiricists disagree, however, about what we can know a priori. Rationalists tend to think we can discover claims about the nature of reality by a priori reasoning, whereas Locke and the empiricists think that we must instead rely on experience to learn about the natural world. This section explains what kinds of a priori knowledge Locke thinks we can and cannot have.
Locke defines knowledge as the perception of an agreement (or disagreement) between ideas (4.1.2). This definition of knowledge fits naturally, if not exclusively, within an account of a priori knowledge. Such knowledge relies solely on reflection on our ideas; we can know it is true just by thinking about it.
Some a priori knowledge is (what Kant would later call) analytic. For example, knowledge of claims like “gold is gold” does not depend on empirical observation. We immediately and intuitively perceive that it is true. For this reason, Locke calls knowledge of identity “trifling” (4.8.2-3). Less obviously, we can have a priori knowledge that “gold is yellow.” According to Locke’s theory of ideas, the complex idea of gold is composed of the simple ideas of yellow, heavy, and so on. Thus, saying “gold is yellow” gives us no new information about gold and is, therefore, analytic (4.8.4).
We can also have (what Kant would later call) synthetic a priori knowledge. Synthetic propositions are “instructive” (4.8.3) because they give us new information. For example, a person might have the idea of a triangle as a shape with three sides without realizing that a triangle also has interior angles summing to 180 degrees. So, the proposition “a triangle has interior angles equal to 180 degrees” goes beyond the idea to tell us something new about a triangle (4.8.8), and thus is synthetic. Yet it can be proven, as a theorem in geometry, that this proposition is true. Further, the proof is a priori, since it relies only on our idea of a triangle and not on the observation of any particular triangle. So, we can have knowledge of synthetic a priori propositions.
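To see how such a proof relies only on the idea of a triangle, consider the familiar Euclidean derivation (a sketch of the standard proof, not a passage from Locke). Given a triangle with interior angles α, β, and γ at vertices A, B, and C, draw through C a line ℓ parallel to the side AB, and let α′ and β′ be the angles ℓ makes with CA and CB:

$$\alpha' = \alpha, \quad \beta' = \beta \qquad \text{(alternate interior angles, since } \ell \parallel AB\text{)}$$

$$\alpha' + \gamma + \beta' = 180^\circ \qquad \text{(the three angles at } C \text{ form a straight angle)}$$

$$\therefore\; \alpha + \beta + \gamma = 180^\circ.$$

No particular triangle needs to be observed; the conclusion follows from the idea of a triangle together with the parallel postulate, which is why the knowledge counts as a priori even though it is instructive.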
Locke thinks we can have synthetic a priori knowledge of mathematics and morality. As mentioned above, any theorem in geometry will be proven a priori yet the theorem gives us new information. Locke claims we can prove moral truths in the same way (4.3.18 and 4.4.7). Yet, while moral theory is generally done by reflecting on our ideas, few have agreed with Locke that we can have knowledge as certain and precise about morality as we do of mathematics.
However, Locke denies that we can have synthetic a priori knowledge of the properties of material objects (4.3.25-26). Such knowledge would need to be instructive, and so tell us something new about the properties of the object, and yet be discovered without the help of experience. Locke argues, though, the knowledge of the properties of objects depends on observation. For example, if “gold” is defined as a yellow, heavy, fusible material substance, we might wonder whether gold is also malleable. But we do not perceive an a priori connection between this idea of gold and being malleable. Instead, we must go find out whether gold is malleable or not: “Experience here must teach me, what reason cannot” (4.12.9).
c. Sensitive Knowledge
Locke holds that empirical observation can give us knowledge of material objects and their properties, but he denies that we can have any sensitive knowledge (or knowledge from sensation) about material objects that goes beyond our experience.
Sensation gives us sensitive knowledge of the existence of external material objects (4.2.14). However, we can know that x exists while we perceive it, but we cannot know that it continues to exist past the time we observe it: “this knowledge extends as far as the present testimony of the senses…and no farther” (4.11.9). For example, suppose we saw a man five minutes ago. We can be certain he existed while we saw him, but we “cannot be certain, that the same man exists now, since…by a thousand ways he may cease to be, since [we] had the testimony of the senses for his existence” (4.11.9). So, empirical observation can give us knowledge, but such knowledge is strictly limited to what we observe.
Empirical observation can also give us knowledge of the properties of objects (or knowledge of “coexistence”). For example, we learn from experience that the gold we have observed is malleable. We might reasonably conclude from this that all gold, including the gold we have not observed, is malleable. But we might turn out to be wrong about this. We have to rely on experience to find out. So, for this reason, Locke considers judgments that go beyond what we directly observe “probability” rather than certain knowledge (4.3.14; 4.12.10).
Despite Locke’s insistence that we can have knowledge from experience, there are potential problems for his view. First, it is not clear how, if at all, sensitive knowledge is consistent with his general definition of knowledge. Second, Locke holds that we can only ever perceive the ideas of objects, and not the objects themselves, and so this raises questions about whether we can really know if external objects exist.
i. The Interpretive Problem
Locke’s description of sensitive knowledge (or knowledge from sensation) seems to conflict with his definition of knowledge. Some have thought that he is simply inconsistent, while others try to show how sensitive knowledge is consistent with his definition of knowledge.
The problem arises from Locke’s definition of knowledge. He defines knowledge as the perception of an agreement between ideas (4.1.2). But the perception of a connection between ideas appears to be an a priori way of acquiring knowledge rather than knowledge from experience. If we could perceive a connection between the idea of a particular man (for example, John) and existence, then we could know a priori, just by reflecting on our ideas, that the man exists. But this is mistaken. The only way to know if a particular person exists is from experience (see 4.11.1-2, 9). So, perceiving connections between ideas appears to be ill-suited for knowledge of existence.
Yet suppose we do see that John exists. There seems to be only one relevant idea needed for this knowledge: the sensation of John. It seems any other idea is unnecessary to know, on the basis of observation, that John exists. On what other idea, then, does sensitive knowledge depend? This is where interpretations of Locke diverge.
Some interpreters, such as James Gibson, hold that Locke is simply inconsistent. He defines knowledge as the perception of an agreement between two ideas, but sensitive knowledge is not the perception of two ideas; sensitive knowledge consists only in the sensation of an object. The advantage of this interpretation is that it just accepts at face value his definition of knowledge and his description of sensitive knowledge in the Essay. However, perhaps there is a consistent interpretation available. Moreover, Locke elsewhere identifies the two ideas that are supposed to agree, and so he thinks his account of sensitive knowledge fits his general definition.
Other interpreters turn to the passage in which Locke identifies the two ideas that in sensitive knowledge are supposed to agree. Locke explains:
Now the two ideas, that in this case are perceived to agree, and thereby do produce knowledge, are the idea of actual sensation (which is an action whereof I have a clear and distinct idea) and the idea of actual existence of something without me that causes that sensation. (Works v. 4, p. 360)
On one interpretation of this passage, by Lex Newman and others, the idea of the object (the sensation) is perceived to agree with the idea of existence. Locke describes the first idea as “the idea of actual sensation” and, on this interpretation, that means the sensation of the object. The second idea is “the idea of actual existence.” Excluding the parenthetical comment, this is a natural way to interpret the two ideas Locke identifies here. On this view, then, we know an object x exists when we perceive that the sensation of x agrees with the idea of existence.
On a second interpretation of the passage, by Jennifer Nagel and Nathan Rockwood, the idea of the object (the sensation) is perceived to agree with a reflective idea of having a sensation. Here “the idea of actual sensation” is not the sensation of the object; rather, it is a second-order awareness of having a sensation or, in other words, an identification of the sensation as a sensation. This is because Locke describes the first idea as “the idea of actual sensation” and then follows that with the parenthetical comment “(which is an action…)”; an external object is not an action, but having a sensation is an action. So, perhaps the first idea should be taken as an idea of having a sensation. If so, then the second idea is the sensation of the object. This better captures Locke’s description of the first idea as an idea of an action. On this interpretation, then, we know an object x exists when we have the sensation of x and identify that sensation as a sensation.
However the interpretive problem is resolved, there remains a worry that Locke’s view inevitably leads to skepticism.
ii. The Skeptical Problem
The skeptical problem for Locke is that perceiving ideas does not seem like the kind of thing that can give us knowledge of actual objects.
Locke has a representational theory of perception. When we perceive an object, we are immediately aware of the idea of the object rather than the external object itself. The idea is like a mental picture. For example, after seeing a photograph of Locke, we can close our eyes and picture an image of John Locke in our minds. This mental picture is an idea. According to Locke, even if he were right here before our very eyes, we would directly “see” only the idea of Locke rather than Locke himself. However, an idea of an object represents that object. Just as looking at a picture can give us information about the thing it represents, “seeing” the idea of Locke allows us to become indirectly aware of Locke himself.
Locke’s representational theory of perception entails that there is a veil of perception. On this view, there are two things: the object itself and the idea of the object. Locke thinks we can never directly observe the object itself. There is, then, a “veil” between the ideas we are immediately aware of and the objects themselves. This raises questions about whether our sensations of objects really do correspond to external objects.
Berkeley and Thomas Reid, among others, object that Locke’s representational theory of perception inevitably leads to skepticism. Locke admits that, just as a portrait of a man does not guarantee that the man portrayed exists, the idea of a man does not guarantee that the man exists (4.11.1). So, goes the objection, since on Locke’s view we can only perceive the idea, and not the object itself, we can never know for sure the external object really exists.
While others frequently accuse Locke’s view of inevitably leading to skepticism, he is not a skeptic. Locke offers four “concurrent reasons” to believe sensations correspond to external objects (4.11.3). First, we cannot have an idea of something without first having the sensation of it, suggesting that it has an external (rather than an internal) cause (4.11.4). Second, sensations are involuntary. If we are outside in the daylight, we might wish the sun would go away, but it does not. Hence, it appears the sensation of the sun has an external cause (4.11.5). Third, some veridical sensations cause pain in a way that merely dreaming or hallucinating does not (4.11.6). For example, if we are unsure if the sensation of a fire is a dream, we can stick our hand into the fire and “may perhaps be wakened into a certainty greater than [we] could wish, that it is something more than bare imagination” (4.11.8). Fourth, the senses confirm each other: we can often see and feel an object, and in this way the testimony of one sense confirms that of the other (4.11.7). In each case, Locke argues that sensation is caused by an external object and thus the external object exists.
Perhaps the external cause of sensation can provide a way for Locke to escape skepticism. As seen above, he argues that sensation has an external cause. We thus have some reason to believe that our ideas correspond to external objects. Even if there is a veil of perception, then, sensations might nonetheless give us a reason to believe in external objects.
d. The Limits of Knowledge
One of the stated purposes of the Essay is to make clear the boundary between, on the one hand, knowledge and certainty, and, on the other hand, opinion and faith (1.1.2).
For Locke, knowledge requires certainty. As explained above, we can attain certainty by perceiving an a priori necessary connection between ideas (either by intuition or demonstration) or by empirical observation. In each case, Locke thinks, the evidence is sufficient for certainty. For example, we can perceive an a priori necessary connection between the idea of red and the idea of a color, and thus we see that it must be true that “red is a color.” Given the evidence, there is no possibility of error and hence we are certain. More controversially, Locke also thinks that direct empirical observation is sufficient evidence for certainty (4.2.14).
Any belief that falls short of certainty is not knowledge. Suppose we have seen many different ravens in a wide variety of places over a long period of time, and so far, all the observed ravens have been black. Still, we do not perceive an a priori necessary connection between the idea of a raven and the idea of black. It is possible, though perhaps unlikely, that we will discover ravens of a different color in the future. Yet no matter how high the probability is that all ravens are black, we cannot be certain that all ravens are black. So, Locke concludes, we cannot know for sure that all ravens are black (4.3.14). There is a sharp boundary between knowledge and belief that falls short of knowledge: knowledge is certain, whereas other beliefs are not.
Locke describes knowledge as “perception” whereas judgment is a “presumption” (4.14.4). To perceive that p is true guarantees the truth of that proposition. But we can presume the truth of p even if p is false. For example, given that all the ravens we have seen thus far have been black, it would be reasonable for us to presume that “all ravens are black” even though we are not certain this is true. Thus, judgment involves some epistemic risk of being wrong, whereas knowledge requires certainty.
In his account of empirical knowledge, Locke takes the knowledge-judgment distinction to the extreme in two important ways. First, while we can know that an object exists while we observe it, this knowledge does not extend at all beyond what we immediately observe. For example, suppose we see John walk into his office. While we see John, we know that John exists. Ordinarily, if we just saw John walk into his office a mere few seconds ago, we would say that we “know” that John exists and is currently in his office. But Locke does not use the word “know” in that way. He reserves “knowledge” for certainty. And, intuitively, we are more certain that John exists when we see him than when we do not see him any longer. So, the moment John shuts the door, we no longer know he exists. Locke concedes it is overwhelmingly likely that John continues to exist after shutting the door, but “I have not that certainty of it, which we strictly call knowledge; though the likelihood of it puts me past doubt, …this is but probability, not knowledge” (4.11.9). So, we can know something exists only for the time that we observe it.
Second, any scientific claim that goes beyond what is immediately observed cannot be known to be true. We know that observed ravens have all been black. From this we may presume, but cannot know, that unobserved ravens are all black (4.12.9). We know that friction of observed (macroscopic) objects causes heat. By analogy, we might reasonably guess, but cannot know, that friction among unobserved particles heats up the air (4.16.12). In general, experience of observed objects cannot give us knowledge of any unobserved objects. This distinction between knowledge of the observed versus uncertainty (or even skepticism) about the unobserved remains important for contemporary empiricism in the philosophy of science.
Thus, Locke, and empiricists after him, sharply distinguish between the observed and the unobserved. Further, he maintains that we can have knowledge of the observed but never of the unobserved. When our evidence falls short of certainty, Locke holds that probable evidence ought to guide our beliefs.
3. Judgment (Rational Belief)
Locke holds that all rational beliefs require evidence. For knowledge, that evidence must give us certainty: there must not be the possibility of error given the evidence. But beliefs can be rational even if they fall short of certainty, so long as the belief is based on what is probably true given the evidence.
There are two conditions for rational judgment (4.15.5). First, we should believe what is most likely to be true. Second, our confidence should be proportional to the evidence. The degree of probability, given the evidence, depends on our own knowledge and experience and on the testimony of others (4.15.4).
The three kinds of rational judgment that Locke is most concerned with are beliefs based on science, testimony, and faith.
a. Science
Locke was an active participant in the scientific revolution, and his empiricism was an influential step towards the modern view of science. He was a close associate of Robert Boyle and Isaac Newton, and was also a member of the Royal Society, a group of the leading scientists of the day. He describes himself not as one of the “master-builders” who are making scientific discoveries, but as an “under-labourer” who would be “clearing Ground a little, and removing some of the Rubbish, that lies in the way to knowledge” about the natural world (Epistle to the Reader, p. 9-10). Locke contributes to the scientific revolution, then, by developing an empiricist epistemology consistent with the principles of modern science. His emphasis on the need for empirical observation and on the uncertainty of the general conclusions of science helped shape the modern view of science as a reliable, though fallible, source of information about the natural world.
The prevailing view of science at the time was the Aristotelian view. According to Aristotle, a “science” is a system of knowledge with a few basic principles that are known to be true, from which all other propositions in the science are deduced. (On this view, Euclid’s geometry is the paradigmatic science.) The result of Aristotelian science, if successful, is a set of necessary truths that are known with perfect certainty. While Locke is willing to grant that God and angels might have such knowledge of nature, he emphatically denies that we mere mortals are capable of this kind of certainty about a science of nature (4.3.14, 26; 4.12.9-10).
In Locke’s view, an Aristotelian science of nature would require knowledge of the “real essence” of material objects. The real essence of an object is a set of metaphysically fundamental properties that make an object what it is (3.6.2). Locke draws a helpful analogy to geometry (4.6.11). We can start with the definition of a triangle as a shape with three sides (its real essence) and then we can deduce other properties of a triangle from that definition (such as having interior angles of 180 degrees). Locke thinks of the real essence of material objects in the same kind of way. For example, the real essence of gold is a fundamental set of properties, and this set of properties entails that gold is yellow, heavy, fusible, and so on. Now, if we knew the real essence of gold, then we could deduce its other properties in the same way that we can deduce the properties of a triangle. But, unlike with a triangle, we do not know the real essence of gold, or of any other material object. Locke thinks the real essence of material objects is determined by the structure of imperceptibly small particles. Since, at that time, we could not see the atomic structure of different material substances, Locke thinks we cannot know the real essences of things. For this reason, he denies that we can have an Aristotelian science of nature (4.3.25-26).
The big innovation in Locke’s philosophy of science is the introduction of the concept of a “nominal essence” (3.6.2). It is often assumed that when we classify things into natural kinds, this classification tracks some real division in nature. Locke is not so sure. We group objects together, and call them by the same name, because of superficial similarities. For example, we see several material substances that are yellow, heavy, and fusible, and we decide to call that kind of stuff “gold.” The set of properties we use to classify gold as gold (that is, being yellow, heavy, and fusible) is the “nominal” essence of gold (“nominal” meaning here “in name only”). Locke does not think that the nominal essence is the same as the real essence. The real essence is the (at the time unobservable) chemical structure of the object, whereas the nominal essence is the set of observable qualities we use to classify objects. Locke therefore recognizes that there is something artificial about the natural kinds identified by scientists.
In general, Locke denies that we can have synthetic a priori knowledge of material objects. Because we can have knowledge of only the nominal essence of an object, and not its real essence, we are unable to make a priori inferences about what other properties an object has. For example, if “gold” is defined as a yellow, heavy, fusible material substance, then “gold is yellow” would be analytic. The claim “gold is malleable” would be synthetic, because it gives us new information about the properties of gold. There is no a priori necessary connection between the defining properties of gold and malleability. Therefore, we cannot know with certainty that all gold is malleable. For this reason, Locke says, “we are not capable of scientifical knowledge; nor shall [we] ever be able to discover general, instructive, unquestionable truths concerning” material objects (4.3.26).
Most of Locke’s comments about science emphasize that we cannot have knowledge, but this does not mean that beliefs based on empirical science are unjustified. In Locke’s view, a claim is “more or less probable” depending on “the frequency and constancy of experience” (4.15.6). The more frequently it is observed that “A is B” the more likely it would be that, on any particular occasion, “A is B” (4.16.6-9). For example, since all the gold we have observed has been malleable, it is likely that all gold, even the gold we have not observed, is also malleable. In this way, we can use empirical evidence to make probable inferences about what we have not yet observed.
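Locke offers no numerical measure of probability, but his talk of “frequency and constancy” can be given a simple gloss in modern notation (the notation is ours, not Locke’s): if a quality B has accompanied A in m of the n cases we have observed, then, setting other evidence aside, our confidence that an unexamined A is B should track the observed frequency,

$$\Pr(\text{this A is B}) \approx \frac{m}{n},$$

so that constant experience (m = n, for large n) warrants assurance just short of certainty, while mixed experience warrants correspondingly weaker assent. The formula is only an illustration of proportioning assent to evidence; Locke himself states the idea qualitatively.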
For Locke, then, all knowledge and rational beliefs about material objects must be grounded in empirical observation, either directly or by probable inferences from observation.
b. Testimony
Testimony can be a credible source of evidence. Locke develops an early and influential account of when testimony should be believed and when it should be doubted.
In Locke’s view, we cannot know something on the basis of testimony. Knowledge requires certainty, but there is always the possibility that someone’s testimony is mistaken: perhaps the person is lying, or honestly stating her belief but is mistaken. So, although credible testimony is likely to be true, it is not guaranteed to be true, and hence we cannot be certain that it is so.
Yet credible testimony is often likely to be true. A high school math teacher knows a theorem in geometry is true because she has gone through the steps of the proof. She might then tell her students that the theorem is true. If they believe her on the basis of her testimony, rather than going through the steps of the proof themselves, then they do not know the theorem is true; yet they would have a rationally justified belief because it is likely to be true given the teacher’s testimony (4.15.1).
Whether someone’s testimony should be believed depends on (i) how well it conforms with our knowledge and past experience and (ii) the credibility of the testimony. The credibility of testimony depends on several factors (4.15.4):
the number of people testifying (more witnesses provide more evidence)
the integrity of the people
the “skill” of the witnesses (that is, how well they know what they are talking about)
the intention of the witnesses
the consistency of the testimony
contrary testimonies (if any)
We can be confident in a testimony that conforms with our own past experience and the reported experience of others (4.16.6). As noted above, the more frequently something is observed the more likely it happened on a given occasion. For example, in our past experience fire has always been warm. When we have the testimony that, on a specific occasion, a fire was warm, then we should believe it with the utmost confidence.
We should be less confident in the testimony when it conflicts with our past experience (4.16.9). Locke relates the story of the King of Siam who, knowing only the warm climate of south-east Asia, is told by a traveler that in Europe it gets so cold that water becomes solid (4.15.5). On the one hand, the King has credible testimony that water becomes solid. On the other hand, this conflicts with his past experience and the reported experience of those he knows. Locke implies that the King rationally doubted the testimony because, in this case, the evidence from the King’s experience is greater than the evidence from testimony. Experience does not always outweigh the evidence from testimony. The evidence from testimony depends in part on the number of people: “as the relators are more in number, and of more credit, and have no interest to speak contrary to the truth, so that matter of fact is like to find more or less belief.” So, if there is enough evidence from testimony, then that could in principle outweigh the evidence from experience.
That the evidence from testimony can sometimes outweigh the evidence from experience is particularly relevant to the testimony of miracles. Hume famously argues that because the testimony of miracles conflicts with our ordinary experience, we should never believe the testimony of a miracle. Locke, however, remains open to the possibility that the evidence from testimony could outweigh the evidence from experience. Indeed, he argues that we should believe in revelation, particularly in the Bible, because of the testimony of miracles (4.16.13-14; 4.19.15).
Although Locke thinks that testimony can provide good evidence, it does not always provide good evidence. We should not believe “the opinion of others” just because they say something is true. One difference between the testimony Locke accepts as credible and the testimony he rejects is that credible testimony begins with knowledge, whereas the testimony of “the opinion of others” is mere speculation about things “beyond the discovery of the senses,” which are “not capable of any such testimony” (4.16.5). In taking this attitude, Locke follows the Enlightenment sentiment of rejecting authority. Speculative theories should not be believed based on testimony. Instead, we should base our opinions on observation and our own reasoning.
Testimony provides credible evidence when the testimony makes it likely that a claim is true. When the person is in a position to know, either from reasoning or from experience, it can be reasonable for us to believe the person’s testimony.
c. Faith
Locke insists that all of our beliefs should be based on evidence. While this seems obviously true for most beliefs, some people want to make an exception for religion. Faith is sometimes thought to be belief that goes beyond the evidence. However, Locke thinks that, if faith is to be rational, even faith must be supported by evidence.
Some religious claims can be proven by the use of reason or natural theology. For example, Locke makes a cosmological argument for the existence of God (4.10.1-6). He thinks that, given this proof from natural theology, we can know that God exists. This kind of belief in God is knowledge and not faith, since faith implies some uncertainty.
Many religious claims cannot be proven by the use of natural reason; we must instead rely on revelation. Locke defines faith as believing something because God revealed it (4.18.2). We do not perceive the truth, as we do in knowledge, but instead presume that it is true because God has told us so in a revelation. Just as human testimony can provide evidence, revelation, as God’s testimony, can provide evidence. Yet revelation is better evidence than human testimony because human testimony is fallible whereas divine revelation is infallible (4.18.10).
We should believe whatever God has revealed, but we first must have good reason to believe that God revealed it. This makes faith dependent on reason. For reason must judge whether something is a genuine revelation from God (4.18.10). Further, we must be sure that we interpret the revelation correctly (4.18.5). Since whatever God reveals is guaranteed to be true, if we have good evidence that God revealed that p, then that provides us with evidence that p is true. In this way, Locke can insist that all religious beliefs require evidence and yet believe in the truths of revealed religion.
Locke only admits publicly available evidence as evidence for revelation. He criticizes “enthusiasm,” which is, as he describes it, believing that something is a revelation without evidence that it is a revelation (4.19.4). The enthusiast believes God has revealed that p only because it seems to the person to have come from God. Some religious epistemologists take religious experience as evidence for religious belief, but Locke is skeptical of religious experience. He demands that this subjective feeling that God revealed p be backed up with concrete evidence that it really did come from God (4.19.11). Instead of relying on private religious experience, Locke appeals to miracles as publicly available evidence supporting revelation (4.16.13; 4.19.14-15).
Locke also limits the kind of things that can be believed on the basis of revelation. Some propositions are according to reason, others are contrary to reason, and still others are above reason. Only the things above reason can appropriately be believed on the basis of revelation (4.18.5, 7).
Claims that are “according to reason” should not be believed on the basis of revelation because we already have all the evidence we need. By “according to reason” Locke means those propositions that we have a priori knowledge of, either from intuition or demonstration. If we already have a priori knowledge that p, then we need no further evidence. In that case, revelation would be useless because we already have certainty without it.
Claims that are “contrary to reason” should not be believed on the basis of revelation because we know with certainty that they are false. By “contrary to reason” Locke means those propositions that we have a priori knowledge are false. If it is self-evident that p is false, then we cannot rationally believe it under the pretense that it was revealed by God. For God only reveals true things, and if we know with certainty p is false, then we know for sure that God did not reveal it.
Faith, then, concerns those things that are “above reason.” By “above reason” Locke means those propositions that cannot be proven one way or the other by a priori reasoning. For example, the Bible teaches that some of the angels rebelled and were kicked out of heaven, and it predicts that the dead will rise again at the last day. We cannot know with certainty these claims are true, nor that they are false. We must instead rely on revelation: “faith gave the determination, where reason came short; and revelation discovered which side the truth lay” (4.18.9).
There is some disagreement about how much weight Locke gives revelation when there is conflicting evidence. For example, suppose we have good reason to believe that God revealed that p to Jesus, but that given our other evidence p appears to be false. Locke says “evident revelation ought to determine our assent even against probability” (4.18.9). On one interpretation, if there is good reason to believe God revealed that p we should believe p no matter how likely it is that p is false given our other evidence. On another interpretation, Locke carefully sticks with his evidentialism: if the evidence that God revealed p outweighs the evidence that p is false, then we should believe that p; but if the evidence that p is false outweighs the evidence that God revealed it, then we should not believe p (nor that God revealed it).
According to Locke, God created us as rational beings so that we would form rational beliefs based on the available evidence, and he thinks religious beliefs are no exception.
4. Conclusion
Locke makes a number of important contributions to the history of epistemology.
He makes one of the most sustained criticisms of innate ideas and innate knowledge in the history of philosophy, and it convinced generations of later empiricists such as Berkeley and Hume. He argues that there is no universal knowledge we are all born with and that, instead, all our ideas and knowledge depend on experience.
He then develops an explanation for how we acquire knowledge that is consistent with his empiricism. All knowledge is either known a priori or based on empirical evidence. Later empiricists, such as Hume and the logical positivists, follow Locke in thinking some knowledge is known a priori whereas other knowledge is based on empirical evidence. However, Locke allows for synthetic a priori knowledge whereas Hume and the logical positivists hold that all a priori knowledge is analytic.
Locke’s emphasis on the need for empirical evidence also supported the developments of the scientific revolution. Locke argues that a priori knowledge of nature is not possible, a thesis for which Hume would later become more famous. Rather than a priori knowledge of nature, Locke emphasizes the need for empirical evidence. Although inferences from empirical observations cannot give us certainty, Locke thought that they can give us evidence about what is most likely to be true. So, science should be both based on empirical evidence and acknowledge its uncertainty. In this way, Locke helped shift attitudes about science away from the Aristotelian view towards the modern conception of empirical science as always open to revision upon further observation.
Locke gives one of the earliest careful treatments of how testimony can serve as evidence. He argues that testimony should be evaluated by its internal consistency and by its consistency with other things we know, including past observations about similar cases. Hume later adopts this view of testimony and famously argues that, since the testimony of miracles conflicts with past experience, we should not believe the testimony of miracles.
Finally, Locke insists that religious belief should be based on evidence. Locke himself thought that there was sufficient evidence from the testimony of miracles and revelation to support his belief in Christianity. However, many have criticized Locke’s evidentialism for undermining the rationality of religious belief. Critics such as Hume agreed that religious belief needs to be supported by evidence, but they argue there is no good evidence, and hence religious belief is not rational. Others, such as William Paley, attempted to provide the evidence needed to support religious belief. Needless to say, whether there is good evidence for religion or not was just as controversial then as it is now, yet many agree with Locke that religious belief requires evidence.
The guiding principle of Locke’s epistemology is the need for evidence. We can acquire evidence from a priori reasoning, from empirical observation, or most often from inferences from empirical observation. Limiting our beliefs to those supported by evidence, Locke thinks, is the most reliable way to get at the truth.
5. References and Further Reading
a. Locke’s Works
Locke, John. 1690/1975. An Essay Concerning Human Understanding (ed. Peter Nidditch). Oxford: Oxford University Press.
References to the Essay are cited by book, chapter, and section. For example, 2.1.2 is book 2, chapter 1, section 2.
Locke, John. The Works of John Locke in Ten Volumes. London: Thomas Tegg.
b. Recommended Reading
Anstey, Peter. 2011. John Locke & Natural Philosophy. Oxford: Oxford University Press.
This book is on Locke’s view of science and its relationship to Robert Boyle and Isaac Newton.
Ayers, Michael. 1993. Locke: Epistemology and Ontology. New York: Routledge.
Volume 1 of this two-volume work is dedicated to Locke’s epistemology. It includes chapters on Locke’s theory of ideas, theory of perception, probable judgment, and knowledge.
Gibson, James. 1917. Locke’s Theory of Knowledge and its Historical Relations. Cambridge: Cambridge University Press.
This book gives a thorough overview of Locke’s epistemology with chapters showing how Locke’s view relates to other early modern philosophers such as Descartes, Leibniz, and Kant. The comparison of Locke and Kant on a priori knowledge is particularly helpful.
Gordon-Roth, Jessica and Weinberg, Shelley (eds.). 2021. The Lockean Mind. New York: Routledge.
A survey of all aspects of Locke’s philosophy by different scholars. It has several articles on Locke’s epistemology, including on Locke’s criticism of innate knowledge, account of knowledge, account of probable judgment, and religious belief, as well as his theory of ideas and theory of perception.
Jacovides, Michael. 2017. Locke’s Image of the World. Oxford: Oxford University Press.
This book is on Locke’s view of science and how Locke was influenced by and exerted an influence on scientific developments happening at the time.
Nagel, Jennifer. 2014. Knowledge: A Very Short Introduction. Oxford: Oxford University Press.
This introduction to epistemology includes a chapter on the early modern rationalism-empiricism debate and a chapter on Locke’s view of testimony.
Nagel, Jennifer. 2016. “Sensitive Knowledge: Locke on Skepticism and Sensation.” In A Companion to Locke.
A discussion of Locke’s account of sensitive knowledge and response to skepticism.
Newman, Lex. 2007. “Locke on Knowledge.” In The Cambridge Companion to Locke’s Essay. Cambridge: Cambridge University Press.
This is an accessible and excellent overview of Locke’s theory of knowledge.
Newman, Lex (ed.). 2007. The Cambridge Companion to Locke’s Essay. Cambridge: Cambridge University Press.
A survey of Locke’s Essay by different scholars. It has several articles on Locke’s epistemology, including his criticism of innate knowledge, theory of ideas, account of knowledge, probable judgment, and religious belief.
Chappell, Vere (ed.). 1994. The Cambridge Companion to Locke. Cambridge: Cambridge University Press.
A survey of all aspects of Locke’s philosophy by different scholars. It includes articles on Locke’s theory of ideas, account of knowledge, and religious belief.
Rickless, Samuel. 2007. “Locke’s Polemic Against Nativism.” In The Cambridge Companion to Locke’s Essay. Cambridge: Cambridge University Press.
This is an accessible and excellent overview of Locke’s criticism of innate knowledge.
Rockwood, Nathan. 2018. “Locke on Empirical Knowledge.” History of Philosophy Quarterly, v. 35, n. 4.
The article explains Locke’s view on how empirical observation can justify knowledge that material objects exist and have specific properties.
Rockwood, Nathan. 2020. “Locke on Reason, Revelation, and Miracles.” The Lockean Mind (ed. Jessica Gordon-Roth and Shelley Weinberg). New York: Routledge.
This article is a good introduction to Locke’s religious epistemology.
Wolterstorff, Nicholas. 1996. John Locke and the Ethics of Belief. Cambridge: Cambridge University Press.
This book gives an overview of Locke’s epistemology generally and specifically his account of religious belief.
Charles L. Dodgson (also known as Lewis Carroll), 1832-1898, was a British mathematician, logician, and the author of the ‘Alice’ books, Alice’s Adventures in Wonderland and Through the Looking-Glass, and What Alice Found There. His fame derives principally from his literary works, but in the twentieth century some of his mathematical and logical ideas found important applications. His approach to them led him to invent various methods that lend themselves to mechanical reasoning. He was not a traditional mathematician. Rather, he applied mathematical and logical solutions to problems that interested him. As a natural logician at a time when logic was not considered to be a part of mathematics, he successfully worked in both fields. Everything he published in mathematics reflected a logical way of thinking, particularly his works on geometry. Dodgson held an abiding interest in Euclid’s geometry. Of the ten books on mathematics that he wrote, including his two logic books, five dealt with geometry. From his study of geometry, he developed a strong affinity for determining the validity of arguments not only in mathematics but in daily life too. Dodgson felt strongly about logic as a basis for cogent thought in all areas of life, yet he did not realize he had developed concepts that would be explored or expanded upon in the twentieth century.
Dodgson’s approach to solving logic problems led him to invent various methods, particularly the method of diagrams and the method of trees. As a method for a large number of sets, Carroll diagrams are easier than Venn diagrams to draw because they are self-similar (see the sketch at the end of this overview). His uncommon exposition of elementary logic has amused modern authors, who continue to take quotations from his logic books.
The mathematician and logician Hugh MacColl’s views on logic were influenced by reading Dodgson’s Symbolic Logic, Part I. Their exchanges show that both had a deep interest in the precise use of words, and both saw no harm in attributing arbitrary meanings to words, as long as that meaning is precise and the attribution is agreed upon.
Dodgson’s reputation as the author of the ‘Alice’ books cast him primarily as an author of children’s books and prevented his logic books from being treated seriously. The barrier created by the fame Carroll deservedly earned from his ‘Alice’ books, combined with a writing style more literary than mathematical, prevented the community of British logicians from properly recognizing him as a significant logician during his lifetime.
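As a rough illustration of the self-similarity just mentioned (the figure is a schematic sketch, not one of Dodgson’s own), a two-attribute (biliteral) Carroll diagram is simply a square cut into four cells by an attribute x and an attribute y, with $\bar{x}$ and $\bar{y}$ marking their negations:

$$\begin{array}{c|c|c} & y & \bar{y} \\ \hline x & xy & x\bar{y} \\ \hline \bar{x} & \bar{x}y & \bar{x}\bar{y} \end{array}$$

Each further attribute is handled by subdividing every existing cell in the same way (in Dodgson’s triliteral diagram, by an inner square), so the construction simply repeats itself at a smaller scale. Venn diagrams, by contrast, require ever more contorted closed curves as the number of sets grows.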
Charles Lutwidge Dodgson (1832-1898), better known by the pen name Lewis Carroll, which he adopted in 1856, entered Christ Church, Oxford University in England in 1852. After passing Responsions, the first of the three required examinations, and achieving a First Class in Mathematics and a Second Class in Classics in Moderations, the second required examination, he earned a bachelor’s degree in 1854, placing first on the list of First-Class Honours in Mathematics and earning Third-Class Honours in the required Classics. He received the Master of Arts degree in 1857. He remained at Christ Church for the rest of his life.
He began teaching individual students privately in differential calculus, conics, Euclidean geometry, algebra, and trigonometry. In 1856 the Dean of Christ Church, the Reverend Henry Liddell, appointed him the Mathematical Lecturer, a post he held for 25 years, before resigning it in 1881.
In 1856 he took up photography, eventually taking about 3,000 photos, many of eminent people in government, science, the arts, and theatre. Prime Minister Salisbury, Michael Faraday, and John Ruskin were some of his subjects. He became one of the most eminent photographers of his time. He also was a prolific letter writer, keeping a register of the letters he received and sent, 98,721 of them, in the last thirty-five years of his life.
Taking holy orders was a requirement for all faculty. He chose to be a deacon rather than a priest so that he could devote his time to teaching and continue to go to the theatre in London, which was his favorite pastime. He was ordained a deacon in 1861. Dodgson held deeply religious views throughout his life. His father, Archdeacon Charles Dodgson, had been considered a strong candidate for the post of Archbishop of Canterbury before he married.
His first publications (pamphlets and books from 1860 to 1864) were designed to help students: A Syllabus of Plane Algebraic Geometry, Systematically Arranged with Formal Definitions, Postulates and Axioms; Notes on the First Two Books of Euclid; Notes on the First Part of Algebra; The Formulae of Plane Trigonometry; The Enunciations of Euclid I, II; General List of [Mathematical] Subjects, and Cycle for Working Examples; A Guide to the Mathematical Student.
In the mid-1860s Dodgson became active in college life, writing humorous mathematical ‘screeds’ to argue on various issues at Christ Church, voting on the election of Students (Fellows), and on physical alterations to the College’s buildings and grounds. These activities piqued his interest in ranking and voting methods. He became a member of the Governing Board in 1867 and remained on it for his entire life. In 1868 he acquired an apartment in the NW corner of Tom Quad, part of Christ Church, where he constructed a studio for his photography on its roof. His apartment was the choicest, most expensive one in the College.
Becoming active in political affairs outside the College in the 1880s, he sent many letters to the editors of The St. James’s Gazette, the Pall Mall Gazette, and other newspapers presenting his position on various issues of national importance. Through his photography, he became friendly with Lord Salisbury, who became Prime Minister in 1885. Their social relationship, begun in 1870, lasted throughout Dodgson’s life and spurred him to consider the problem of fairness both in representation and in apportionment, culminating in his pamphlet of 1884, The Principles of Parliamentary Representation.
His publications during the remainder of the 1860s, pamphlets and two books, reflect these interests as well as his mathematical ones, and they give evidence of his considerable literary abilities: The Dynamics of a Particle with an Excursus on the New Method of Evaluation as Applied to Π; Alice’s Adventures in Wonderland; An Elementary Treatise on Determinants with Their Applications to Simultaneous Linear Equations and Algebraical Geometry; The Fifth Book of Euclid Treated Algebraically, so Far as It Relates to Commensurable Magnitudes, with Notes; Algebraical Formulae for the Use of Candidates for Responsions; Phantasmagoria and Other Poems.
His publications in the 1870s continued in the same vein: Algebraical Formulae and Rules for the Use of Candidates for Responsions; Arithmetical Formulae and Rules for the Use of Candidates for Responsions; Through the Looking-Glass, and What Alice Found There; The Enunciations of Euclid, Books I-VI; Examples in Arithmetic; A Discussion of the Various Methods of Procedure in Conducting Elections; Suggestions as to the Best Method of Taking Votes; Euclid Book V, Proved Algebraically So Far as It Relates to Commensurable Magnitudes, with Notes; The Hunting of the Snark; A Method of Taking Votes on More Than Two Issues; and Euclid and His Modern Rivals.
After resigning the Mathematical Lectureship in 1881, Dodgson had more time for writing. In the first half of the 1880s he published Lawn Tennis Tournaments, The Principles of Parliamentary Representation, A Tangled Tale, and Alice’s Adventures Under Ground (a facsimile edition).
But the second half of the 1880s saw a tectonic shift with his first book on logic, The Game of Logic, as well as his cipher Memoria Technica and two more books, Curiosa Mathematica, Part I: A New Theory of Parallels and Sylvie and Bruno. In 1887 he published the first of three articles in Nature, “To Find the Day of the Week for Any Given Date”.
In the last decade of his life more books appeared. The Nursery Alice was published in 1890; Curiosa Mathematica, Part II: Pillow Problems and Sylvie and Bruno Concluded appeared in 1893. His only other publications in logic came out between 1894 and 1896: the two articles in Mind, “A Logical Paradox” and “What the Tortoise Said to Achilles”, and a book, Symbolic Logic, Part I: Elementary. From 1892 to 1897 he worked on a chapter of a projected book on games and puzzles that was never published; it included his “Rule for Finding Easter-Day for Any Date till A.D. 2499”. His final publications, “Brief Method of Dividing a Given Number by 9 or 11” (1897) and “Abridged Long Division” (1898), both appeared in the journal Nature.
2. The Logic Setting in His Time
The treatment of logic in England began to change fundamentally when George Boole published a short book in 1847 called The Mathematical Analysis of Logic, in which he developed the notion that logical relations could be expressed by algebraic formulas. Using his laws of calculation, Boole was able to represent algebraically all of the methods of reasoning in traditional classical logic, and in a book that he published in 1854, An Investigation of the Laws of Thought, he set himself the goal of creating a completely general method in logic.
Paralleling Boole’s work was that of Augustus De Morgan, whose book Formal Logic appeared at about the same time as Boole’s in 1847. De Morgan became interested in developing the logic of relations to complement Boole’s logic of classes; his purpose was to exhibit the most general form of a syllogism. His belief that the laws of algebra can be stated formally, without giving a particular interpretation such as the number system, influenced Boole.
Although Boole and his followers understood that they were not just algebraizing logic, that is, merely rewriting syllogisms in a new notational system, but creating a new logical calculus, they correctly claimed that not all valid arguments can be reduced to the traditional forms. Venn understood this: he published an article in Mind in 1876 that included the following problem as an illustration of the inadequacies of Aristotelian forms of reasoning and the superiority of Boolean methods. Venn had given the problem, whose conclusion is that no shareholders are bondholders, as a test question to Cambridge University undergraduates, and he remarked that of the 150 or so students, only five or six were able to solve it:
A certain Company had a Board of Directors. Every Director held either Bonds or Shares; but no Director held both. Every Bondholder was on the Board. Deduce all that can logically be deduced, in as few propositions as possible.
For Dodgson and his contemporaries, the central problem of the logic of classes, known as the elimination problem, was to determine the maximum amount of information obtainable from a given set of propositions. In his 1854 book, Boole made the solution to this problem considerably more complex when he provided the mechanism of a purely symbolic treatment which allowed propositions to have any number of terms, thereby introducing the possibility of an overwhelming number of computations.
Logical arguments using rules of inference are a major component of both geometry and logic. To Dodgson, logic and geometry shared the qualities of truth and certainty, qualities that held him in thrall. From the mid-1880s on, he shifted his focus from the truth given by geometrical theorems (true statements) to the validity of logical arguments, the rules that guarantee that only true conclusions can be inferred from true premises, and he pushed beyond the standard forms of the prevailing logic of his time, which was Aristotelian.
Dodgson began writing explicitly about logic in the 1870s, when he began his magnum opus, Symbolic Logic, the first part of which appeared in 1896. His formulation of formal logic thus came late in his life, following his publications on Euclid’s geometry in the 1860s and 1870s. In mathematics generally, and in geometry particularly, one begins with a set of axioms and certain inference rules by which, if one proposition is true, another proposition can be inferred to be true.
Dodgson worked alone, but he was not at all isolated from the community of logicians of his time. He corresponded with a number of British logicians, including James Welton, author of the two-volume A Manual of Logic; John Cook Wilson, Professor of Logic at Oxford from 1889 until his death in 1915; Thomas Fowler, Wykeham Professor of Logic at Oxford (1873 to 1889) and author of The Elements of Deductive Logic; William Ernest Johnson, a collaborator of John Neville Keynes at Cambridge and author of “The Logical Calculus,” a series of three articles that appeared in Mind in 1892; Herbert William Blunt; Henry Sidgwick, Professor of Moral Philosophy at Cambridge; John Venn, author of the influential book Symbolic Logic; F. H. Bradley, author of The Principles of Logic; and Stewart. He also cited the book Studies in Logic, edited by Peirce, which includes pieces by Peirce’s students Marquand, Ladd-Franklin, Oscar Howard Mitchell, and B. I. Gilman. We know from Venn’s review of Studies in Logic, which appeared in the October 1883 edition of Mind, soon after the book was published, that Peirce was well known to the British symbolists and that they were aware of his publications.
Marquand’s contributions, a short article, “A Machine for Producing Syllogistic Variations”, and his “Note on an Eight-Term Logic Machine”, contain ideas that Dodgson captured in his Register of Attributes, a tool he constructed to organize the premises when he applied his tree method to soriteses. (A sorites is an argument having many premises and a single conclusion; it can be resolved as a list of syllogisms, the conclusion of each becoming a premise of the next, as sketched below.) Dodgson had used ideas associated with a logic machine even earlier in The Game of Logic.
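The chaining just described is mechanical enough to express in a few lines of code. The following is a minimal Python sketch with invented premise data, assuming every premise is a universal affirmative so that each step is a Barbara syllogism:

```python
# A minimal sketch of resolving a sorites as a chain of syllogisms: the
# conclusion of each Barbara-style step becomes a premise of the next.
# The premise list is invented for illustration; ('a', 'b') means "all a are b".
premises = [('a', 'b'), ('b', 'c'), ('c', 'd')]

subject, predicate = premises[0]
for next_subject, next_predicate in premises[1:]:
    assert predicate == next_subject   # the middle term must match to be eliminated
    predicate = next_predicate         # partial conclusion: "all subject are ..."

print(f"all {subject} are {predicate}")   # -> all a are d
```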
Dodgson’s library, sold at his death, included works on logic by Boole, Venn, Allan Marquand, Mitchell, Ladd-Franklin, Benjamin Ives Gilman, Peirce, John Neville Keynes, Rudolph Hermann Lotze (in the English translation by Bernard Bosanquet), James William Gilbart, De Morgan, Bosanquet, Francis H. Bradley, John Stuart Mill, William Stirling Hamilton, William Whewell, and Jevons, among others. Some of these works influenced his own writing, and some provided material he needed in his dealings with Oxford adversaries.
3. Logic and Geometry
On an implicit level, Dodgson wrote about logic throughout his entire professional career: everything he published in mathematics reflected a logical way of thinking, particularly his works on geometry. His heightened concern with logic followed his publications on Euclid’s geometry in the 1860s and 1870s.
On p. xi of the preface to the third edition (1890) of his book on geometry, Curiosa Mathematica, Part I: A New Theory of Parallels, he pointed out that the validity of a syllogism is independent of the truth of its premises, giving this example:
‘I have sent for you, my dear Ducks,’ said the worthy Mrs. Bond, ‘to enquire with what sauce you would like to be eaten?’ ‘But we don’t want to be killed!’ cried the Ducks. ‘You are wandering from the point’ was Mrs. Bond’s perfectly logical reply.
Dodgson held an abiding interest in Euclid’s geometry: of the ten books on mathematics that he wrote, including his two logic books, five dealt with geometry. From his study of geometry he developed a strong affinity for determining the validity of arguments, not only in mathematics but in daily life too, and his formal logic can arguably be seen as the culmination of his publications on Euclid’s geometry in the 1860s and 1870s. Exactly one month before he died, in an unpublished letter criticizing a manuscript that Dugald Stewart had given him for his opinion, Dodgson commented:
Logic, under that view, would become to me, a science of such uncertainty that I shd [should] take no further interest in it. It is its absolute certainty which at present fascinates me. (Dodgson, Berol Collection, New York University, 14 December 1897)
We also know that Dodgson was proficient in proving theorems by contradiction in his many publications on geometry. Just as logic informed his geometric work, so geometry informed his logic writings: in his logic books he used geometric notation and terms, for example, the reverse paragraph symbol for the main connective of a syllogism, the implication relation, and the symbol ∴ for ‘therefore.’
a. Syllogisms, Soriteses, and Puzzle Problems
In classical Aristotelian logic there are four forms of propositions:
A: All x is y; E: No x is y; I: Some x is y; O: Some x is not y.
These Boole wrote as:
A: x(1 – y) = 0
E: xy = 0
I: xy ≠ 0
O: x(1 – y) ≠ 0.
The symbols x, y, z denote classes, and Boole used the ordinary algebraic laws governing calculations with numbers to interpret his system of classes and the permissible operations on them; he assumed that each of these laws, such as xy = yx, expresses a true proposition. Boole also developed rules to deal with elimination problems: if an equation f(x) = 0 expresses the information available about a class x, his rules make it possible to find the relations that hold between x and the other classes (y, z, and so forth) to which x is related. For example, syllogistic reasoning involves reducing two class equations (the premises) to one equation (the conclusion) by eliminating the middle term and then solving the equation of the conclusion for the subject term. The mechanical nature of these steps is apparent.
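That mechanical character can be made concrete in a small brute-force sketch. The Python code below is not Boole’s own calculus; it merely checks a syllogism by searching every way a three-element universe can be distributed among the classes x, m, and y, using Boole’s equational encoding of the A form as a set condition (all names are illustrative):

```python
from itertools import product

# A brute-force sketch (not Boole's own calculus): a syllogism is valid iff
# no distribution of a small universe among x, m, y makes both premises true
# and the conclusion false. Boole's A form "all a are b" is a(1 - b) = 0,
# rendered here as the set condition a - b == empty.
def all_are(a, b): return not (a - b)

def syllogism_is_valid(premise1, premise2, conclusion, n=3):
    patterns = list(product([False, True], repeat=3))   # (in x, in m, in y)
    for world in product(patterns, repeat=n):           # each of n individuals
        x = {i for i, p in enumerate(world) if p[0]}
        m = {i for i, p in enumerate(world) if p[1]}
        y = {i for i, p in enumerate(world) if p[2]}
        if premise1(x, m, y) and premise2(x, m, y) and not conclusion(x, m, y):
            return False        # counterexample: premises true, conclusion false
    return True

# Barbara (valid): All m are y; All x are m; therefore All x are y.
print(syllogism_is_valid(lambda x, m, y: all_are(m, y),
                         lambda x, m, y: all_are(x, m),
                         lambda x, m, y: all_are(x, y)))    # True
# Undistributed middle (a fallacy): All x are m; All y are m; so All x are y?
print(syllogism_is_valid(lambda x, m, y: all_are(x, m),
                         lambda x, m, y: all_are(y, m),
                         lambda x, m, y: all_are(x, y)))    # False
```

The second call illustrates the fallacy detection that interested Dodgson: the search finds a counterexample and rejects the argument.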
Dodgson, like most of his peers, used classical forms, such as the syllogism and sorites, to solve logical problems. These forms of traditional Aristotelian logic were the basis of the system of logical reasoning that prevailed in England up to the first quarter of the twentieth century. But Dodgson went much further, creating logical puzzle problems, some containing arguments intended to confuse the reader, others paradoxical in that they seemed to prove what was thought to be false. With these purposes in mind, he wanted to show that the classical syllogistic form permits much more general reasoning than was commonly believed.
Medieval Aristotelian logicians had formulated classifications of fifteen, nineteen, or twenty-four valid syllogisms, depending on a number of assumptions. In part II of Symbolic Logic, Bartley includes three more valid syllogistic formulas that Dodgson constructed, primarily to handle syllogisms containing “not-all” statements.
Syllogistic reasoning, from the time of Aristotle until George Boole’s work in logic in the mid-nineteenth century, was the essential method of all logical reasoning. In a syllogism, three terms (classes) appear across its three statements: the subject, the predicate (an expression that attributes properties), and the middle term, which occurs once in each premise. Syllogisms are classified by the relative position of the repeated middle term (which determines the figure, of which there are four) and by the way a syllogism can be constructed within a figure (which determines its mood).
Dodgson created the first part of his visual proof system, a diagrammatic system, beginning in 1887 in a small book titled The Game of Logic. His diagrammatic system could detect fallacies, a subject that greatly interested him; he defined a fallacy as an “argument which deceives us, by seeming to prove what it does not really prove….” (Bartley 1977, p. 129)
The “game” employs only two- and three-set diagrams, which can represent both universal and existential statements. This textbook, intended for young people, offers many examples together with their solutions.
With a view to extending his proof method, Dodgson went on to expand his set of diagrams, eventually creating diagrams for eight sets (classes) and describing the construction of nine-set and ten-set diagrams.
He believed that mental activities and mental recreations, like games and particularly puzzles, were enjoyable and conferred a sense of power on those who make the effort to solve them. In an advertisement for the fourth edition of Symbolic Logic, Part I: Elementary, addressed to teachers, he wrote:
I claim, for Symbolic Logic, a very high place among recreations that have the nature of games or puzzles…. Symbolic Logic has one unique feature, as compared with games and puzzles, which entitles it, I hold, to rank above them all…. He may apply his skill to any and every subject of human thought; in every one of them it will help him to get clear ideas, to make orderly arrangement of his knowledge, and more important than all, to detect and unravel the fallacies he will meet with in every subject he may interest himself in. (Bartley 1977, p. 46)
Dodgson felt strongly about logic as a basis for cogent thought in all areas of life, yet he did not realize that he had developed concepts that would be explored and expanded in the twentieth century. Although he recognized his innovations as significant, the fact that he presented them primarily in a didactic rather than a research context affected how they were perceived and evaluated, both in his time and even after Bartley’s publication.
Carroll’s idea of syllogistic construction differed from the classical and medieval ones as well as from those of his contemporaries. Among the reasons he gave for consolidating the nineteen different forms appearing in the textbooks of his day were these: the syllogistic rules were too specialized, many conclusions were incomplete, and many legitimate syllogistic forms were ignored. And although Boole believed that the solutions found by his methods were complete, it has been shown that this was not always the case.
Carroll made several changes to syllogistic construction compared with what was accepted in his time, the result being the fifteen valid syllogisms that he recognized, although he did not actually list them. For Carroll, a syllogism is an argument having two premises and a single conclusion, with each proposition being one of four kinds, A: ‘all…are…’; E: ‘no…is…’; I: ‘some…are…’; O: ‘some…are not…’. Reckoned under other sets of assumptions, the number of valid syllogisms ranges between eighteen and twenty-four.
In his earlier book, The Game of Logic, Carroll created a diagrammatic system to solve syllogisms. Ten years later, in Symbolic Logic, Part I, he extended the method of diagrams to handle the construction of up to ten classes (sets), depicting their relationships and the corresponding propositions. This visual logic method, which employs triliteral and biliteral diagrams, is a proof system for categorical syllogisms whose propositions are of the A, E, and I types; he subsumed the O type under I, treating ‘some x are not-y’ as the I proposition ‘some x are y′.’ But he did not use the method as a proof system beyond syllogisms. For the more complex soriteses, he settled on the ‘methods of barred premises and barred groups’ and on his final visual method, the method of trees, which remained unpublished until 1977, when it appeared in W. W. Bartley III’s book, Lewis Carroll’s Symbolic Logic. In Bartley’s construction of part II of Symbolic Logic from Dodgson’s extant papers, letters, and manuscripts, the main topics of the eight books are fallacies, logical charts, the two methods of barred premises and of trees, and puzzle problems. In part I of Symbolic Logic Dodgson used just three formulas, which he called Figures or Forms, to designate the classical syllogisms. In the fourth edition of Symbolic Logic, Part I: Elementary, he pointed this out in an Appendix, Addressed to Teachers, where he wrote:
As to Syllogisms, I find their [in textbooks] nineteen forms, with about a score of others which they have ignored, can all be arranged under three forms, each with a very simple Rule of its own. (Bartley 1977, p. 250)
In Symbolic Logic, Part I, which appeared in four editions in 1896, Dodgson represented syllogisms as in this example:
No x are mʹ;
All m are y.
∴ No x are yʹ
in the form of conditional statements using a subscript notation written symbolically as xmʹ0 † m1yʹ0 (reverse ¶) xyʹ0 (Bartley 1977, p. 122), with the reverse paragraph sign signifying the connecting implication relation, which he defined as follows: the propositions on the left side “would, if true, prove” the proposition on the right side. (Bartley 1977, p. 119) Dodgson’s algebraic notation is a modification of Boole’s, which he thought unwieldy.
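Read as statements about classes, the subscript notation is easy to test mechanically. Here is a hedged Python sketch checking the example above over a toy three-element universe: xm′0 says the class xm′ is empty, and m1y′0 says m is non-empty while my′ is empty (the existential import of ‘all m are y’); all names are illustrative:

```python
from itertools import combinations

# A sketch of the subscript notation read as class statements, checked over
# a toy universe U: a final 0 asserts emptiness, a final 1 non-emptiness.
U = {0, 1, 2}
subsets = [set(c) for r in range(len(U) + 1) for c in combinations(U, r)]

holds = True
for x in subsets:
    for m in subsets:
        for y in subsets:
            xm_prime_0 = not (x & (U - m))                # xm'0: no x are m'
            m1_y_prime_0 = bool(m) and not (m & (U - y))  # m1y'0: all m are y, m exists
            if xm_prime_0 and m1_y_prime_0 and (x & (U - y)):
                holds = False                             # xy' turned out non-empty
print(holds)   # True: the premises "would, if true, prove" xy'0
```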
Why did Dodgson choose to write his logic books under his pseudonym? Bartley suggests a combination of motives. Dodgson wanted the material to appeal to a large general audience, particularly to young people, a task made easier by the wide acclaim accorded him as the writer Lewis Carroll. Then, too, there was the financial motive: books authored by Lewis Carroll could generate greater revenue than books by the mathematician Charles Dodgson, and by 1896 Dodgson was very much concerned about his mortality and the responsibility he bore for the future care of his family, especially his unmarried sisters. But there were other reasons why he wanted the exposure his pseudonym would offer. A deeply religious man, Dodgson considered his mathematical abilities to be a gift that he should use in the service of God. In a letter to his mathematically talented sister Louisa, dated 28 September 1896, he wrote:
[W]hereas there is no living man who could (or at any rate would take the trouble to) & finish, & publish, the 2nd Part of the Logic. Also I have the Logic book in my head….So I have decided to get Part II finished first….The book will be a great novelty, & will help, I fully believe, to make the study of Logic far easier than it now is: & it will, I also believe, be a help to religious thoughts, by giving clearness of conception & of expression, which may enable many people to face, & conquer, many religious difficulties for themselves. So I do really regard it as work for God. (Bartley 1977, pp. 366-371)
b. Venn and Carroll Diagrams
In their diagrammatic methods, both Venn and Carroll used simple symmetric figures, and both valued visual clarity and ease of drawing as the most important attributes. Like Boole and Jevons, both worked in the tradition of the calculus ratiocinator, that is, mechanical deduction, and each used a system of symbolic forms isomorphic to his diagrammatic forms.
Both Venn diagrams and Carroll diagrams are maximal, in the sense that no additional logical information, such as inclusive disjunctions, is representable in them. But Carroll diagrams are easier to draw for a large number of sets because of their self-similarity and algorithmic construction; this regularity makes it simpler to locate, and thereby erase, the cells corresponding to classes destroyed by the premises of an argument. Although both Venn and Carroll diagrams can represent existential statements, Carroll diagrams can easily handle more complex problems than Venn’s system can without compromising visual clarity. Carroll only hinted at the superiority of his method when he compared his own solution to a syllogism with one that Venn had supplied. (Carroll 1958, pp. 182-183)
In both Dodgson’s and Venn’s systems, existential propositions can be represented. The use of a small plus sign ‘+’ in a region to indicate that it is not empty did not appear until 1894, and Dodgson reported it in his symbolic logic book; he may, however, have been the first to use it, for a manuscript worksheet on logic problems, probably from 1885, contains a variant of a triliteral diagram with a ‘+’ representing a nonempty region. In his published work, Dodgson preferred the symbol ‘1’ for a nonempty region and the symbol ‘0’ for an empty region.
Both Venn and Carroll diagrams can represent exclusive disjunctions; neither can represent inclusive disjunctive statements like x + y when x and y have something in common. Exclusive disjunctions are important in syllogistic logic because existential statements like ‘some x are y’ can be written as the disjunction xyz or xyz′, and the statement ‘some y are z′’ can be written as the disjunction xyz′ or x′yz′. In fact, it is not possible to represent general disjunctive information in a diagram without adding an arbitrary additional syntactic device, and that addition would result in a loss of the diagram’s visual power. Carroll also represented the universal set by enclosing the diagram, a feature Venn did not think important enough to bother with, but one that is essential in depicting the universe of discourse, a key concept in modern logic discussed by Boole and developed further by Carroll.
Carroll’s fifteen syllogisms can be represented by Venn and even Euler diagrams, but not with the visual clarity of Carroll diagrams. Carroll himself showed this when he presented a solution to a syllogism by Euler’s method, one that involves eighteen diagrams, alongside a solution that Venn provided for the same syllogism, in which, possibly for the first time (it does not appear in the second edition of his symbolic logic book), Venn used a small ‘+’ to indicate a non-empty region. (Carroll 1958, pp. 180-182)
Anthony Macula constructed an iterative method of producing new Carroll diagrams, which he called (k+n)-grams, where k > 4 and a multiple of four and n = 1, 2, 3, 4, by putting the 2^k partitions of a k-gram into each of the partitions of an n-gram. The algorithm constructs a (k+n)-gram for any such k by iteration. It is now easy to see that Dodgson’s description in part I of Symbolic Logic of a 9-set diagram, composed of two 8-set diagrams, one for the inside and one for the outside of the eighth set, results from placing the partitions of an 8-gram into each of the two partitions of a 1-gram; and the 10-set diagram, which he described as an arrangement of four octoliteral diagrams in a square, results from putting the partitions of an 8-gram into each of the four partitions of a 2-gram. We observe that when k > 4, the construction of a new (k+n)-gram reverses the order of insertion of the partitions, because the insertions are of multiples of 4-grams into n-grams. (Carroll 1958, pp. 178-9; Macula 1995, pp. 269-274)
Although Venn’s system is isomorphic to Boole’s logic of classes, it is not isomorphic to a Boolean algebra, because there is no way to illustrate inclusive disjunctive statements, that is, statements other than those expressible in terms of the removal of classes, as in the previous example and in other exclusive disjunctive expressions like x′w(yz′ + y′z), that is to say, what is not x but is w, and is also either y but not z, or z but not y. (Venn 1881, p. 102) Existential statements can be represented in Venn diagrams, and he provided the mechanism in the second edition of Symbolic Logic (actually two different representations: horizontal-line shading and integers). The choice of a small plus sign ‘+’ in a region to indicate that it is not empty appears to have been made after 1894 and was reported by Carroll in his symbolic logic book. (Venn 1971, pp. 131-132; Carroll 1958, p. 174)
In 1959, Trenchard More, Jr. proved what Venn knew to be true, that Venn diagrams can be constructed for any number of simply connected regions. His construction preserves the property Venn deemed essential, that each subregion is simply connected and represents a different combination of overlapping of all the simply connected regions bounded by the Jordan curves. But the diagrams resulting from More’s construction are quite complex and involve what More called a ‘weaving curve’. (More 1959, pp. 303-304)
For a large number of sets, Carroll diagrams are easier to draw because they are self-similar (each diagram remains invariant under a change of scale), discontinuous, and capable of being constructed algorithmically. Their regularity makes it simpler to locate and erase the cells that must be destroyed by the premises of a syllogistic argument, a task that is difficult to accomplish in Venn diagrams for five or more classes. For example, a five-set diagram results from placing a vertical line segment in each of the sixteen partitions of a four-set diagram, and a six-set diagram is obtained by putting the 2^2 partitions of a suitably reduced two-set diagram into each of the sixteen partitions of a four-set diagram; seven-set and eight-set diagrams are similarly constructed. Each k-gram (a k-set diagram) has 2^k partitions: a five-set diagram has thirty-two partitions, while an 8-set diagram has two hundred fifty-six.
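The combinatorial core of this construction, though not the geometry of the drawing, can be sketched in a few lines: each added term splits every existing cell in two, so a k-set diagram has 2^k cells. The helper below is hypothetical, not Carroll’s procedure, and labels cells with primes for complements:

```python
# A hypothetical helper (not Carroll's own procedure) showing the doubling
# construction: each added term splits every cell, giving 2**k cells in all.
# A prime marks the complement, so "xy'" is the cell inside x but outside y.
def carroll_cells(terms):
    cells = ['']
    for t in terms:
        cells = [c + t for c in cells] + [c + t + "'" for c in cells]
    return cells

print(carroll_cells(['x', 'y']))           # ['xy', "x'y", "xy'", "x'y'"]
print(len(carroll_cells(list('abcde'))))   # 32 cells for a five-set diagram
```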
c. Dodgson’s ‘Methods’
Dodgson’s approach to solving logic problems led him to invent various methods. In Symbolic Logic, Part I these are the method of underscoring, the method of subscripts, and the method of diagrams; in part II they are the methods of barred premises and barred groups, although he did not refer to them as ‘methods’, and, most importantly, the method of trees. In Book (chapter) XII of part II of Symbolic Logic, instead of just exhibiting the solution tree piecemeal for a particular problem, he gives a “soliloquy” as he works through it, accompanied by “stage directions” showing what he is doing, so that the reader can construct the tree in an amusing way. Bartley provides many examples of sorites problems solved by the tree method in Book XII of part II of Symbolic Logic, and several intricate puzzle problems solved by the tree method appear in Book XIII.
While his distinction as a logician rests on these visual innovations, Dodgson’s methods depend essentially on his idiosyncratic algebraic notation, which he called the Method of Subscripts. He used letters for terms, which can represent classes or attributes. (In part II of Symbolic Logic, letters are used to represent statements as well.) The subscript 0 on a letter denies the existence of the object; the subscript 1 asserts its existence. When there are two letters in an expression, it does not matter which of them comes first or which is subscripted, because each subscript takes effect back to the beginning of the expression, that is, from right to left.
Bartley observed that existential import is implicit in Dodgson’s Method of Subscripts. Using this notation, Dodgson had no other way to separate subject from predicate; for example, xy1z′0, which expresses ‘all xy are z,’ implies that there are some xy. We can interpret this as either ‘no xy are not-z’ or ‘all xy are z,’ which are equivalent in modern usage; however, Dodgson may not have held existential import as a philosophical belief.
As George Englebretsen points out, “A good notation makes hidden things obvious…Carroll saw his own notation as at least simpler than Boole’s.” (Englebretsen 2007, p. 145)
When did Dodgson first use his tree method? Certainly earlier than 16 July 1894, when he wrote in his diary that he had worked a problem of forty premises; this is the date on which he constructed his last formal method, which he called the Method of Trees. Its essential feature is a reductio ad absurdum approach, a standard proof method in geometry: to prove that the set of retinends (the terms in the conclusion) is a nullity (empty), we begin by assuming instead that it is an entity and then, by a process of deduction, arrive at a contradiction, which proves that the set of retinends is indeed a nullity. Equivalently, when a conclusion following from a set of premises is assumed to be false and reasoning from it together with all the premises results in a contradiction, the original argument is proved valid. Dodgson needed this new formal method because he understood that his diagram method would no longer suffice for these more complicated problems; his is the earliest modern use of a truth tree employed to reason efficiently in the logic of classes.
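A loose sketch of this reductio, under simplifying assumptions (premises given as nullities, and no branching, which Dodgson handled with a separate rule of procedure), might look as follows; the function names are invented:

```python
# A loose sketch of the reductio: assume an individual having all the
# retinend attributes, then let each premise (a nullity, i.e. a combination
# of attributes asserted empty) force further attributes until the branch
# closes in a contradiction. Branching is omitted; names are invented.
def negate(t):
    return t[:-1] if t.endswith("'") else t + "'"

def tree_refutes(premises, retinends):
    known = set(retinends)              # attributes of the assumed individual
    changed = True
    while changed:
        changed = False
        for nullity in premises:
            open_lits = [t for t in nullity if t not in known]
            if not open_lits:
                return True             # every attribute holds: contradiction
            if len(open_lits) == 1 and negate(open_lits[0]) not in known:
                known.add(negate(open_lits[0]))   # forced step
                changed = True
    return False                        # the branch stays open

# ab'0 ("all a are b") and bc'0 ("all b are c"): assuming an individual with
# a and c' closes the branch, so ac'0 ("all a are c") is proved.
print(tree_refutes([('a', "b'"), ('b', "c'")], ('a', "c'")))   # True
```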
On 4 August 1894 he connected his tree method with his Method of Underscoring, writing in his diary, “I have just discovered how to turn a genealogy into a scored Sorites.” (Abeles 1990, p. 30) It appears he planned further work with this method and its natural extensions, barred premises and barred groups.
Three months later, he recorded:
Made a discovery in Logic,…the conversion of a “genealogical’ proof into a regular series of Sorites….Today I hit on the plan of working each column up to the junction – then begin anew with the Prem. just above and work into it the results of the columns, in whatever order works best….This is the only way I know for arranging, as Sorites, a No. of Prems much in excess of the No. of Elims, where every Attribute appears 2 or 3 times in each column of the Table. My example was the last one in the new edition of Keynes. (Wakeling 2005, p.155)
In another letter to Louisa Dodgson, dated 13 November 1896, in which he answered questions she had raised about one of his problems that she was attempting to solve, we again see that Dodgson’s use of his visual methods progressed from his method of diagrams to his method of trees. He wrote:
As to your 4 questions,…The best way to look at the thing is to suppose the Retinends to be Attributes of the Univ. Then imagine a Diagram, assigned to that Univ., and divided, by repeated Dichotomy, for all the Attributes, so as to have 2^n Cells, for n Attributes. (A cheerful Diagram to draw, with, say, 50 Attributes! There would be about 1000,000,000,000 Cells.) If the Tree vanishes, it shows that every Cell is empty. (Weaver Collection, reproduced in Abeles 2005, p. 40)
Dodgson considered the tree method to be superior to the barred premises ‘method’. He wrote:
We shall find that the Method of Trees saves us a great deal of the trouble entailed by the earlier process. In that earlier process we were obliged to keep a careful watch on all the Barred Premisses so as to be sure not to use any such premiss until all its “Bars” had appeared in that Sorites. In this new Method, the Barred Premises all take care of themselves. (Bartley 1977, p. 287)
Before creating his tree method, Dodgson used his ‘method’ of barred premises to guide the generation of the most promising (ordered) lists of premises and partial conclusions for producing the complete conclusion of a sorites. He realized that too many of these lists would not lead to a proper conclusion, so he abandoned this approach in favor of his tree method. Modern automated reasoning programs, by contrast, can use a direct approach, suitably guided to prevent the proving of spurious partial results that are irrelevant to obtaining the complete result.
When Dodgson used his ‘method’ of barred premises to verify a tree, he guided the generation of the ordered lists by employing an ordering strategy now known as unit preference, which selects first the propositions with the fewest terms. In his own words:
“[W]hen there are two Branches, of which one is headed by a single Letter, and the other by a Pair, to take the single Letter first, turn it into a Sorites, and record its Partial Conclusion: then take the double-Letter Branch: turn it also into a Sorites.” (Bartley 1977, p. 295)
When verifying a tree, he also employed a rule to eliminate superfluous premises (those that do not eliminate anything): ignore such a premise, even if doing so causes a branching of the tree. But in the absence of the more powerful inference rules and additional strategies first developed in the twentieth century, he had no way to approach the solution of these multiliteral problems more efficiently.
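Both heuristics are simple enough to state in code. The sketch below is a modern paraphrase, not Dodgson’s own formulation; premises are tuples of attribute letters, with a trailing apostrophe marking a negative term:

```python
# A modern paraphrase of the two heuristics; not Dodgson's own formulation.
def negate(term):
    return term[:-1] if term.endswith("'") else term + "'"

def unit_preference(premises):
    # Take the premise (or branch head) with the fewest letters first.
    return sorted(premises, key=len)

def is_superfluous(premise, others):
    # A premise eliminates nothing if none of its terms meets its
    # contradictory among the other premises; the rule is to ignore it.
    other_terms = {t for p in others for t in p}
    return not any(negate(t) in other_terms for t in premise)

prems = [('a', 'b', "c'"), ("b'",), ('d', 'e')]
print(unit_preference(prems)[0])               # ("b'",) is taken first
print(is_superfluous(('d', 'e'), prems[:2]))   # True: nothing to eliminate
```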
The tree method is an extension of truth tables, and migrating from tables to trees is easy to do. (For a complete discussion of this topic, see Anellis 2004.) Using truth tables to verify inconsistency is straightforward but very inefficient, as anyone who has worked with truth tables involving eight or more cases knows. The truth tree method instead examines sets of cases simultaneously, making it efficient to test the validity of arguments involving a very large number of sentences, whether by hand or with a computer. To test the validity of an argument consisting of two premises and a conclusion (equivalently, to determine whether the set consisting of the two premise sentences and the denial of the conclusion sentence is inconsistent) by the method of truth tables involving, say, three terms requires calculating the truth values in eight cases to determine whether there is any case in which all three sentences are true. A finished closed tree establishes the validity of the argument by showing that there are no cases in which the three sentences are true; if any path in a finished tree cannot be closed, the argument is invalid, because an open path represents a set of counterexamples.
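For comparison, here is a minimal sketch of the truth-table test itself, enumerating all 2^n rows and looking for a counterexample row; the argument and helper names are illustrative:

```python
from itertools import product

# A sketch of the truth-table test: an argument is valid iff no row makes
# every premise true and the conclusion false (2**3 = 8 rows for 3 terms).
def table_valid(premises, conclusion, names):
    for row in product([False, True], repeat=len(names)):
        v = dict(zip(names, row))
        if all(p(v) for p in premises) and not conclusion(v):
            return False      # a counterexample row: an open path, in effect
    return True

# if p then q; p; therefore q (modus ponens), with material implication
print(table_valid([lambda v: (not v['p']) or v['q'], lambda v: v['p']],
                  lambda v: v['q'], ('p', 'q')))   # True
```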
On 16 July 1894 Dodgson connected his tree method with his earlier work, the Method of Diagrams. He wrote, ‘It occurred to me to try a complex Sorites by the method I have been using for ascertaining what cells, if any, survive for possible occupation when certain nullities are given.’ (Bartley 1977, p. 279)
The journal’s editor, in a note to the article “Lewis Carroll’s Method of Trees: Its Origins in Studies in Logic,” remarked:
The trees developed by Carroll in 1894, which anticipate concepts later articulated by Beth in his development of deductive and semantic tableaux, have their roots in the work of Charles Peirce, Peirce’s students and colleagues, and in particular in Peirce’s own existential graphs. (Anellis 1990, p. 22)
In a comprehensive article of his own, Anellis suggested that “Perhaps this valuable contribution to proof theory [Dodgson’s tree method] ought to be called the Hintikka-Smullyan tree method, or even the Dodgson-Hintikka-Smullyan tree….” (Anellis 1990, p. 62)
In the eight books or chapters of Symbolic Logic, Part I. Elementary, Carroll introduces the concepts of things and their attributes, propositions and their types, diagrams and the diagrammatic method, syllogisms and their types, the more complex soriteses, and the two methods of subscripts and of underscoring.
While contemporaries such as Venn used diagrams to represent logical problems, Dodgson took the visual approach to a new level with his Method of Trees, one of two additional methods of formal logic he presented in part II of Symbolic Logic. The first, a direct approach to the solution of multiliteral soriteses that he called barred premises, is an extension of his underscoring method. A barred premise is one in which a term t occurs in one premise and its negative t′ occurs in two or more premises, and conversely. For example, if a premise contains the term a and the two eliminand terms b and c, then the nullity abc implies that a has the pair of attributes bc′ or b′c or b′c′; that is, a is barred by the nullity from having the attributes bc.
Dodgson extended this idea to what he called a barred group: a term t occurring in two or more premises while its negative t′ also occurs in two or more premises. His rule for working with barred premises requires that all the premises barring a given premise be used first. Dodgson did not define this method explicitly, so we may call these definitions, together with the rule for working with them, his Method of Barred Premises. It is an early formal technique for guiding the order in which the premises of a sorites are used to arrive at the conclusion.
It appears he planned further work with his tree method and his method of barred groups. In an unpublished letter whose first page is missing, probably from late 1896 or early 1897, he wrote, most probably to his sister Louisa:
I have been thinking of that matter of “Barred Groups”…. It belongs to a most fascinating branch of the Subject, which I mean to call “The Theory of Inference”:…. Here is one theorem. I believe that, if you construct a Sorites, which will eliminate all along, and will give the aggregate of the Retinends as a Nullity, and if you introduce in it the same letter, 2 or 3 times, as an Eliminand, and its Contradictory the same number of times, and eliminate it each time it occurs, you will find, if you solve it as a Tree, that you don’t use all the Premisses! (Weaver Collection, undated; reproduced in Abeles 2005, p. 40)
An example, called ‘The Pigs and Balloons Problem,’ appears in Bartley on pp. 378-380. There Dodgson created a Register of Attributes showing the eliminands (terms that appear in both rows of the Register, that is, in positive and in negative form in two premises). When a term appears in both rows and in one row in more than two premises, we have the case of barred premises; all other terms are retinends.
His almost obsessive concern with exactness introduced a certain stiffness into many of his serious mathematical writings, but the humor he used is infectious and infuses these works, particularly those on logic, with an appealing lightness. That his use of humor set his work apart is apparent in the reviews of Symbolic Logic, Part I: Elementary that appeared during his lifetime.
An anonymous reviewer of the book wrote in The Educational Times that “[T]his very uncommon exposition of elementary logic appears to have tickled the fancy of folk.” (July 1, 1896, p. 316) The quotations that continue to be cited by modern authors, particularly from his logic books, reinforce this view. The reaction of the mathematician Hugh MacColl, the anonymous reviewer of Symbolic Logic, Part I: Elementary in The Athenaeum, was mixed, however. He described Carroll’s diagrammatic method for solving logical problems as elegant, but he was critical of Carroll’s notation (the subscript method) and of his use of existential import, which asserts the existence of the subject in A propositions; the proposition “All philosophers are logical,” for example, implies the existence of at least one philosopher. MacColl added, ‘[W]e cannot say what important surprises parts ii. and iii. of his system may have in store for us when they make their appearance.’ (October 17, 1896, pp. 520-521)
Hugh MacColl’s views on logic were influenced by reading Dodgson’s Symbolic Logic, Part I. Both MacColl and Dodgson were active contributors to the ‘Mathematical Questions and Solutions’ section of The Educational Times, and at least once they were concerned with the same question in probability. MacColl submitted a solution to Dodgson’s logical problem, Question 14122, a version of the Barbershop Paradox published posthumously.
In addition to the clear exposition and unusual style that characterize his books, there seems to be one more essential affinity that supported MacColl’s attraction to Carroll’s work: their exchanges show that both had a deep interest in the precise use of words, and both saw no harm in attributing arbitrary meanings to words, as long as the meaning was precise and the attribution agreed upon.
Between August and December of 1894, Dodgson appears to have been considering a direction that Hugh MacColl developed more formally as early as 1896-97 and expanded in his 1906 book, Symbolic Logic and Its Applications, where he defined strict implication, in which the content of the antecedent and the consequent bears on the validity of the conditional, twenty years before modal logic began to be placed on a modern footing with the work of the American philosopher and logician Clarence Irving Lewis.
4. The Automation of Deduction
The beginning of the automation of deduction goes back to the 1920s and the work of Thoralf Skolem, who studied the problem of the existence of a model satisfying a given formula and who introduced functions to handle universal and existential quantifiers. Other logicians, such as David Hilbert, Wilhelm Ackermann, Leopold Löwenheim, Jacques Herbrand, Emil Post, and, a little later, Alonzo Church, Kurt Gödel, and Alan Turing, introduced additional important ideas. One of the most important, a consequence of Hilbert’s metamathematical framework, was the notion that formalized logic systems can themselves be the subject of mathematical investigation. But it was not until the 1950s that computer programs, using a tree as the essential data structure, were used to prove mathematical theorems.
The focus of these early programs was on proofs of theorems of propositional and predicate logic. Describing the 1957 ‘logic machine’ of Newell, Shaw, and Simon, Martin Davis noted that a directed path in a tree gave the proof of a valid argument where its premises and conclusion were represented as nodes, and an edge joining two premise nodes represented a valid derivation according to a given set of rules for deriving the proofs.
The modern tree method, as a decision procedure for classical propositional logic and for first-order logic, originated in Gerhard Gentzen’s work on natural deduction, particularly his formulation of the sequent calculus known as LK. But the route was not a direct one, the main contributors being Evert W. Beth, Jaakko Hintikka, Raymond Smullyan, and Richard Jeffrey. In 1955, Beth presented a tableau method he had devised, consisting of two trees, that enables a systematic search for a refutation of a given (true) sequent. A tree is a left-sided Beth tableau in which all the formulae are true, and the rules for decomposing the tree, that is, the inference rules, are equivalent to Gentzen’s rules in his sequent calculus.
Bartley had this to say about Dodgson’s tree method for reaching valid conclusions from sorites and puzzle problems:
Carroll’s procedure bears a striking resemblance to the trees employed . . .according to a method of ‘Semantic Tableaux’ published in 1955 by the Dutch logician, E. W. Beth. The basic ideas are identical. (Bartley 1977, p. 32)
Dodgson was the first person in modern times to apply a mechanical procedure, his tree method, to demonstrate the validity of the conclusions of certain complex problems. The tree method is a direct extension of truth tables, and Dodgson had worked with an incomplete truth table in one of the solutions he gave to his Barbershop Problem in September 1894. Bartley writes, “The matrix is used…for the components; but the analysis and assignment of truth values to the compounds are conducted in prose commentary on the table.” (Bartley 1977, p. 465n.)
On 4 August 1894, he connected the tree method with a scored sorites:
I have just discovered how to turn a genealogy into a scored Sorites: the difficulty is to deal with forks. Say ‘all a is b or c’ = ‘all A is b’ and ‘all α is c,’ where the two sets A, α make up a. Then prove each column separately. (Wakeling, 2005, p. 158)
On 30 October, using a problem from a new edition of Keynes’s book, Studies and Exercises in Formal Logic, he discovered how to navigate a tree representing a sorites with twenty-one premises having ten attributes, of which eight are eliminated. (Wakeling 2005, p. 181)
When an open branch is divided into two branches and a term, here bʹ, appears in one of the branches while its negation is added to the other, we have an example of the use of the cut rule; Dodgson anticipated a device that was not fully worked out until the 1930s. He wrote:
It is worthwhile to note that in each case, we tack on to one of the single Letters, the Contradictory of the other: this fact should be remembered as a rule….We have now got a Rule of Procedure, to be observed whenever we are obliged to divide our Tree into two Branches. (Bartley 1977, p. 287)
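As a rough modern paraphrase, the quoted rule of procedure might be rendered as follows, with invented names and sets of attribute letters standing in for branches:

```python
# A rough paraphrase of the quoted rule: when a tree divides on a single
# letter versus a pair, the pair's branch also receives the contradictory
# of the single letter. Branches are modelled as sets of attribute letters.
def negate(term):
    return term[:-1] if term.endswith("'") else term + "'"

def divide(branch, single, pair):
    one_letter_branch = branch | {single}
    pair_branch = branch | set(pair) | {negate(single)}  # tack on the contradictory
    return one_letter_branch, pair_branch

left, right = divide({'a'}, 'b', ('c', "d'"))
print(sorted(left), sorted(right))   # the right branch gains b'
```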
He continued to discover new ways to improve his handling of trees, recording in his diary on November 12/13, 1896, “Discovered [a] method of combining 2 Trees, which prove abcʹ0 † abdʹ0, into one proving ab(cd)ʹ0, by using the Axiom cd(cd)ʹ0.” (Wakeling 2005, p. 279)
In an exchange of letters in October and November of 1896 with John Cook Wilson, Wykeham Professor of Logic at Oxford, Dodgson modified an eighteen-premise version of a problem containing superfluous premises to one with fifteen premises. Bartley includes both versions as well as their solutions by the tree method.
In an unpublished letter to Cook Wilson dated 25 September 1896, in connection with a sorites problem, Dodgson wrote:
What you say about ‘superfluous Premisses’ interests me very much. It is a matter that beats me, at present . . . &, if you can formulate any proof enabling you to say ‘all the Premises are certainly needed to prove the Conclusion,’ I shall be very glad to see it. (Dodgson, Sparrow Collection, 25 September 1896. Courtesy Morton N. Cohen)
The difficulty of establishing a theorem to determine superfluous premises troubled him; it was a problem he was unable to solve.
5. Dodgson’s Logic Circle
John Venn was another English logician whose work Dodgson was familiar with and with whom he had contact. Venn, a supporter of Boole’s approach to logic, published the first edition of his Symbolic Logic in 1881. It included his now familiar diagrams to depict the relations between classes so that the truth or falsity of propositions employing them could be established.
In 1892 William E. Johnson published the first of three papers in Mind titled “The Logical Calculus”, in which he distinguished the term ‘conditional’ from the term ‘hypothetical’. Dodgson, like most logicians of his time, did not make this distinction, using the term ‘hypothetical’ for both. In Johnson’s view, a conditional expresses a relation between two phenomena, while a hypothetical expresses a relation between two propositions of independent import; that is, a conditional connects two terms, while a hypothetical connects two propositions. John Neville Keynes, with whose work Dodgson was quite familiar, agreed with Johnson. Venn, although he too knew Johnson’s work, held a very different view of hypotheticals, contending that because they are of a non-formal nature, they should not really be considered part of symbolic logic.
William Stanley Jevons was another supporter of Boole, and Dodgson owned his books Pure Logic; or, the Logic of Quality Apart from Quantity (1864) and The Principles of Science: A Treatise on Logic and Scientific Method (1874). Jevons introduced a logical alphabet for class logic in 1869, and the following year he exhibited to the Royal Society in London a machine, which he called the logical piano, that used it to solve problems in logic mechanically.
Dodgson was very familiar with Keynes’s Studies and Exercises in Formal Logic in its second edition of 1887, quoting directly from it in chapter II of Book X in part II of Symbolic Logic. Keynes, in turn, included Dodgson’s Barbershop Paradox as an exercise in chapter IX of the 1906 edition of his book. (Keynes 1906, pp. 273-274)
a. The ‘Alice’ Effect
From an exchange of letters between Venn and Dodgson in 1894, and from the reviews that appeared soon after the publication of The Game of Logic and Symbolic Logic, Part I, we see that Dodgson’s reputation as the author of the ‘Alice’ books cast him primarily as a writer of children’s books and prevented his logic books from being taken seriously. The barrier created by the fame Carroll deservedly earned from the Alice books, combined with a writing style more literary than mathematical, kept the community of British logicians from properly recognizing him as a significant logician.
That this was his reputation is apparent in the reviews of Symbolic Logic, Part I that appeared during his lifetime, and certainly most of his contemporaries were unaware of the importance of the diagrammatic method for solving syllogisms that he had first presented in The Game of Logic. In an unpublished letter to Venn dated 11 August 1894, he wrote:
‘You are quite welcome to make any use you like of the problem I sent you, & (of course) to refer to the article in ‘Mind’ – [A Logical Paradox, N. S. v. 3, 1894, pp. 436-438, concerning an example of hypothetical propositions]. Your letter has, I see, crossed one from me, in which I sent you ‘Nemo’s algebraical illustration. I hope you may be able to find room for it in your next book. Perhaps you could add it, as a note, at the end of the book, & give it, at p. 442, a reference thereto? I shall be grateful if you will not mention to anyone my real name, in connection with my pseudonym. I look forward with pleasure to studying the new edition of your book.’ (Venn Papers, Gonville and Caius Libraries, Cambridge University)
And on p. 442 of the second revised edition of his Symbolic Logic Venn wrote:
[T]hat the phrase ‘x implies y’ does not imply that the facts concerned are known to be connected, or that the one proposition is formally inferrible from the other. This particular aspect of the question will very likely be familiar to some of my readers from a problem recently circulated, for comparison of opinions, amongst logicians. As the proposer is, to the general reader, better known in a very different branch of literature, I will call it Alice’s Problem.
6. Logic Paradoxes
a. The Barbershop Paradox
An appendix to Book XXI contains eight versions of Dodgson’s Barbershop Paradox, one of which was published in Mind as “A Logical Paradox”. In another appendix to this book, Bartley discusses Carroll’s other contribution to Mind, “What the Tortoise Said to Achilles.” These two appendices make the issues Carroll dealt with in these published articles, along with the commentaries they engendered from modern logicians and philosophers, much more accessible.
The Barbershop Problem was Dodgson’s first publication in the journal Mind. It is the transcription of a dispute between him and John Cook Wilson. Venn was one of the first to discuss it in print, in the second edition of his Symbolic Logic, and Bertrand Russell later used the problem in his Principles of Mathematics to illustrate his principle that a false proposition implies all others.
In the Barbershop Paradox, there are two rules governing the movements of three barbers, Allen, Brown, and Carr. The first is that when Carr goes out, then if Allen goes out, Brown stays in. The second is that when Allen goes out, Brown goes out. The challenge is to use these rules to determine Carr’s possible movements. In a lively two-year correspondence beginning in late 1892, preserved in the Bodleian Library, Dodgson and Cook Wilson honed their differing views on the paradox; Cook Wilson believed that all propositions are categorical and therefore that hypotheticals could not be propositions.
The unsettled nature of the topic of hypotheticals during Dodgson’s lifetime is apparent at the beginning of the Note that Carroll wrote at the end of his article:
This paradox…is, I have reason to believe, a very real difficulty in the Theory of Hypotheticals. The disputed point has been for some time under discussion by several practised logicians, to whom I have submitted it; and the various and conflicting opinions, which my correspondence with them has elicited, convince me that the subject needs further consideration, in order that logical teachers and writers may come to some agreement as to what Hypotheticals are, and how they ought to be treated. (Carroll 1894, p. 438)
Bartley remarks in his book that the Barbershop Paradox is not a genuine logical paradox as the Liar Paradox is; generally, a paradox is a statement that appears to be either self-contradictory or contrary to expectations.
The many versions of the Barbershop Paradox that Dodgson developed demonstrate the evolution of his thinking about hypotheticals and about material implication, in which the truth of the conditional (if (antecedent), then (consequent)) depends only on the truth values of its parts and not on any connection of content between them; this account of the conditional descends from Boole’s logic. Six versions of the Barbershop Paradox provide insight into Dodgson’s thinking about the problem as it evolved. Bartley published five of these six, as well as three others, two of which are examples (one is almost the same as one of the others Bartley published); there are also three earlier versions, all from March 1894, that Bartley did not publish.
Earlier versions of the Barbershop Paradox show the change in the way Dodgson represented conditionals: in the earlier versions, he expressed a hypothetical proposition in terms of classes, that is, ‘if A is B, then C is D’; only later did he designate A, B, C, and D as propositions.
A version of the Barbershop Paradox that Bartley did not recognize as such, Question 14122, was published in February 1899 in The Educational Times, after Dodgson’s death, and reprinted in Mathematical Questions and Solutions the following year. Two different solutions appeared that same year, one by Harold Worthington Curjel, a member of the London Mathematical Society, the other by Hugh MacColl. (For a more detailed discussion of the Barbershop Paradox, see A. Moktefi’s publications.)
The article “A Logical Paradox”, published in Mind in 1894, generated responses in subsequent Mind articles by many of the eminent logicians of Dodgson’s time, including Hugh MacColl; E. E. Constance Jones, Lecturer in Logic at Girton, one of the women’s colleges at Cambridge; Alfred Sidgwick, author of Fallacies: A View of Logic from the Practical Side; as well as Johnson, Keynes, Cook Wilson, and Russell.
A letter dated 11 August 1894 from Dodgson to John Venn resulted in Venn including a version of the Barbershop Paradox in the second edition (1894) of his book, Symbolic Logic. Keynes included a version of the Barbershop Paradox in his book, and Bradley discussed it in a volume of his Selected Correspondence.
Bertrand Russell gave what is now the generally accepted conclusion to this problem in his 1903 book, The Principles of Mathematics. If p represents ‘Carr is out’, q represents ‘Allen is out’, and r represents ‘Brown is out’, then the Barbershop Paradox can be written as: (1) q implies r; (2) p implies that q implies not-r. Russell asserted that the only correct inference from (1) and (2) is: if p is true, q is false; that is, if Carr is out, Allen is in. (Russell 1903, p. 18)
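Russell’s verdict can be checked mechanically. The following Python sketch (not Russell’s own, of course; the encoding of p, q, and r follows the paragraph above) enumerates all eight truth-value assignments and confirms that the two premises entail ‘if Carr is out, Allen is in’ but do not force ‘Carr is in’:

```python
from itertools import product

# p: 'Carr is out'; q: 'Allen is out'; r: 'Brown is out'
def implies(a, b):
    # material implication: false only when a is true and b is false
    return (not a) or b

rows = list(product([True, False], repeat=3))

def premises(p, q, r):
    return implies(q, r) and implies(p, implies(q, not r))

# Russell's conclusion, p implies not-q, holds in every admissible row...
assert all(implies(p, not q) for p, q, r in rows if premises(p, q, r))
# ...but 'Carr is in' (not-p) is not forced: some admissible row has p true.
assert any(p for p, q, r in rows if premises(p, q, r))
print("(1) and (2) entail 'if Carr is out, Allen is in', not 'Carr is in'.")
```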
b. Achilles and the Tortoise
Dodgson published a more consequential paradox in Mind the following year: “What the Tortoise Said to Achilles.” Although it did not generate any responses during Dodgson’s lifetime, it has drawn many responses since his death, and it remains an open problem to this day. (See Moktefi and Abeles 2016.)
This is the paradox:
(A) Things that are equal to the same thing are equal to each other.
(B) The two sides of this triangle are things that are equal to the same.
(Z) The two sides of this triangle are equal to each other.
Dodgson was the first to recognize that when making a logical inference, the rule that permits drawing a conclusion from the premises cannot be considered to be a further premise without generating an infinite regress.
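Using the labels above, the regress runs as follows (a compressed sketch of Carroll’s own sequence): having granted (A) and (B), the Tortoise refuses to accept (Z) until the rule licensing the step is itself written down as a premise:

(C) If (A) and (B) are true, (Z) must be true.
(D) If (A), (B), and (C) are true, (Z) must be true.

and so on without end, with the conclusion (Z) never reached.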
Both the Barbershop and Achilles paradoxes involve conditionals, and Dodgson employed material implication to argue them, though he was uncomfortable with it. He struggled with several additional issues surrounding hypotheticals. In the Note to the published version of the Barbershop Paradox in July 1894, Dodgson asked several questions: first, whether a hypothetical can be legitimate when its antecedent is false; second, whether two hypotheticals of the forms ‘if A then B’ and ‘if A then not-B’ can be compatible.
Bartley published a second edition of Symbolic Logic, Part II in 1986 in which he included solutions to some of Carroll’s more significant problems and puzzles, additional galley proof discoveries, and a new interpretation, by Mark R. Richards, of Carroll’s logical charts.
By 1897, Dodgson may have been rethinking his use of existential import. Bartley cites a diary entry from 1896 and an undated letter to Cook Wilson as evidence (Bartley 1977, pp. 34-35). However, there is further evidence, incomplete in Bartley’s book, to support this break with the idea of existential import. Book (chapter) XXII contains Dodgson’s solutions to problems posed by other logicians. One of these, a solution to a problem posed by Augustus De Morgan concerning propositions and the existence of their subjects, appears in an unaddressed letter dated 15 March 1897 (Bartley 1977, pp. 480-481). From Dodgson’s response six days later, we now know the letter was sent to his sister, Louisa, in reply to her solution of the problem. In this unpublished letter, Dodgson suggested:
[I]f you take into account the question of existence and assume that each Proposition implies the existence of its Subject, & therefore of its Predicate, then you certainly do get differences between them: each implies certain existences not implied by the others. But this complicates the matter: & I think it makes a neater problem to agree (as I shall propose to do in my solution of it) that the Propositions shall not be understood to imply existence of these relationships, but shall only be understood to assert that, if such & such relationships did exist, then certain results would follow. (Dodgson, Berol Collection, New York University, 21 March 1897)
7. Dodgson and Modern Mathematics
In Part II of Symbolic Logic, Dodgson’s approach led him to invent various methods that lend themselves to mechanical reasoning. These are the methods of barred premises and barred groups and, most importantly, the method of trees. Although Dodgson worked with a restricted form of the logic of classes and used rather awkward notation and odd names, the methods he introduced foreshadowed modern concepts and techniques in automated reasoning such as truth trees, binary resolution, unit preference and set of support strategies, and refutation completeness.
His system of logic diagrams is a sound and complete proof system for syllogisms. Soundness ensures that only conclusions that actually follow from the premises can be deduced. (A proof system is sound if and only if the conclusions we can derive from the premises are logical consequences of them.) Completeness guarantees that every conclusion that follows can be deduced. (A proof system is complete if and only if, whenever a set of premises logically implies a conclusion, we can derive that conclusion from those premises.)
Several of the methods Dodgson used in his Symbolic Logic contain kernels of concepts and techniques that have been employed in automatic theorem proving beginning in the twentieth century. The focus of these early programs was on proofs of theorems of propositional and predicate logic.
His only inference rule, underscoring, takes two propositions, selects in each a term with the same subject or predicate but opposite signs, and yields another proposition; it is an example of binary resolution, the most important of these early proof methods in automated deduction.
Although Dodgson did not take the next step, attaching the idea of inconsistency to the set of premises and conclusion, his method for handling multi-literal syllogisms in the first figure is a formal test for inconsistency that qualifies as a finite refutation of the set of eliminands and retinends. His construction of a tree uses one inference rule (algorithm), binary resolution, and he guides the tree’s development with a restriction strategy, now known as a set of support, that applies binary resolution at each subsequent step of the deduction only if the preceding step has been deduced from a subset of the premises and the denial of the conclusion, that is, from the set of retinends. This strategy improves the efficiency of reasoning by preventing the pursuit of fruitless paths. The tree test is both sound and complete: if the initial set of premises and conclusion is consistent, there will be an open path through the finished tree (soundness); and if there is an open path in the finished tree, the initial set is consistent (completeness).
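The parallel with modern automated deduction can be made concrete. The following Python sketch is not Dodgson’s notation but a minimal propositional implementation of binary resolution with a set-of-support restriction; clauses are sets of signed literals, and the toy sorites at the end is invented for illustration:

```python
# A clause is a frozenset of literals; a literal is a (name, sign) pair.
def resolve(c1, c2):
    """Binary resolution: cancel one complementary literal pair, if any."""
    for (name, sign) in c1:
        if (name, not sign) in c2:
            return frozenset((c1 - {(name, sign)}) | (c2 - {(name, not sign)}))
    return None

def refute(premises, conclusion_denial):
    """Set-of-support resolution: every step involves a supported clause."""
    usable = set(premises)
    support = {conclusion_denial}
    while support:
        c = support.pop()
        usable.add(c)
        for d in list(usable):
            r = resolve(c, d)
            if r is None or r in usable or r in support:
                continue
            if not r:                 # empty clause: contradiction found
                return True
            support.add(r)
    return False                      # no refutation: the set is consistent

# Toy sorites: from "all a are b" and "all b are c" plus the fact a,
# refute the denial of c (so c follows from the premises).
premises = [frozenset({("a", False), ("b", True)}),
            frozenset({("b", False), ("c", True)}),
            frozenset({("a", True)})]
print(refute(premises, frozenset({("c", False)})))  # True: refutation found
```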
A comparison of the two parts of Symbolic Logic reveals the progress Dodgson made toward an automated approach to the solution of multiply connected syllogistic problems (soriteses), and puzzle problems bearing intriguing names such as “The Problem of Grocers on Bicycles”, and “The Pigs and Balloons Problem”.
Many modern automated reasoning programs employ a reductio ad absurdum argument, while other reasoning programs, used to find additional information, do not seek to establish a contradiction. In 1985, one of Dodgson’s puzzle problems, the “Problem of the School Boys,” was modified by Ewing Lusk and Ross Overbeek to be compatible with the direct generation of statements (in clausal form) by an automated reasoning program. Their program first produced a weaker conclusion before generating the same stronger conclusion Dodgson had obtained using his tree method. Solutions to Dodgson’s “Salt and Mustard Problem,” by Lusk and Overbeek in 1985 and by A. G. Cohn in 1989, used a many-sorted logic to illustrate the power of two of these programs.
In computer science, a database has a state, which is a value for each of its elements. A trigger can test a condition specified by a when clause; that is, a certain action will be executed only if the rule is triggered and the condition holds when the triggering event occurs.
Dodgson defined the term Cosmophase as “[t]he state of the Universe at some particular moment: and I regard any Proposition, which is true at that moment, as an Attribute of that Cosmophase.” (Bartley 1977, p. 481) Curiously, Dodgson’s definition of a Cosmophase fits nicely into this modern framework.
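To make the analogy concrete, here is a minimal, purely illustrative Python sketch of an event-condition-action trigger (all names are invented for this example). The state dictionary plays the role of a Cosmophase: a proposition true at that moment is a condition holding in that state.

```python
class Database:
    """A toy database: its state assigns a value to each element."""
    def __init__(self, state):
        self.state = dict(state)
        self.rules = []                     # (event, condition, action) triples

    def on(self, event, condition, action):
        """Register a trigger whose 'when' clause is the condition."""
        self.rules.append((event, condition, action))

    def update(self, key, value):
        """The triggering event: change one element of the state."""
        self.state[key] = value
        for event, condition, action in self.rules:
            # fire only if the rule matches and the condition holds now
            if event == key and condition(self.state):
                action(self.state)

db = Database({"carr_out": False, "allen_out": False})
db.on("carr_out",
      condition=lambda s: s["carr_out"] and not s["allen_out"],
      action=lambda s: print("Carr went out while Allen stayed in."))
db.update("carr_out", True)   # event occurs; condition holds; action fires
```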
8. Carroll as Popularizer
Dodgson was a popularizer and an educator of both mathematics and logic. He began teaching mathematics at St. Aldate’s School, across from Christ Church, in 1856. He considered The Game of Logic and, to a greater degree, Symbolic Logic, Part I. Elementary, to be far superior to the textbooks then in use, and to be suitable for teaching students between the ages of twelve and fourteen. The objective of the game, played with a board and counters, was to solve syllogisms. He believed his entire Symbolic Logic book, including the projected Parts II and III, would appeal to pupils up to the age of twenty, and hence be useful at the university level.
While he was the Mathematical Lecturer at Christ Church, he often gave free private instruction in private homes to family groups of parents, their children, and their children’s friends, on such mathematical topics as ciphers (particularly his Memoria Technica cipher), arithmetical and algebraical puzzles, and an algorithmic method for finding the day of the week for any given date. He originally created the Memoria Technica cipher in 1875 to calculate logarithms but found many more uses for it as a general aid to memory, writing a simplified version of it for teaching purposes in 1888.
The topics he chose to teach privately focused on memory aids, number tricks, computational shortcuts, and problems suited to rapid mental calculation; he developed this last topic into a book, Curiosa Mathematica, Part II: Pillow Problems, published in 1893. He continued to provide instruction in this way on logic topics, and also gave logic lessons in his rooms at Christ Church. In June 1886 he gave lectures at Lady Margaret Hall, Oxford, and in May 1887 at the Oxford High School for Girls, where he lectured to the students and, separately, to their teachers. He gave lectures at St. Hugh’s Hall, another of the women’s colleges at Oxford, in May and June of 1894. In January 1897 he began a course of lectures on symbolic logic at Abbot’s Hospital in Guildford.
He used material that he eventually incorporated into his book, The Game of Logic, a work he had essentially completed in July 1886 but that did not appear until November, in an edition Dodgson rejected as substandard. The second (published) edition came out in February of the following year. Dodgson hoped the book would appeal to young people as an amusing mental recreation.
On 21 August 1894, answering a letter from a former child friend, Mary Brown, now aged thirty-two, he wrote:
You ask what books I have done…. At present I’m hard at work (and have been for months) on my Logic-book. (It really has been on hand for a dozen years: the “months” refer to preparing for the Press.) It is Symbolic Logic, in 3 Parts – and Part I is to be easy enough for boys and girls of (say) 12 or 14. I greatly hope it will get into High Schools, etc. I’ve been teaching it at Oxford to a class of girls at the High School, another class of the mistresses(!), and another class of girls at one of the Ladies’ Colleges. (Cohen 1979, p. 1031)
In a letter dated 25 November 1894 to his sister, Elizabeth, he wrote:
One great use of the study of Logic (which I am doing my best to popularise) would be to help people who have religious difficulties to deal with, by making them see the absolute necessity of having clear definitions, so that, before entering on the discussion of any of these puzzling matters, they may have a clear idea what it is they are talking about. (Cohen 1979, p. 1041)
The statements of almost all the problems in both parts of Symbolic Logic are amusing to read. This quality reflects the announced purpose of the books: to popularize the subject. But Dodgson naturally incorporated humor into much of his serious mathematical writing, infusing this work with the mark of his literary genius.
Edward Wakeling notes that Dodgson’s logic teaching took three forms: a series of lessons in a school, lessons to a small group of friends or families he knew, or the teaching of a single confident, intelligent, and alert child-friend. This last method was his favorite. Edith Rix, to whom he dedicated A Tangled Tale (1885) in the form of an eight-line acrostic poem in which the second letter of each line spells her name, was his first logic pupil. Dodgson wrote many letters to her concerning problems in logic. He is reported to have said that she was the cleverest woman he ever knew.
In the Appendix Addressed to Teachers in Part I of Symbolic Logic, fourth edition, Carroll indicated some of the topics he planned for Part II. These include “[T]he very puzzling subjects of Hypotheticals, Dilemmas, and Paradoxes.” (Bartley 1977, p. 229) Dodgson was generally interested in the quality of arguments, particularly those that could confuse. Paradoxes fall into this category because they appear to prove what is known to be false. And paradoxes certainly challenged him to create ingenious methods to solve them, such as his tree method.
Dodgson expressed his thoughts about how best to teach logic to young people in “A Fascinating Mental Recreation for the Young” when he wrote:
As to the first popular idea – that Logic is much too hard for ordinary folk, and specially for children, I can only say that I have taught the method of Symbolic Logic to many children, with entire success…High-School girls take to it readily. I have had classes of such girls, and also of the mistresses,….As to Symbolic Logic being dry, and uninteresting, I can only say, try it! I have amused myself with various scientific pursuits for some forty years and have found none to rival it for sustained and entrancing attractiveness. (Carroll 1896, reproduced in Abeles 2010, pp. 96-97)
9. Conclusion
The inspiration for much of what Dodgson wrote about logic came from his contacts with faculty members at other colleges in Oxford, in Cambridge, and elsewhere. He communicated his work within a circle of colleagues and solicited their opinions. Unlike most of them, he did not seek membership in the professional mathematical and philosophical societies, and, with few exceptions, he did not attend their meetings or give lectures. He was not a traditional mathematician; rather, he applied mathematical and logical solutions to problems that interested him. As a natural logician at a time when logic was not considered to be a part of mathematics, he successfully worked in both fields.
Although the ingenuity of the puzzles and examples Dodgson created was generally applauded, Bartley’s claims about the significance of Dodgson’s work were questioned, so its value for the development of logic was not fully appreciated when the book was first published. Subsequently, however, other scholars working on Carroll’s logical and mathematical writings, such as Duncan Black, George Englebretsen, Amirouche Moktefi, Adrian Rice, Mark Richards, Eugene Seneta, Edward Wakeling, and Robin Wilson, have made important discoveries that have greatly enhanced Carroll’s reputation.
Why did scholars become interested in Dodgson’s serious work only in the second half of the twentieth century? In addition to Bartley’s publication of Carroll’s Symbolic Logic, there are several reasons. One of the most important is the role certain publishers played in making his work available: Clarkson N. Potter and Dover in the USA, and Kluwer in the Netherlands, whose books were distributed in both the USA and the UK. The articles in Martin Gardner’s popular “Mathematical Games” column in Scientific American also presented several of Dodgson’s mathematical ideas and were invaluable sources of information for scholars. Another important reason is that only in the twentieth century did some of his mathematical and logical ideas find application, in the sense that his work foreshadowed their use. Dodgson’s mathematical and logical work was broadly based, but his influence on important developments in the twentieth century occurred primarily after his death.
10. References and Further Reading
a. Primary
Boole, G. An Investigation of the Laws of Thought. London, Macmillan, 1854.
Boole, G. The Mathematical Analysis of Logic. London, Macmillan, 1847.
Bradley, F.H. The Principles of Logic, London, Oxford University Press, 1883.
Carroll, L. The Game of Logic. London, Macmillan, 1887.
Carroll, L. Symbolic Logic: Part I. London, Macmillan, 1896.
Carroll, L. The Game of Logic. Published with Symbolic Logic, Part I, as The Mathematical Recreations of Lewis Carroll, New York, Dover, 1958.
Carroll, L. “A Logical Paradox.” Mind, v. 3, n. 11, 1894, pp. 436-438.
Carroll, L. “What the Tortoise Said to Achilles.” Mind, v. 4, n. 14, 1895, pp. 278-280.
Cohen, M. N. The Letters of Lewis Carroll. 2 vols. New York, Oxford University Press, 1979.
De Morgan, A. Formal Logic. London, Taylor & Walton, 1847.
De Morgan, A. On the Syllogism and Other Logical Writings. London, Routledge & Kegan Paul, 1966.
Dodgson, C. L. Euclid and his Modern Rivals. London, Macmillan, 1879.
Dodgson, C. L. Curiosa Mathematica. Part I: A New Theory of Parallels. London, Macmillan, 1888.
Dodgson, C. L. Curiosa Mathematica. Part II: Pillow Problems. London, Macmillan, 1893.
Jevons, W. S. Pure Logic, or the Logic of Quality Apart from Quantity, London, E. Stanford, 1864.
Johnson, W. E. “The Logical Calculus I, II, III.” Mind, v. 1, 1892, I: pp. 3-30; II: pp. 235-250; III: pp. 340-357.
Keynes, J. N. Studies and Exercises in Formal Logic, 3rd ed. London, Macmillan, 1894.
Russell, B. The Principles of Mathematics. Cambridge, Cambridge University Press, 1903.
Sidgwick, A. Fallacies: A View of Logic from the Practical Side. London, Kegan, Paul, Trench, 1883.
Venn, J. Symbolic Logic. London, Macmillan, 1881.
Venn, J. Symbolic Logic, 2nd revised ed. London, Macmillan, 1894.
Wakeling, E., ed. Lewis Carroll’s Diaries. v. 6. Clifford, Herefordshire, The Lewis Carroll Society, 2001.
Wakeling, E., ed. Lewis Carroll’s Diaries. v. 8. Clifford, Herefordshire, The Lewis Carroll Society, 2004.
Wakeling, E., ed. Lewis Carroll’s Diaries. v. 9. Clifford, Herefordshire, The Lewis Carroll Society, 2005.
b. Secondary
Abeles, F.F. “Lewis Carroll’s Method of Trees: Its Origins in Studies in Logic.” Modern Logic, v. 1, n. 1, 1990, pp. 25-35.
Abeles, F. F., ed. The Mathematical Pamphlets of Charles Lutwidge Dodgson and Related Pieces. New York, Lewis Carroll Society of North America, 1994.
Abeles, F. F. “Lewis Carroll’s Formal Logic.” History and Philosophy of Logic, v. 26, 2005, pp. 33-46.
Abeles, F. F. “From the Tree Method in Modern Logic to the Beginning of Automated Theorem Proving.” In: Shell-Gellash, A. and Jardine, D., eds. From Calculus to Computers. Washington DC, Mathematical Association of America, 2005, pp. 149-160.
Abeles, F. F. “Lewis Carroll’s Visual Logic.” History and Philosophy of Logic v. 28, 2007, pp. 1-17.
Abeles, F. F., ed. The Logic Pamphlets of Charles Lutwidge Dodgson and Related Pieces. New York, Lewis Carroll Society of North America, 2010.
Abeles, F. F. “Toward a Visual Proof System: Lewis Carroll’s Method of Trees.” Logica Universalis, v. 6, n. 3/4, 2012, pp. 521-534.
Abeles, F. F. “Mathematical Legacy.” In: Wilson, R. and Moktefi, A. eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll). Oxford, Oxford University Press, 2019, pp. 177-215.
Anellis, Irving. “From Semantic Tableaux to Smullyan Trees: the History of the Falsifiability Tree Method.” Modern Logic, v. 1, n. 1, 1990, pp. 36-69.
Corcoran, J. “Information-Theoretic Logic.” In: Martinez, C. et al., eds. Truth in Perspective, Aldershot, Ashgate, 1998, pp. 113-135.
Englebretsen, G. “The Tortoise, the Turtle and Deductive Logic.” Jabberwocky, v. 3, 1974, pp. 11-13.
Englebretsen, G. “The Properly Victorian Tortoise.” Jabberwocky, v. 23, 1993/1994, pp. 12-13.
Englebretsen, G., “The Dodo and the DO: Lewis Carroll and the Dictum de Omni.” Proceedings of the Canadian Society for the History and Philosophy of Mathematics, v. 20, 2008, pp. 142-148.
Macula, A. “Lewis Carroll and the Enumeration of Minimal Covers.” Mathematics Magazine, v. 69, 1995, pp. 269-274.
MacColl, H. “Review of Symbolic Logic, Part I, by Lewis Carroll.” The Athenaeum, 17 October 1896, pp. 520-521.
Marion, M. and Moktefi, A. “La Logique Symbolique en Débat à Oxford à la Fin du XIXe Siècle: Les Disputes Logiques de Lewis Carroll et John Cook Wilson.” Revue d’Histoire des Sciences, v. 67, n. 2, 2014, pp. 185-205.
Moktefi, A. “Beyond Syllogisms: Carroll’s (Marked) Quadriliteral Diagram.” In: Moktefi, A., Shin, S.-J., eds. Visual Reasoning with Diagrams, Basel, Birkhäuser, 2013, pp. 55-72.
Moktefi, A. “On the Social Utility of Symbolic Logic: Lewis Carroll against ‘The Logicians’.” Studia Metodologiczne, v. 35, 2015, pp. 133-150.
Moktefi, A. “Are Other People’s Books Difficult to Read? The Logic Books in Lewis Carroll’s Private Library.” Acta Baltica Historiae et Philosophiae Scientiarum, v. 5, n. 1, 2017, pp. 28-49.
Moktefi, A. “Logic.” In: Wilson, R. J., Moktefi, A., eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll), Oxford, Oxford University Press, 2019, pp. 87-119.
Moktefi, A. and Abeles, F. F. “The Making of ‘What the Tortoise Said to Achilles’: Lewis Carroll’s Logical Investigations toward a Workable Theory of Hypotheticals.” The Carrollian, v. 28, 2016, pp. 14-47.
Moktefi, A. “Why Make Things Simple When You Can Make Them Complicated? An Appreciation of Lewis Carroll’s Symbolic Logic.” Logica Universalis, v. 15, 2021, pp. 359-379.
More, T., Jr. “On the Construction of Venn Diagrams.” Journal of Symbolic Logic, v. 24, n. 4, 1959, pp. 303-304.
Rice, Adrian. “Algebra.” In: Wilson, R. J., Moktefi, A., eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll), Oxford, Oxford University Press, 2019, pp. 57-85.
Richards, M. Game of Logic. https://lewiscarrollresources.net/gameoflogic/.
Seneta, E. “Lewis Carroll as a Probabilist and Mathematician.” Mathematical Scientist, v. 9, 1984, pp. 79-84.
Seneta, E. “Victorian Probability and Lewis Carroll.” Journal of the Royal Statistical Society Series A-Statistics in Society, v. 175, n. 2, 2012, pp. 435-451.
Van Evra, J. “The Development of Logic as Reflected in the Fate of the Syllogism 1600-1900.” History and Philosophy of Logic, v. 21, 2000, pp. 115-134.
Wilson, R. “Geometry.” In: Wilson, R. and Moktefi, A. eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll). Oxford, Oxford University Press, 2019, pp. 31-55.
Wilson, R. and Moktefi, A. eds. The Mathematical World of Charles L. Dodgson (Lewis Carroll). Oxford, Oxford University Press, 2019.
Author Information
Francine F. Abeles
Email: fabeles@kean.edu
Kean University
U. S. A.
Existence
Since Thales fell into a well while gazing at the stars, philosophers have invested considerable effort in trying to understand what, how, and why things exist. Although much ink has been spilled on these questions, this article focuses on the following three:
(1) What is the nature of existence?
(2) Are there different ways/modes of existing?
(3) Why does something exist instead of nothing?
First, we review the main attempts to answer (1) and (2). These are questions about existence as such. Then, we show how those attempts have been used to address question (3). This is an ontological question, that is, a question not about existence as such but about what exists.
Question (1) is addressed in Sections 1 and 2. In Section 1, we discuss the orthodox view of existence: existence is not a property of individual objects (often called a first-order property); rather, it is a property of properties of individual objects (a second-order property). On the orthodox view, this leads to a tight connection between existence and quantification, which is expressed by terms like ‘something’ or ‘everything’ in natural language; the connection is illustrated by the common practice of referring to the particular quantifier (‘something’) as the existential quantifier. In Section 2, we discuss two recent views that disagree with the orthodox view: Meinongianism and universalism. Meinongianism claims that the (unrestricted) particular quantifier is separated from existence (it is existentially unloaded) and that existence is a first-order property. In other words, some objects in the domain of quantification lack the first-order property of existence. Universalism also takes existence to be a first-order property, but disagrees with Meinongianism on two points: first, it takes existence to be a universal property, namely, a property that everything has; second, it takes the (unrestricted) particular quantifier to be existentially loaded.
Question (2) is the subject matter of Section 3. To begin with, we introduce ontological pluralism, that is, the view according to which some things exist in a different way from others. After a brief historical introduction, we present a theological reason, a phenomenological reason, and a philosophical reason to endorse such a controversial view. We then focus on how Kris McDaniel develops his own account of ontological pluralism in relation to Heidegger’s philosophy. To conclude, we briefly analyze van Inwagen’s argument against ontological pluralism and some possible replies.
Section 4 gives an example of how views on the nature of existence bear on ontological questions, by wrestling with question (3). We begin with van Inwagen’s statistical argument: we present the argument and summarize some of the critiques of it. Then, we present McDaniel’s approach to question (3): relying on ontological pluralism, he argues that, instead of wondering why there is something rather than nothing, it would be more profitable to ask why there are ‘concrete material things’ rather than no ‘concrete material things.’ To conclude, we compare McDaniel’s view with the Meinongian one defended by Priest.
1. Existence as a Second-Order Property and Its Relation to Quantification
The orthodox view of existence, which derives from Frege’s and Quine’s views of existence, is summarized by the following two claims:
FQ1 Existence is not a first-order property of individual objects but rather a second-order property.
FQ2 Quantifiers are existentially loaded.
This section gives a brief explanation of these two claims.
To begin with, let us see how FQ1 is connected with the famous slogan that existence is not a predicate, which is often understood in the light of Kant’s claim that ‘being’ is not a real predicate. By a real predicate, he means a predicate that can be contained in a concept (or the definition of a concept) of an object. For example, the concept of the Empire State Building contains the following predicates: being a building, having a total height of 443.2 meters, and so on. These are real predicates. According to Kant, ‘being’ cannot be a part of any concept of any object. He says:
When I think a thing, through whichever and however many predicates I like (even in its thoroughgoing determination), not the least bit gets added to the thing when I posit in addition that this thing is. (Kant, 1781/1787, A600/B628, English translation, p. 567)
Then, what does ‘A is’ do? Kant distinguishes two different usages of ‘be’. First, being is used logically in judgements of the form ‘A is B’, and “[i]n the logical use it [that is, being] is merely the copula of a judgment” (Kant 1781/1787, A598/B626, English translation, p. 567). On the other hand, when being is used in judgements of the form ‘A is’, such judgements state that all the predicates in A are instantiated by an object. Since Kant regards being used in the latter way as existence, ‘A is’ is the same as ‘A exists’. So, according to Kant, the judgement ‘A exists’ tells us that some object instantiates all the predicates in the concept A, without adding any new predicate to A. (For more exegetical detail about Kant’s notion of real predicates, see Bennett (1974) and Wiggins (1995) as classics, and Kannisto (2018) as recent work.)
From the contemporary viewpoint, one crucial feature of Kant’s view on existence is that it takes existence not as a first-order property (a property of individual objects) but as a second-order property (a property of properties of individual objects). Frege is one of the most prominent proponents of this view. To see his point, let us first examine his view on numbers. According to Frege, a statement about how many things there are is not about individual objects, but about a property (concept, in his terminology) of individual objects. For example,
If I say “Venus has 0 moons”, there is simply no moon nor agglomeration of moons for anything to be asserted of; but what happens is that a property is assigned to the concept “moon of Venus”, namely that of including nothing under it. (Frege, 1884, p. 59, our translation)
Furthermore, Frege claims that existence is essentially a matter of number. He says “[a]ffirmation of existence is in fact nothing but denial of the number nought” (Frege, 1884, p. 65, English translation, p. 65); that is, existence is the second-order property of being instantiated by at least one individual object. Or, more properly, an existential statement does not attribute a first-order property to individual objects; rather, it attributes to a first-order property the second-order property of being instantiated by at least one individual object. In this sense, the apparent first-order property of existence is analyzed away. Thus, (4a) and (5a) are paraphrased as (or, their logical forms are) (4b) and (5b), respectively:
(4)
a. Dogs exist.
b. The property of being a dog is instantiated by at least one individual object.
(5)
a. Unicorns do not exist.
b. The property of being a unicorn is not instantiated by any individual object.
This way of understanding existence shows how existence is related to quantification. It is helpful for understanding the notion of quantification to compare it with the notion of reference. Reference is a way to talk about a particular object as having a property. For example, ‘Gottlob Frege’ is an expression to refer to a particular man, that is, Gottlob Frege, and by using a singular statement ‘Gottlob Frege is a mathematician’, we can talk about him as having the property of being a mathematician. Quantification is not a way to talk about a particular object, but a way to talk about quantities, that is, it is about how many things in a given domain have a property. Quantifiers are expressions for quantification. For example, ‘everything’ is a quantifier, and a statement ‘everything is fine’ says that all of the things in a given domain have the property of being fine. ‘Everything’ is a universal quantifier, since by using it we state that a property is universally instantiated by all things in the domain. ‘Something’ is also a quantifier, but it is a particular quantifier: By using it we only state that a property is instantiated by at least one particular thing in the domain, without specifying by which one(s) the property is instantiated.
By using the particular quantifier ∃, (4b) is restated as (6a), which is read as (6b), and (5b) as (7a), which is read as (7b).
(6)
a. ∃x dog(x)
b. Something is a dog.
(7)
a. ¬∃x unicorn(x)
b. Nothing is a unicorn.
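The domain-relative reading of (6a) and (7a) can be made vivid with a small Python sketch (purely illustrative; the domain and predicates are invented). Note that existence here is a property of the property, namely having at least one instance in the domain, not a property of the individuals:

```python
# A finite domain of individuals (illustrative only).
domain = ["Fido", "Rex", "Gottlob Frege"]

def dog(x):
    return x in {"Fido", "Rex"}

def unicorn(x):
    return False

# (6a) 'Something is a dog': the property has at least one instance.
print(any(dog(x) for x in domain))            # True
# (7a) 'Nothing is a unicorn': the property has no instances.
print(not any(unicorn(x) for x in domain))    # True
```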
Since existential statements are properly paraphrased by using particular quantifiers as illustrated above, Frege holds that the particular quantifier is an existential quantifier. This connection is also endorsed in his ‘Dialog with Puenjer on Existence’ (1979), where he claims:
Every particular judgement is an existential judgement that can be converted into the ‘there is’ [‘es gibt’] form. E. G. ‘Some bodies are light’ is the same as ‘There are light bodies’ (Frege, 1979, p. 63.)
Frege thus endorses the view that the particular quantifier is existentially-loaded. (Even though this is a standard interpretation of Frege on quantifier and existence, it is an exegetical issue how much metaphysical significance we should find in Frege’s comments on existence. Priest claims that Frege’s usage of ‘exist’ is its idiomatic use in mathematics and thus “it is wrong to read heavy-duty metaphysics into this” (Priest, 2005/2016, p. 331).)
The view that existence is properly expressed by quantification is hard-wired into Quine’s (1948) criterion of ontological commitment, one of the most influential theories of metaontology in the twentieth century (a brief explanation of the technical notions appearing in this paragraph is found in the Appendix). According to his criterion, the ontological commitment of a theory is revealed by what the theory quantifies over: more precisely, a theory is committed to the existence of objects if and only if they must be values of the variables bound by quantifiers appearing in the theory in order for the theory to be true (for Quine, a theory is a set of sentences of first-order predicate logic). For example, if a biological theory contains the sentence ‘∃x population-with-genetic-diversity(x)’ (that is, ‘there are populations with genetic diversity’), the theory is committed to the existence of such populations. Quine’s criterion is popularized as the following slogan:
(8) To be is to be the value of a bound variable.
‘To be’ here is understood as ‘to exist’, given that this is a criterion of ontological, that is, existential, commitment. In this way, Quine’s criterion of ontological commitment inseparably ties existence to quantification. To sum up, the orthodox view holds FQ1 and FQ2:
FQ1 Existence is not a first-order property of individual objects but rather a second-order property.
FQ2 Quantifiers are existentially loaded.
In other words, the apparent first-order property of existence is analyzed away in terms of the second-order property of being instantiated by at least one object, and this second-order property is expressed by the particular quantifier.
2. Existence as a First-Order Property and Its Relation to Quantification
So far, our discussion has been about the orthodox view of the nature of existence. In this section, we review two unorthodoxies. First of all, it became popular in the early twenty-first century to deny FQ1. According to such views, existence is a first-order property of individual objects. Their proponents can be divided into two main camps. The first camp is what we call universalism, which holds that the first-order property of existence is universal in the sense that every object has it. Maintaining FQ2, the advocates of this camp usually use the unrestricted existential quantifier to define the first-order property of existence, so that everything in the domain of quantification exists. The second camp is Meinongianism, which rejects not only FQ1 but also FQ2. According to Meinongianism, existence is a non-universal first-order property in the sense that some objects lack it, and the domain of quantification contains such nonexistent objects in addition to existent ones.
a. Meinongianism
The main claim of Meinongianism is that some objects exist and some do not. Contemporary Meinongians cash out this claim by detaching existence from quantifiers: the domain of (at least unrestricted) quantification contains not only existent objects but also nonexistent ones. Thus, with all due respect to Quine, to be existent is not to be the value of a variable. This claim is usually accompanied by another unorthodoxy, namely the view that existence is a first-order property of individual objects. Moreover, Meinongianism holds a specific version of this claim. Let us call a property instantiated by all objects a universal property, and one instantiated by only some objects a non-universal property. Then Meinongians hold that existence is a first-order non-universal property.
It is not easy to characterize what existence as a first-order property is (we address this question below). However, whatever it is, we have some intuitive ideas about what exists. Merely possible objects like flying pigs or talking donkeys do not exist; impossible objects like the round square or the perpetual motion machine do not exist; fictional characters like Doyle’s Sherlock Holmes or Murakami’s Sheep Man do not exist; mythical objects like Zeus or Pegasus do not exist; and so on. There can be some disagreement about exactly which objects should be counted as nonexistent, but such disagreement does not undermine the fact that we (at least some of us) have the intuition that some objects do not exist. Meinongians take this intuition at face value: these objects lack the property of existence. But, Meinongians continue, this does not prevent us from talking about or referring to them, nor from quantifying over them. We can, as the sentences in this paragraph clearly illustrate.
Meinongians take existence to be just one among the many properties of individual objects: an object may have it or lack it. The nonexistence of an object does not deprive the object of the status of being a genuine object (objecthood). As a genuine object, a nonexistent object can have various properties, such as being possible or being a flying pig. Some objects even have both the property of being round and the property of being square, and thus the property of being an impossible object.
Quine says that such an “overpopulated universe is in many ways unlovely” (Quine, 1948, p. 4), and many other theorists seem to agree with him. Putting aside such aesthetic evaluations, there are two main objections to Meinongianism that have had great influence in establishing the standard view in contemporary philosophy that Meinongianism is wrong. One is due to Russell (1905), according to which a theory that admits even inconsistent objects as genuine objects entails contradictions: the non-square square is square and not square, and this is a contradiction. The other is due to Quine (1948), who argues that there is no identity condition for nonexistent objects and that, therefore, we should not admit any of them as genuine objects. However, contemporary Meinongians have provided several different replies to these objections, and these lead to different versions of Meinongianism: nuclear Meinongianism (Parsons, 1980; Routley, 1980; Jacquette, 2015), dual-copula Meinongianism (Zalta, 1988), and modal Meinongianism (Priest, 2005/2016; Berto, 2013). Since this is not the place to survey contemporary Meinongian theories in detail, we simply point out that there are consistent Meinongian theories that provide well-defined identity conditions for existent and nonexistent objects. For a comprehensive and useful survey of this topic, see Berto 2013.
Then, what is existence for Meinongians? Some Meinongians (in particular Parsons) simply rely on our intuitive notion of existence according to which some objects do not exist (Parsons, 1980, p. 10). In so doing, they do not try to define existence (Parsons, 1980, p. 11). On the other hand, some Meinongians have proposed several different definitions of existence.
To begin with, according to Lambert, “Meinong held that existent objects are objects having location in space-time” (Lambert 1983, p. 13). Partly echoing Meinong, Zalta says “[b]y ‘exists,’ we mean ‘has a location in space’” (Zalta, 1988, p. 21). Priest offers a different definition: drawing on Plato’s Sophist, he claims that “to exist is to have the potential to interact causally” (Priest, 2005/2016, p. xxviii). Given these definitions, such theorists typically treat abstract objects like numbers or propositions as nonexistent: they have neither spatial location nor causal power. Routley (1980) proposes two alternative definitions, but their formulations depend heavily on the details of his theory of nonexistent objects, and thus we do not discuss them here (see also Rapaport, 1984; Paoletti, 2013).
At this point, one may wonder whether Meinongians equate existence with concreteness. This is not the case; at least, Meinongians need not commit themselves to the equation. First, Parsons explicitly declines to “define ‘exist’ to mean something like ‘has spatio-temporal location’” (Parsons 1980, p. 10). Moreover, he claims that his distinction between existence and nonexistence concerns concrete objects: some concrete objects exist, but some do not (cf. ibid., p. 10). Second, Priest points out that existence and concreteness behave differently in modal contexts. For example, according to Priest (2005/2016), (9a) is true from the viewpoint of Meinongianism, but (9b) is false. This is because either (i) Holmes is concrete, in which case the first conjunct of (9b) is false, or (ii) Holmes is not concrete but abstract, in which case it could never have been concrete, since being abstract is a necessary property.
(9)
a. Holmes doesn’t exist, but could have existed.
b. Holmes is not a concrete object, but could have been a concrete object.
Note that Linsky and Zalta consider a third option which could undermine this argument: an object may be neither concrete nor abstract, but could have been concrete (see Linsky and Zalta 1994; 1996).
b. Universalism
Even though some have the intuition that some objects do not exist (and consistent theories of nonexistent objects are available), many contemporary philosophers believe that everything exists. We call this view universalism. The main tenet of contemporary universalism is that existence is a first-order universal property, one that every object has. Thus it rejects FQ1. In what follows, we see how universalists define existence and confirm that they still hold FQ2.
To begin with, let us return to Frege. Answering the question of what ‘exist’ does in explicitly existential statements like ‘some men exist’, Frege claims that it does nothing, in the sense that the word ‘exist’ is a predicate that every object universally satisfies. He tries to make this point clear by comparing ‘exist’ with ‘identical with itself’. Assuming that ‘A exists’ means the same as ‘A is identical with itself’ for any A, he claims:
the judgements ‘This table exists’ and ‘This table is identical with itself’ are completely self-evident, and that consequently in these judgements no real content is being predicated of this table. (Frege, 1979, pp. 62-63)
Note that, from this, Frege concludes that the word ‘exist’ does not properly express the property of existence. Indeed, he claims that this is an example of how easily we are deceived by natural language (Frege, 1979, p. 67). The true notion of existence is not expressed by the predicate ‘exist’; rather, as we have seen, according to him it is expressed by the particular quantifier.
Some contemporary philosophers accept the first half of Frege’s claim and reject its second half (cf. Evans, 1982; Kripke, 2013; Plantinga, 1976; Salmon, 1987). For them, it is quite legitimate to use the first-order predicate ‘exist’ as a universally satisfied predicate. Moreover, some philosophers claim not only that ‘exist’ is a universally satisfied first-order predicate but also that it expresses the property of existence, a first-order universal property. Salmon says that “the [first-order] property or concept of being identical with something… is the sense or content of the predicate ‘exists’” (Salmon, 1987, p. 64). Evans is less straightforward. According to him, the reference of the predicate ‘exist’ is “a first-level concept, true of everything” (Evans, 1982, p. 345), where a first-level concept is understood as a function from individual objects to truth values, and its sense is shown by the formula ‘∀x (x satisfies ‘exist’)’. Finally, Plantinga says:
Among the properties essential to all objects is existence. Some philosophers have argued that existence is not a property; these arguments, however, even when they are coherent, seem to show at most that existence is a special kind of property. And indeed it is special; like self-identity, existence is essential to each object, and necessarily so. For clearly enough, every object has existence in each world in which it exists. (Plantinga, 1976, p. 148)
In short, he makes the following two points: (i) existence is a first-order property; (ii) it is necessarily the case that everything exists. Claim (ii) should not be confused with the claim that everything exists in every possible world. Thus, the view is compatible with the fact that whether an object exists is, in many cases, a contingent matter: the Empire State Building exists, but might not have; the 55th state of the USA does not exist, but it might have; and so on.
Kripke makes the same points with a definition of existence, while carefully distinguishing existence from self-identity (Kripke, 2013, especially pp. 36-38).
He suggests defining x’s existence as ∃y(y = x) (where x and y are individual variables), and claims that every object satisfies it. Two comments should be made. First, it is clear that this definition is based on the equation of the extension of existence with the domain of quantification, and thus on the endorsement of FQ2. Second, from this definition, it follows that “‘for every x, x exists’ will be a theorem of quantification theory” (Kripke, 2013, p. 37). Thus, it is necessarily the case that everything exists: □∀x∃y(y = x) holds. However, Kripke emphasizes that this does not entail that everything necessarily exists. Indeed, ∀x□∃y(y = x) does not hold, while everything is necessarily self-identical, that is, ∀x□(x = x) holds. Existence and self-identity should not be equated with each other.
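The scope distinction Kripke insists on can be tabulated (this merely restates the formulas above side by side):

□∀x∃y(y = x): necessarily, everything exists. This holds.
∀x□∃y(y = x): everything necessarily exists. This does not hold.
∀x□(x = x): everything is necessarily self-identical. This holds.

Only the relative order of the necessity operator and the universal quantifier separates the first two formulas.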
Finally, let us review two main arguments for universalism. The first appeals to the paradox of negative singular existentials (cf. Cartwright, 1960). Berto (2013, p. 6) summarizes this argument as follows:
(P1) To deny the existence of something, one refers to that thing;
(P2) If one refers to something, then that thing has to exist;
(C) To deny the existence of something, that thing has to exist.
Since (C) means that denying the existence of something is self-refuting, universalists claim, we cannot deny the existence of any object.
This argument has had a huge influence on debates in contemporary metaphysics and persuades many contemporary metaphysicians, though not Meinongians. Meinongians avoid the conclusion (C) by rejecting (P2). Rejecting FQ2, Meinongianism holds that the domain of quantification contains nonexistent objects, and we can refer to such nonexistent objects by using referential expressions like proper names.
Another major argument for universalism is proposed by Lewis (1990). According to Lewis, Meinongians have two different particular quantifiers: an existentially unloaded one and an existentially loaded one. The universalist, by contrast, has only one particular quantifier, which is existentially loaded. Lewis claims that, contrary to what Meinongians (in particular Routley) take for granted, the Meinongian existentially unloaded quantifier is translated as the universalist existentially loaded quantifier: objecthood for Meinongians is existence for universalists. Moreover, under this translation, Lewis suspects that the Meinongian distinction between what exists and what does not is just a distinction between what is concrete and what is not.
However, as we have seen, a Meinongian need not equate existence with concreteness. Moreover, the Meinongian can adopt a notion of objecthood different from the universalist notion of existence. As we have seen, Kripke defines existence as ∃y(y = x). On the other hand, Priest takes objecthood to be logically equivalent to self-identity: x is an object iff x = x (Priest, 2014a, p. 437). As Kripke points out, these two notions behave differently: if we define x’s existence as x’s self-identity, it follows that everything necessarily exists. Since this contradicts the fact that many objects exist only contingently, Kripke rejects this definition of existence (Kripke, 2013, p. 38). On the other hand, if we logically equate x’s objecthood with x’s self-identity, it follows only that everything is necessarily an object. This consequence is compatible with the fact that, from the Meinongian point of view, many objects exist contingently, since quantification ranges over not only existent but also nonexistent objects.

So far we have seen two contemporary alternatives to the orthodox view. Both Meinongianism and universalism hold that existence is a first-order property of individual objects. The main difference between them concerns whether existence is a universal property that every object has. According to Meinongianism, it is not: some objects lack the property of existence. Universalism, on the other hand, holds that existence is a universal property. At this point, one may wonder why we cannot synthesize these two theories by holding that there are two different kinds of existence, one universal and the other not. While we leave this as an open question, in the next section we introduce the basic ontological framework to which this line of thought straightforwardly leads: ontological pluralism.
3. How Many Ways of Being Existent?
The world is packed with entities. There are tables, chairs, monuments, and dreams. There are holes, cracks, and shadows. There is the Eiffel Tower in Paris, Leonardo’s Mona Lisa at the Louvre, and the empty set as well. Needless to say, all these entities are very different from each other. After all, we can climb the Eiffel Tower and add a mustache to the Mona Lisa, but neither activity is possible with the empty set. Facing such an abundant variety of entities, some philosophers think that, even though all these entities exist, they exist in different ways. The philosophical view according to which there are different ways of existing is known as ontological pluralism.
As Turner (2010) and McDaniel (2009; 2017) have discussed, some historical figures have been interpreted as being committed to ontological pluralism. Examples range from Aristotle (1984a; 1984b) to Saint Thomas (1993; 1961), from Meinong (1983; 1960) to Moore (1983; 1904), and from Russell (1988) to Husserl (2001) and Heidegger (1962). Having said that, ontological pluralism does not merely represent an important idea in the history of philosophy. Far from being an archaeological piece in the museum of ideas, in the early twenty-first century ontological pluralism underwent an important revival in analytic philosophy through the works of McDaniel (2009; 2010; 2017) and Turner (2010; 2012). As Spencer points out, such a revival consists in a “defence” and “explication of the [historical] views” (Spencer 2012, p. 910).
If we look back at the history of philosophy, it is possible to find at least two motivations in support of ontological pluralism. The first is theological. Famously, God has some features that no other entity seems to have: for instance, He is eternal, infinite, and omniscient. Some theologians go further and believe that God is so different in kind that it is impossible for any given feature to be truly ascribed to both God and His creatures. Unfortunately, this claim seems to be patently false: there is at least one feature that they must share, namely existence. At this point, philosophers and theologians have tried to overcome the conundrum by endorsing ontological pluralism and admitting that God’s existence is different from the existence His creatures enjoy (compare McDaniel 2010, p. 693).
The second motivation is phenomenological. Phenomenologists are famous for claiming that all sorts of entities are given to us. In our everyday life, we experience tables, chairs, people and even logical concepts such as existential quantifiers and negation. Following the interpretation favoured by McDaniel (2009; 2010), Heidegger believes that, among all these entities, different ways of existence are given to us as well. For instance, we experience a first way of existence proper to pieces of equipment (that is, readiness-to-hand), a second way of existence proper to abstract entities (that is, subsistence) and a third way of existence proper to entities that are primarily characterized by spatio-temporal features (that is, presence-at-hand). If so, ontological pluralism might have a phenomenological ground (compare McDaniel 2010, p. 694).
More recently, analytic philosophers have added a third motivation in support of ontological pluralism. Consider a material object and the space-time region in which that material object is located. There is a sense in which both of these things exist. However, a material object exists at a certain region of space-time, and therefore its existence is relative to that region of space-time. This is not the case for space-time regions: their existence is not relative to another space-time region. Their existence is relative to nothing at all.
All this is supposed to show that, as suggested by McDaniel (2010) and summarized by Spencer (2012, p. 916), existence is systematically variably polyadic. On the one hand, existence can be either relative to something (as with material objects) or relative to nothing (as with space-time regions); this is what makes existence variably polyadic. On the other hand, there are many clusters of entities which systematically share the same kind of existence: for instance, material objects always exist at space-time regions, and space-time regions simply exist; this is what makes existence systematic. Now, according to some analytic philosophers, the fact that existence is systematically variably polyadic should nudge us to believe that material objects have one mode of existence (call it existence-at) and space-time regions another (call it simple existence). Ontological pluralism is thus needed.
Until now, we have reviewed what ontological pluralism is and what its main motivations are. What about the different ways in which ontological pluralism has been articulated? Needless to say, in the history of philosophy we can find many different kinds of ontological pluralism. In this section, however, we focus on the version endorsed by McDaniel, because it represents an original way of combining traditions and ideas that are not commonly merged: on the one hand, McDaniel appeals to ideas rooted in continental philosophy; on the other, he employs some of the formal tools of contemporary logic.
McDaniel abandons the familiar landscape of analytic philosophy by arguing that, according to Heidegger, ‘existence’ is an analogical expression. (For a historical review of Being as an analogical expression that goes beyond Heidegger’s philosophy, see McDaniel, 2009, footnote 13.) In other words, “[‘existence’] has a generic sense, which, roughly, applies to objects of different sorts in virtue of these objects exemplifying very different features” (McDaniel 2009, p. 295). On McDaniel’s favoured interpretation, Heidegger would certainly agree that there is a general concept of existence: exactly in virtue of its generality, this concept covers all entities whatsoever. However, Heidegger would also argue that some of these entities, in virtue of the features they exemplify, exist in a different way than others. As such, “there is multiplicity of modes of being [that is, existence]” (McDaniel 2009, p. 296). For instance, a hammer, a stone, and a number all exist. However, a hammer is ready-to-hand, a number subsists, and a stone is present-at-hand, as explained above.
Having said that, McDaniel tries to formulate this idea as precisely as possible and, in so doing, appeals to some of the resources offered by formal logic. According to McDaniel, the general sense of existence can be spelled out through the unrestricted existential quantifier: for any entity x that exists, we can truly say that ∃y(y = x) (compare McDaniel 2009, p. 301). Furthermore, McDaniel believes that the various modes of existence can be represented by restricted quantifiers, that is, quantifiers ranging over proper subsets of the domain of the unrestricted one (McDaniel 2009, p. 302). This means that, in order properly to articulate Heidegger’s phenomenology, we should employ at least three restricted quantifiers, as sketched below: (1) a ready-to-hand quantifier (∃ready-to-hand), which ranges only over pieces of equipment; (2) a present-at-hand quantifier (∃present-at-hand), which ranges only over entities that are uniquely characterized by spatio-temporal features; and (3) a subsistential quantifier (∃subsistence), which ranges only over abstract entities.
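On the simplest reading of ‘quantifiers ranging over proper subsets of the unrestricted domain’, such a restricted quantifier can be modeled by conjoining a restricting predicate (the predicate ‘equipment’ below is an illustrative placeholder, not McDaniel’s notation):

∃ready-to-hand x φ(x) if and only if ∃x(equipment(x) ∧ φ(x))

This biconditional is offered only as a model of the ranges involved, not as McDaniel’s own analysis of what the restricted quantifiers mean.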
Before continuing, it is worth noting that McDaniel’s approach to ontological pluralism seems to be faithful to its Heideggerian roots at least in the following sense. In keeping with what Heidegger labels the ontological difference, McDaniel’s ontological pluralism does not treat existence as an entity. Existence is neither a constant symbol (compare McDaniel 2009, p. 301) nor any special kind of property (compare McDaniel 2009, p. 301); otherwise, existence would appear to be, in the eyes of a philosopher engaging with first-order logic, something over which we can quantify and, therefore, an entity as well. Existence, following McDaniel, is captured by the quantifiers themselves and, for this reason, it cannot be taken to be an entity of any sort.
Regardless of the prestigious historical heritage that grounds ontological pluralism, in the contemporary analytic debate, this theory has not always been welcome. Even though many arguments have been raised against it (compare McManus 2009; Turner 2010), the one proposed by Peter van Inwagen (1998) has resonated particularly loudly. To begin with, van Inwagen underscores the deep connection between the activity of quantifying over entities and the activity of counting entities. At the end of the day, when we say that, in the drawer of my desk, there are a pen, an eraser and a ruler, we are simply saying that there are three entities in the drawer of my desk. Now, in light of this connection, it seems fair to say that, if we believe that there are different ways of existing and that these different ways are captured by different quantifiers, there should be different ways of counting too. Moreover, if there are different ways of counting, there should be different kinds of numbers as well. However, with all due respect to the ontological pluralists, this seems evidently false. When we refer to three stones, three hammers and three geometrical figures, we do not use different numerical systems. In all these cases, the number three seems to be pretty much the same. Therefore, facing such an absurd conclusion, van Inwagen declares that there cannot be more than one way of existing.
Of course, the reply was not long in coming. First of all, while McDaniel (2009) agrees with van Inwagen that there is only one way of counting, which is represented by the unrestricted quantifier, he denies the validity of the inference from the claim that there are many ways of existing to the claim that there are many ways of counting. Secondly, Turner (2010) argues that, from the fact that there are different ways of counting, it does not necessarily follow that there are different numbers. To this end, he distinguishes between numbering relations (the relation between, for instance, two pens and the number two) and the numbers themselves. Against van Inwagen, Turner believes that there might be many different numbering relations and only one kind of number.
A final remark: Contemporary advocates of ontological pluralism presuppose the Quinean interpretation of quantification and hold that being, together with its modes, is to be equated with existence, together with its modes. Someone might wonder whether this is an essential feature of ontological pluralism. It is not. For example, Meinong counts as an ontological pluralist to the extent that he recognizes at least two different ways of being, namely existence and subsistence, yet he never equated being with existence.
4. Why Is There Something Rather than Nothing?
This article has been concerned with the notion of existence as such. It is natural to ask how different views on existence can influence ontological questions. In this section we examine how the views presented in the previous sections have been used to address a long-standing worry in the history of ontology: Why is there something rather than nothing? In particular, we discuss a Quinean strategy (that is, the strategy presented by van Inwagen), a strategy which employs ontological pluralism (that is, the strategy presented by McDaniel) and a Meinongian strategy (that is, the strategy presented by Priest).
Let’s begin with Peter van Inwagen, the champion of the so-called statistical argument (1996). Consider the actual world. This is the world we inhabit: in the actual world, the Eiffel Tower is in Paris, Leonardo painted the Mona Lisa and Duchamp added a moustache and a goatee to it. In the actual world, there is St. Andrews University, I miss the amazing lasagne of my grandmother and a terrible war was declared after the 11th of September 2001. This is the world we live in.
Of course, just by using our imagination, we can fantasize about other worlds that are not ours, even though they could have been. We can imagine a first possible world in which Duchamp painted the Mona Lisa and Leonardo added a moustache and a goatee to it. We can imagine a second possible world in which St. Andrews University does not exist, a third possible world in which I hate the horrible lasagne of my grandmother and a fourth possible world in which there was no war after the 11th of September 2001. Faced with this uncontrolled proliferation of worlds, someone might wonder how many possible worlds we can actually conceive. Given the boundless power of our imagination, van Inwagen replies that there are infinitely many possible worlds. Among them, one and only one is an empty world, that is, a world with no entities whatsoever.
At this point, it is important to recall that, since van Inwagen is a faithful Quinean, he understands existence in quantificational terms. This is the reason why, according to him, the theoretical framework introduced above can help us to understand why there is something (that is, there is at least one entity in the actual world) rather than nothing (that is, there are no entities in the actual world). Now, think about a lottery with infinitely many tickets: only one of them is the lucky winner. In this case, the chance that a given ticket is the lucky winner is 0. By analogy, think about the infinite number of possible worlds described above: only one of them is the empty one. In this case, the chance that the actual world is the empty one is 0. Thus, the reason why there is something rather than nothing is that the empty world is, if not impossible, as improbable as anything can be.
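The probabilistic step of the argument can be made explicit as follows (this is our gloss on the lottery analogy, not van Inwagen’s own formalism). Suppose that each of the infinitely many possible worlds has the same probability p of being actual. Since the probabilities of any n distinct worlds can sum to at most 1, we have np ≤ 1, and hence p ≤ 1/n, for every n; the only value of p compatible with every n is 0. The empty world, being just one world among infinitely many equally probable ones, therefore has probability 0 of being actual.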
Needless to say, many philosophers did not miss the opportunity to challenge this argument. Some of them have discussed the explanatory power of van Inwagen’s argument (see Sober 1983). Others have debated van Inwagen’s assumption about the uniqueness of the empty world (see Grant 1981; Heil 2013; Carroll 1994). However, a very different approach has been proposed by Kris McDaniel (2013). He provocatively asks: What if it is not really philosophically important to establish why there is something rather than nothing? What if this riddle is just a silly question that we might want to forget so that we can move on to more interesting topics? Perhaps, McDaniel suggests, there are better questions to be asked. Let’s see why.
McDaniel is very serious about absences. This might be taken to be unsurprising, since we all engage with them in a fairly liberal way. According to McDaniel, all of us are able to count absences, be gripped by grief because of them and ruminate on them. If we have philosophical training, we might be able to classify them into different kinds as well. For instance, a shadow is one kind of absence (that is, the absence of light) while a hole is another kind of absence (that is, the absence of matter). In light of all these observations, McDaniel concludes that (a) an absence is something and (b) this something exists. He writes: “the absence of Fs exists if and only if there are no Fs” (McDaniel, 2013, p. 277).
Now, given what we wrote above, worrying about why there is something rather than nothing turns into a trivial matter. Consistent with the intuition defended by Baldwin (1996) and Priest (2014b), McDaniel argues that, when we use ‘nothing’ as a term, it is natural to think that it refers to a global absence: the absence of everything. If so, this absence is something and, consequently, it exists. This means that, even if there were nothing, the absence of everything would exist and, therefore, there would be something.
At this point, two remarks are necessary. First of all, McDaniel does not want to be committed to the line of thought presented above. At best, he cautiously claims that “it is not clearly incorrect” (McDaniel, 2013, p. 278). However, McDaniel is convinced that such a line of thought represents a warning light: in seeking the reason why there is something rather than nothing, we might easily discover that this worry is a rather superficial one. According to McDaniel, we might avoid this danger by looking at the problem as an ontological pluralist and by moving our attention from the general meaning of existence (that is, the unrestricted quantifier) to some specific modes of existence (that is, restricted quantifiers). In other words, McDaniel suggests that it would be safer and more profitable to engage with what he labels a Narrow Question: why are there ‘concrete material things’ rather than no ‘concrete material things’?
The second remark is concerned with Meinongianism. In a passing remark (2014b, p. 56), Priest runs an argument similar to the one presented by McDaniel. Like McDaniel, Priest takes ‘nothing’ to be a term which refers to a global absence. Furthermore, like McDaniel, he argues that there is something rather than nothing because, even when there is nothing, there is something, namely the absence of everything. It is interesting to notice that the difference between the two positions is all about their understanding of existence. According to McDaniel, existence is always spelled out in quantificational terms. As such, since there is something like the absence of everything, this something exists. Now, Priest believes that there is something like the absence of everything as well; however, contrary to McDaniel and given his Meinongian credo, he can still hold the position that it does not exist. From this point of view, given Priest’s Meinongian stance, wondering why there is something rather than nothing is not necessarily equivalent to wondering why something exists rather than nothing. This is all meant to show how different accounts of existence can generate different ways of understanding and answering one of the oldest philosophical questions: why is there something rather than nothing?
5. Conclusion
Here we are at the end of our survey about existence. Even though we covered a lot of material, it is fair to say that there are even more topics that, unfortunately, we don’t have space to address. For instance, in this article, we did not review the current philosophical debate devoted to understanding what kinds of things exist. A prominent example of this philosophical enterprise is the dispute between nominalism and realism about properties. Furthermore, much more could have been said about the philosophical attempt to spell out which things, in a given kind, exist. For instance, just as we did not cover the debate on whether properties are abundant or sparse, we did not present the dispute about whether mathematical objects or fictional characters exist either. Finally, in order to suggest that there are many different ways in which ontological pluralism has been understood, we could not do more than present a long and, nonetheless, incomplete list of names.
Having said that, we hope that this article helps readers to navigate the extremely rich secondary literature about existence. We have tried to give a fairly complete overview of the most important ways of understanding such a complicated topic while, at the same time, trying to underscore the importance of the more unorthodox and under-considered philosophical accounts of existence. Given this abundance of intuitions, ideas, philosophical traditions, and ontological accounts, the hope is that the present work can represent, on the one hand, a helpful map to orient ourselves in this vast debate and, on the other hand, a nudge to explore new directions of research.
6. Appendix
In the formal language of logic, a quantifier like ∃ or ∀ is prefixed to a sentence together with a variable to form a new sentence. For example, dog(x) is a sentence, and prefixing ∃ together with a variable x to it results in a new quantificational sentence, that is, ∃xdog(x). A quantifier prefixed to a sentence together with a variable x binds every occurrence of x in the sentence in so far as it is not bound by another quantifier. So the variable x in dog(x) in ∃xdog(x) is bound by ∃ appearing at its beginning. Now, consider a formula that contains an unbound—free—variable x:
(10) dog(x) (‘x is a dog’)
In (10), x is, so to speak, a blank which we can fill with different values. Note that it is nonsense to ask whether (10) is intrinsically true or not, just as the question ‘is x plus 4 equal to 6?’ is unanswerable. Sentences like (10) are truth-evaluable only relative to a value that fills the blank—a value that is assigned to x: If we fill the blank with a particular dog, say, Fido, the result is true; if we fill it with a particular man, say, Frege, the result is false. The truth conditions of quantificational sentences like ∃xdog(x) or ∀xdog(x) are defined by using values of variables bound by quantifiers:
(11) ∃xdog(x) is true iff for some value d, dog(x) is true relative to d assigned to x
(12) ∀xdog(x) is true iff for any value d, dog(x) is true relative to d assigned to x
The domain of quantification is the set of things that can be values of variables of quantification.
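For readers who find an operational picture helpful, the following minimal Python sketch (our illustration, not part of the formal apparatus above; the domain and the interpretation of dog are arbitrary choices) implements the truth conditions (11) and (12) over a finite domain:

# Evaluate ∃xdog(x) and ∀xdog(x) over a finite domain of quantification.
domain = {"Fido", "Rex", "Frege"}

def dog(x):
    # Interpretation of the predicate 'dog': true of Fido and Rex only.
    return x in {"Fido", "Rex"}

# (11): ∃xdog(x) is true iff dog(x) is true relative to some value d assigned to x.
exists_x_dog = any(dog(d) for d in domain)   # True

# (12): ∀xdog(x) is true iff dog(x) is true relative to any value d assigned to x.
forall_x_dog = all(dog(d) for d in domain)   # False

Relative to this domain, ∃xdog(x) comes out true because Fido can be assigned to x, while ∀xdog(x) comes out false because assigning Frege to x makes dog(x) false.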
7. References and Further Reading
Aquinas, T. (1993). Selected Philosophical Writings, Oxford University Press.
Aquinas, T. (1961). Commentary on the Metaphysics of Aristotle. Volume I, Henry Regnery Company.
Aristotle. (1984a). The Complete Works of Aristotle. Volume I, Oxford University Press.
Aristotle. (1984b). The Complete Works of Aristotle. Volume II, Oxford University Press.
Baldwin, T. (1996). ‘There might be nothing’, Analysis, 56: 231-238.
Bennett, J. (1974). Kant’s Dialectic, Cambridge: Cambridge University Press.
Berto, F. (2013). Existence as a Real Property: The Ontology of Meinongianism, London: Springer.
Berto, F. and Plebani, M. (2015). Ontology and Metaontology: A Contemporary Guide, London: Bloomsbury Academic.
Carroll, J. W. (1994). Laws of Nature, Cambridge: Cambridge University Press.
Cartwright, R. (1960). ‘Negative existentials’, Journal of Philosophy, 57, 629-639.
Casati, F., and Fujikawa, N. (2019). ‘Nothingness, Meinongianism and inconsistent mereology’. Synthese, 196(9), 3739-3772.
Chisholm, R. (1960). Realism and the Background of Phenomenology, The Free Press.
Evans, G. (1982). The Varieties of Reference, Oxford: Clarendon Press.
Findlay, J. N. (1963). Meinong’s Theory of Objects and Values, Second Edition, Oxford: Clarendon Press.
Fine, K. (1982). ‘The problem of non-existents’, Topoi, 1: 97-140.
Frege, G. (1884). Die Grundlagen der Arithmetik: Eine logisch-mathematische Untersuchung über den Begriff der Zahl, Breslau: Verlag von Wilhelm Koebner. (English translation, The Foundations of Arithmetic: A Logico-Mathematical Enquiry into the Concept of Number, Second Revised Edition, translated by J. L. Austin, Oxford: Basil Blackwell, 1968).
Frege, G. (1979). Posthumous Writings, Oxford: Basil Blackwell.
Grant, E. (1981). Much Ado About Nothing: Theories of Space and Vacuum from the Middle Ages to the Scientific Revolution, Cambridge: Cambridge University Press.
Heidegger, M. (1962). Being and Time, Harper & Row.
Heil, J. (2013). ‘Contingency’, in The Puzzle of Existence, Tyron Goldschmidt (ed.), New York: Routledge, 167-181.
Husserl, E. (2001). Logical Investigations. Volume II, Routledge Press.
Jacquette, D. (2015). Alexius Meinong: The Shepherd of Non-Being (Synthese Library, vol. 360), Springer.
Kannisto, T. (2018). ‘Kant and Frege on existence’, Synthese, 195: 3407-3432.
Kant, I. (1781/1787). Kritik der reinen Vernunft. English translation: Guyer, P. and A. W. Wood (trans. and eds.). (1998). Critique of Pure Reason, The Cambridge Edition of the Works of Immanuel Kant, Cambridge: Cambridge University Press.
Kripke, S. (2013). Reference and Existence: The John Locke Lectures, Oxford: Oxford University Press.
Lambert, K. (1983). Meinong and the Principle of Independence: its Place in Meinong’s Theory of Objects and its Significance in Contemporary Philosophical Logic, Cambridge: Cambridge University Press.
Lewis, D. (1990). ‘Noneism or allism?’, Mind, 99, 23-31.
Linsky, B., and Zalta, E. (1994). ‘In defense of the simplest quantified modal logic’, Philosophical Perspectives 8: Logic and Language, J. Tomberlin (ed.), Atascadero: Ridgeview, 431-458.
Linsky, B., and Zalta, E. (1996). ‘In defense of the contingently concrete’, Philosophical Studies, 84: 283-294.
McDaniel, K. (2009). ‘Ways of being’ in Chalmers, Manley and Wasserman (eds.) Metametaphysics, Oxford University Press.
McDaniel, K. (2010). ‘A return to the analogy of being’, Philosophy and Phenomenological Research, 81: 688-717.
McDaniel, K. (2013). ‘Ontological pluralism, the gradation of being, and the question of why there is something rather than nothing’, in The Puzzle of Existence, Tyron Goldschmidt (ed.), New York: Routledge, 290-320.
McDaniel, K. (2017). The Fragmentation of Being, Oxford University Press.
Meinong, A. (1960). ‘On the Theory of Objects’, in Chisholm 1960.
Meinong, A. (1983). On Assumptions, University of California Press.
Paoletti, M. (2013). ‘Commentary on Exploring Meinong’s Jungle and Beyond: An Investigation of Noneism and the Theory of Items’, Humana.Mente: Journal of Philosophical Studies, 25, 275-292.
Parsons, T. (1980). Nonexistent Objects, New Haven: Yale University Press.
Plantinga, A. (1976). ‘Actualism and possible worlds’, Theoria, 42, 139-160.
Priest, G. (2014a). ‘Sein language’, The Monist, 97(4), 430-442.
Priest, G. (2014b). One: Being an Investigation into the Unity of Reality and of its Parts, including the Singular Object which is Nothingness, Oxford: Oxford University Press.
Priest, G. (2005/2016). Towards Non-Being: The Logic and Metaphysics of Intentionality, 2nd Edition, Oxford: Oxford University Press.
Quine, W. V. O. (1948). ‘On what there is’, Review of Metaphysics, 2, 21-38; reprinted in Quine, W. V. O. (1953). From a Logical Point of View, Cambridge: Harvard University Press, pp. 1-19.
Rapaport, W. (1984). ‘Review of Exploring Meinong’s Jungle and Beyond’, Philosophy and Phenomenological Research, 44(4), 539-552.
Routley, R. (1980). Exploring Meinong’s Jungle and Beyond, Canberra: RSSS, Australian National University.
Russell, B. (1905). ‘On denoting’, Mind, 14, No. 56, pp. 479-493.
Russell, B. (1988). The Problems of Philosophy, Prometheus Books.
Salmon, N. (1987). ‘Existence’, Philosophical Perspectives, 1: 49-108.
Sober, E. (1983). ‘Equilibrium explanation’, Philosophical Studies, 43: 201-210.
Spencer, J. (2012). ‘Ways of being’, Philosophy Compass, 7: 910-918.
Sylvan, R. (1995). ‘Re-exploring item-theory’, Grazer Philosophische Studien, 50, 47-85.
Sylvan, R. (1997). Transcendental Metaphysics: From Radical to Deep Pluralism, Isle of Harris: White Horse Press.
Turner, J. (2010). ‘Ontological pluralism’, The Journal of Philosophy, 107: 5-34.
Turner, J. (2012). ‘Logic and ontological pluralism’, Journal of Philosophical Logic, 41: 419-448.
Van Inwagen, P. (1996). ‘Why is there anything at all?’, Proceedings of the Aristotelian Society, Supplementary Volume 70: 95-110.
Van Inwagen, P. (1998). ‘Meta-ontology’, Erkenntnis, 48: 233-250.
Wiggins, D. (1995). ‘The Kant-Frege-Russell view of existence: toward the rehabilitation of the second-level view’, in Modality, Morality and Belief: Essays in Honor of Ruth Barcan Marcus, Cambridge: Cambridge University Press.
Naoya Fujikawa
Email: fjnaoya@gmail.com
The University of Tokyo
Japan
Roderick M. Chisholm: Epistemology
Roderick M. Chisholm, a luminary of 20th century philosophy, is best known for his contributions in epistemology and metaphysics. His groundbreaking theory of knowledge opened the door to the late 20th and early 21st century work on the analysis of knowledge, skepticism, foundationalism, internalism, the ethics of beliefs, and evidentialism, to name just a few topics. Chisholm’s analysis of knowledge was the basis of the Gettier problem.
Chisholm treats skepticism as one of three alternative responses to the ancient, insoluble problem of the wheel, which he termed the problem of the criterion—the vicious circle encountered in answering the two fundamental questions of epistemology: ‘What kinds of things can we know?’ and ‘What are the sources of knowledge?’. Answering either requires first answering the other. Chisholm adopts particularism, Thomas Reid’s and G. E. Moore’s ‘common sense’ approach, which proceeds by proposing a tentative answer to the first question in order to answer the second question.
Chisholm provides an analysis of epistemic justification as a response to the Socratic question “What is the difference between knowledge and true opinion?” He explains justification as epistemic preferability, a primitive relationship based on the epistemic goals and ethics of belief. Chisholm defines terms of epistemic appraisal associated with various levels of justified belief to elucidate the level required for knowledge. The sufficiency of Chisholm’s analysis is examined in light of the Gettier problem.
Chisholm’s epistemology is the standard bearer of foundationalism, first proposed by René Descartes. In its defense, Chisholm proposes a unique answer to explain why empirical knowledge rests on foundational certainties about one’s mental/phenomenal experiences, that is, sense-data propositions.
Chisholm resolves the metaphysical objections to sense-data raised by philosophers such as Gilbert Ryle. Chisholm argues that under certain conditions, sense-data propositions about how things appear are self-presenting, certain, and directly evident—the foundation of empirical knowledge.
Chisholm defines a priori knowledge to explain how necessary truths are also foundational. This definition explains Kant’s claims about synthetic a priori propositions and provides insight into the status of Chisholm’s epistemic principles.
Finally, Chisholm answers the problem of empiricism that has plagued philosophers since John Locke: the problem of accounting for the justification of beliefs about the external world (non-foundational propositions) on the basis of propositions about the contents of one’s mind (foundational propositions). Chisholm proposes epistemic principles explaining the roles of perception, memory, and coherence (confirmation and concurrence) to complete his account of justification.
Roderick M. Chisholm, one of the greatest philosophers of the 20th century (Hahn 1997), was a prolific writer best known not only for his works on epistemology (theory of knowledge) and metaphysics, but also for his many students who became prominent philosophers. In epistemology, Chisholm is best known as the leading proponent of foundationalism, claiming that:
empirical knowledge is built on a foundation of the evidence of the senses; and
we have privileged epistemic access to the evidence of our senses.
Foundationalism has its roots in René Descartes’ classic work of early modern philosophy, Meditations on First Philosophy. Foundationalism was central to the debate concerning the nature of human knowledge between the Continental Rationalists (Descartes, Spinoza, and Leibniz), the British Empiricists (Locke, Berkeley, and Hume), and Kant. In the 20th century, Bertrand Russell and A. J. Ayer, luminaries of British Analytic Philosophy, and C. I. Lewis, the American Pragmatist, defended foundationalism, while Logical Positivists, including Hans Reichenbach, a member of the Vienna Circle, argued that foundationalism was untenable. After World War II, Chisholm entered this debate defending foundationalism from attacks by W. V. O. Quine and Wilfrid Sellars, all three having been students of C. I. Lewis at Harvard University.
Chisholm’s writings on epistemology first appeared in a 1941 article and his comprehensive and detailed account of perceptual knowledge was first presented in his 1957 book Perceiving (Chisholm 1957). He refined his epistemology over the next forty years in response to counterexamples, objections, and questions raised by his colleagues, students, and critics. These refinements first appeared in Chisholm’s numerous published articles, and were incorporated into the three editions of Theory of Knowledge published in 1966, 1977 and 1989.
Chisholm’s epistemology was unique not only in addressing the “big questions”, but in presenting a detailed theory accounting for the structure of knowledge and epistemic justification.
a. The Fundamental Questions of Epistemology
Chisholm opens his final edition of Theory of Knowledge by addressing the most basic problem of epistemology, the challenge of skepticism—the view that we do not know anything. (For an explanation of skepticism, see: https://iep.utm.edu/epistemo/#SH4b.) Chisholm explains that answering this challenge requires answering the question:
What do we know or what is the extent of our knowledge?
This, in turn, requires an answer to the question:
How are we to decide, in a particular case, whether we know or what are the criteria of knowing?
But, to answer this second question, about the criteria of knowing, we must answer the first question, that is, the question of what we know. The challenge of skepticism thus ensnares us in the ancient problem of “the diallelus”—the problem of “the wheel” or, as Chisholm calls it, “the problem of the criterion”. (For a detailed explanation of this problem, see https://iep.utm.edu/criterio/.)
The problem of the criterion can only be resolved by adopting one of three views: skepticism, particularism, or methodism. Skepticism claims that we do not know the extent of our knowledge and we do not know the criteria of knowledge, hence, we do not or cannot know anything. Particularism claims that there are particular propositions or types of propositions that we know, so we have at least a partial answer to the first question, and we can use these paradigm cases of knowledge to develop criteria of knowing, answering the second question. Methodism claims that we can identify certain criteria of knowing, answering the second question, which in turn provides criteria which can be employed to determine the extent of our knowledge, answering the first question.
Chisholm asserts that deciding between these three views cannot be done without involving ourselves in a vicious circle or begging questions: assuming an answer to one or both of the questions posed. That is, Chisholm maintains that the problem of the criterion cannot be solved. Chisholm adopts a “common sense” brand of particularism following in the footsteps of Thomas Reid (the 18th century Scottish philosopher) and G. E. Moore (the 20th century English philosopher). The “common sense” assumption is that we know more or less what, upon careful reflection, we think that we know. To justify this working assumption of commonsense particularism, Chisholm sets out as a goal of epistemology to improve our beliefs by ranking them with respect to their relative reasonability. Doing this leads him to adopt an internalist conception of justified belief, presupposing “that one can know certain things about oneself without the need of outside assistance” (Chisholm 1989, pg. 5).
The breadth and depth of Chisholm’s epistemology require focusing here on his solution to four fundamental questions and problems in the theory of knowledge:
The analysis of knowledge or, in his terms, the problem of the Theaetetus;
Why knowledge must rest on a foundation of sense-data (or why foundationalism)?
What is the nature of the data of the senses (the Directly Evident) and the truths of reason (the a priori), conferring on them privileged epistemic status to serve as a foundation of knowledge?
How is epistemic justification transmitted from sense-data to empirical propositions about the external world (the Indirectly Evident)?
The primary focus of this discussion will be Chisholm’s account of empirical knowledge (or a posteriori knowledge). In the process Chisholm’s account of the knowledge of necessary truths or a priori knowledge is examined.
b. Chisholm’s Philosophical Method
Chisholm introduces his epistemology by clearly articulating the specific philosophical puzzles or problems he proposes to solve: stating each problem unambiguously, presenting his solution, clearly defining the terms in which his proposal is cast, considering a series of counterexamples or conceptual tests, and responding to each in detail. This approach characterized not only Chisholm’s philosophical writings, but also his pedagogical methodology. He conducted seminars attended by graduate students and faculty members at Brown University (where he was on the faculty for virtually his entire academic career) and at the University of Massachusetts at Amherst (to which, for many years, he traveled 100 miles to conduct a weekly seminar). In addition to his colleagues at Brown, notable attendees at these seminars included Edmund Gettier, Herbert Heidelberger, Gareth Matthews, Bruce Aune, Vere Chappell, and his former graduate students Robert Sleigh and Fred Feldman.
Chisholm would present philosophical puzzles and his solutions, and the attendees would challenge his solutions by raising counterexamples, objections, and problems. Chisholm would respond to them, and then return the next week with a revised set of definitions and principles to be defended from the welcomed onslaught of a new set of counterexamples. Honoring this methodology, the Philosophical Lexicon defined a term:
chisholm, v. To make repeated small alterations in a definition or example. “He started with definition (d.8) and kept chisholming away at it until he ended up with (d.8””””).”
2. The Traditional Analysis of Knowledge
Chisholm opens the first edition of Theory of Knowledge (Chisholm 1966) by considering Socrates’ claim in Plato’s Meno that, even though he does not know much, he knows that there is a difference between true opinion and knowledge. The Socratic challenge is to explain the difference between knowing a proposition and making a lucky guess. Plato’s answer, known as the Traditional Analysis of Knowledge (TAK), can be expressed as:
(TAK) S knows p =Df
1. S believes (or accepts) p;
2. p is true; and
3. S is justified in believing p.
(where S is the knower and p is a proposition or statement believed).
Thus, according to TAK, the difference between knowing that the Mets won last year’s World Series and making a lucky guess is having an account for, having a good reason for, or being justified in believing that they won the World Series.
Chisholm raises Socrates’ criticism of the traditional analysis of knowledge in Plato’s Theaetetus:
We may say of this type of definition, then, what Socrates said of the attempt to define knowledge in terms of reason or explanation: “If, my boy, the command to add reason or explanation means learning to know and not merely getting an opinion…, our splendid definition of knowledge, would be a fine affair! For learning to know is acquiring knowledge, is it not?” (Theaetetus 209E; cited in Chisholm 1966, pg. 7).
Chisholm explains that justified belief is ordinarily understood to presuppose the concept of knowledge. Therefore, the traditional analysis of knowledge appears to be circular, that is, defining knowledge in terms that are themselves defined in terms of knowledge. Chisholm sets out to explain a notion of epistemic justification which inoculates TAK from the charge of pernicious circularity.
a. Chisholm’s Analysis
Chisholm proposes to define knowledge as follows:
(TAK-C) S knows p =Df p is true, S accepts p, and p is evident for S. (Chisholm 1977 pg. 102)
He then undertakes to explain how his version of the traditional analysis avoids the circularity problem of the Theaetetus.
In this analysis the requirement of justified belief, the justification condition of knowledge, is replaced by a condition that the proposition is evident, where evident is a technical term of epistemic appraisal for which Chisholm provides a definition. Roughly speaking, a proposition is evident for a person on the condition that the evidence available to that person is sufficiently strong to constitute a good reason for believing the proposition.
Chisholm does not think that replacing the term justified belief with the term evident magically solves the circularity problem of the Theaetetus. In fact, Chisholm concedes that his terms of epistemic appraisal, for example, evident, justified belief, know, more reasonable, certain, beyond reasonable doubt, acceptable, having some presumption in its favor, gratuitous, and unacceptable, possibly form a closed circle of concepts that cannot be defined in non-epistemic terms. To avoid this seeming circularity, Chisholm first specifies a primitive term which expresses a relationship of epistemic preferability, more reasonable than, explaining this in terms of a specified set of epistemic goals or intellectual requirements of rationality. Next, he defines all of the terms of epistemic appraisal in terms of this primitive relationship. These technical terms, in turn, define various levels of epistemic justification. His final step is to identify the level of epistemic justification, that is, evident, as being the level of justification required for knowledge.
Chisholm fills in the details by providing an ethics of belief and a logic of epistemic terms. The full account of empirical knowledge is completed with a set of epistemic rules or principles, analogous to the rules of morality or logic, explaining the structure of the justification of empirical beliefs. Chisholm believed that the adequacy of a theory of knowledge is dependent on these principles.
The first condition of Chisholm’s analysis of knowledge requires the knower to accept (or, as more commonly expressed, believe) the proposition. Acceptance is one of three mutually exclusive propositional attitudes a person may take with respect to any proposition that he/she considers, the other two being: (1) denying the proposition, that is, accepting the denial (negation) of the proposition, and (2) withholding or suspending judgment about the proposition, that is, neither accepting nor denying the proposition. For example, a person who has considered the proposition that God exists is either (i) a theist who accepts the proposition that God exists, (ii) an atheist who denies that God exists (accepts that ‘God exists’ is false), or (iii) an agnostic who withholds or suspends judgment with respect to the proposition that God exists.
Chisholm draws a parallel between epistemology and ethics to explain epistemic appraisal. Ethics and epistemology are essentially normative or evaluative disciplines. Ethics is concerned with the justification of actions, and analogously, epistemology with the justification of belief (Chisholm 1977 pp. 1-2). A goal of ethics is to provide an account or explanation of moral appraisal, for example, good, right, and so forth. Similarly, epistemology seeks to provide an account or explanation of epistemic appraisal or evaluation. Chisholm’s account of knowledge proceeds by defining knowing a proposition in terms of the proposition’s being evident; and a proposition’s being evident in terms of the primitive relationship of epistemic evaluation, that is, more reasonable than.
Chisholm distinguishes two types of evaluation, absolute and practical, illustrating the distinction as follows. It may have been absolutely morally right to have killed Hitler or Stalin when they were infants; however, it would not have been practically right from the moral standpoint because no one could have foreseen the harm they would cause. Absolute rightness depends on an objective view of reality, that is, taking into consideration all of the truths related to an action. By contrast, practical rightness only depends on what a person could have foreseen.
Chisholm’s view is that justified belief depends on what is practically right for the person to believe. He is committed to the view that epistemic justification and, hence, knowledge, depends on evidence ‘internally’ available to the knower, a view known as Internalism. In support of this view, he points out that we rarely have direct access to the truth of propositions, that is, to reality. Being justified in believing a proposition is dependent on how well believing a given proposition meets the goals or requirements of rationality. The degree to which a proposition meets these goals is relative to the evidence available to a person; not relative to the absolute or ‘God’s eye’ view. Epistemic appraisal or evaluation is a function of the relative rationality of a person’s adopting a propositional attitude (acceptance, denial, withholding) given the evidence available to that person.
In his classic essay “The Ethics of Belief”, W. K. Clifford suggested that “it is always wrong … to believe on the basis of insufficient evidence” (Clifford 1877 pg. 183). In explicating one’s epistemic duties, Chisholm adopts the somewhat lower standard that one’s beliefs are innocent until proven guilty, that is, it is permissible to believe a proposition unless one has evidence to the contrary. At the foundation of Chisholm’s account of justified belief is the primitive concept of epistemic preferability, more reasonable than, which he explains by appealing to “the concept of what might be called an ‘intellectual requirement’.” (Chisholm 1977 pg. 14). He elaborates:
We may assume that every person is subject to a purely intellectual requirement–that of trying his best to bring it about that, for every proposition h that he considers, he accepts h if and only if h is true (Chisholm 1977 pg. 14).
Epistemic preferability, captured by Chisholm’s term more reasonable than, expresses a relationship between two propositional attitudes that a person may adopt. The more reasonable propositional attitude is the one that fulfills one’s epistemic goals/intellectual requirements better than the other attitude. Chisholm explains:
What is suggested when we say that one attitude is more reasonable than another is this: If the person in question were a rational being, if his concerns were purely intellectual, and if he were to choose between the two attitudes, then he would choose the more reasonable in preference to the less reasonable. (Chisholm 1966, pp. 21-22).
It may occur to some that this requirement may be satisfied by suspending judgment concerning every proposition, thereby believing only what is true. However, Chisholm appeals to William James’s criteria, explaining that:
“There are two ways of looking at our duty in the matter of opinion–ways entirely different… We must know the truth: and we must avoid error…”
Each person, then, is subject to two quite different requirements in connection with any proposition he considers: (1) he should try his best to bring it about that if a proposition is true then he believe it; and (2) he should try his best to bring it about that if a proposition is false then he not believe it. (Chisholm 1977 pp. 14-15).
Analogizing believing a proposition to betting on the truth of a proposition is a useful way of thinking about this requirement (Mavrodes 1973). Our epistemic goals can be thought of as providing the bettor with two pieces of advice: (1) win many bets; and (2) lose few bets. If we refrain from betting, we will have followed the second piece of advice and disregarded the first, and, if we bet on everything, we will have followed the first piece of advice to the exclusion of the second.
On Chisholm’s view, the epistemic goals or duties require not merely that we avoid believing false propositions, but that we also strive to find the truth. Our “intellectual requirement” is to strike a happy medium between believing everything and believing nothing. While these are two distinct and independent intellectual requirements, one’s epistemic duty is to do one’s best to fulfill both requirements at the same time. If, for example, you saw a small furry animal with a tail in the distance but could not discern what kind of animal it was, you could better fulfill your epistemic duties by believing that it is a dog than by denying that it is a dog (believing that it is not a dog); but you would best meet both epistemic goals by withholding or suspending belief rather than by either believing or denying it.
Chisholm elaborates that more reasonable than is a relationship between propositional attitudes p and q which person S may adopt with regard to propositions at time t, which means that “S is so situated at t that his intellectual requirement, his responsibility as an intellectual being, is better fulfilled by p than by q” (Chisholm 1977 pp. 14-15). More reasonable than is an intentional concept, that is, “if one proposition is more reasonable than another for any given subject S, then S is able to understand or grasp the first proposition.” It expresses a transitive and asymmetric relationship. “And finally, if withholding is not more reasonable than believing, then believing is more reasonable than disbelieving” (Chisholm 1977 pg. 13), for example, if agnosticism is not more reasonable than theism, then theism is more reasonable than atheism. Thus, for example, to say that accepting a proposition, p, is more reasonable than denying p for person, S, at time, t, is to say that believing p better fulfills S’s epistemic duties than does S’s denying p. In other words, S ought (epistemically) to accept p rather than deny p, given the evidence available to S.
Chisholm distinguishes levels of justified belief in terms of one propositional attitude being more reasonable than another, given a person’s evidence at a specific time. He defines terms of epistemic appraisal or evaluation that correspond to the level of justification a person has for a given proposition. Chisholm defines these terms to specify a hierarchy of levels of justified belief and specifies the minimum level of justified belief required for knowing. In Chisholm’s hierarchy of justified belief any proposition justified to a specific level is also justified to every lower level.
There is a concept of beyond reasonable doubt that is the standard in English Common Law. This is the level to which the members of the jury must be justified in believing that the accused is guilty in order to render a guilty verdict in a criminal trial. In this context it means that given the available evidence there is no reasonable explanation of the facts other than that the accused person has committed the crime. The underlying idea is an epistemic one, that is, the jurors must have good reason for believing that the accused committed the crime in order to convict the accused.
Chisholm adopts the term beyond reasonable doubt to identify a key level of epistemic justification, that is, the level at which a person has an epistemic duty to believe a proposition. He defines it differently than the Common Law, as follows:
D1.1 h is beyond reasonable doubt for S =Df accepting h is more reasonable for S than withholding h. (Chisholm 1977 pg. 7).
In this sense, a proposition is beyond reasonable doubt for a person if and only if accepting it better fulfills the person’s intellectual requirements of believing all and only true propositions than withholding or suspending belief. More simply put, propositions which are beyond reasonable doubt are ones that a person epistemically ought to believe given the evidence he or she has.
Chisholm considers the proposition that the Pope will be in Rome on October 5th five years from now as an example of a proposition that has a positive level of justification for most of us but does not quite meet this standard of beyond reasonable doubt. He points out that although it is more reasonable to believe it than to deny it (given that the Pope is in Rome on most days), it is even more reasonable to withhold judgment about the Pope’s location five years from now. While the Pope spends most of his time in Rome, circumstances five years from October 5th may require that he be somewhere else. Chisholm defines this slightly lower level of justified belief, having some presumption in its favor, as follows:
D1.2. h has some presumption in its favor for S =Df Accepting h is more reasonable for S than accepting non-h. (Chisholm 1977 pg. 8).
Given the limited evidence that we have about the Pope’s whereabouts five years from now, it is more reasonable to believe that the Pope will be in Rome than to believe that he will not be in Rome five years from October 5th. However, it is even more reasonable in these circumstances to withhold judgment about the Pope’s whereabouts. According to Chisholm, the proposition in question about the Pope’s future whereabouts has some presumption in its favor, that is, it is more reasonable to believe that it is true than that it is false.
Chisholm also defines a level of epistemic justification for propositions that are not beyond reasonable doubt and yet have a higher positive epistemic status than having some presumption in their favor. This level of justified belief is that of the proposition’s being acceptable, which is defined as follows:
D1.3. h is acceptable for S =Df Withholding h is not more reasonable for S than accepting h. (Chisholm 1977 pg. 9).
An example of a proposition that has this level of justified belief is the proposition that I actually see something red when I seem to see something red under certain questionable lighting conditions. Withholding belief that I actually see something red is not more reasonable than believing it, and yet believing that I actually see something red may not be more reasonable than withholding it, that is, they may be equally reasonable. Anything that is beyond reasonable doubt is acceptable, and anything that is acceptable also has some presumption in its favor. As noted at the outset of this discussion, every higher level of justified belief is also justified to every lower level.
Chisholm thinks that propositions that are beyond reasonable doubt have a high level of justification, that is, they ought to be believed, even though they do not have a sufficiently high level of justification for knowledge. The lower levels of justified belief, that is, acceptable or having some presumption in their favor, play an important role in Chisholm’s account, as their level of justification may be raised to the level of evident when they support and are supported by other propositions at the lower levels.
Occupying the high end of Chisholm’s justification hierarchy are propositions which are certain. Chisholm distinguishes this epistemological sense of certain from the psychological sense. We may feel psychologically certain of the truth of a proposition even though we are not justified in believing it. The epistemological sense of certainty represents the highest level of justified belief and is not merely a feeling of confidence in believing. Chisholm defines certainty as:
D1.4. h is certain for S =Df (i) h is beyond reasonable doubt for S, and (ii) there is no i, such that accepting i is more reasonable for S than accepting h. (Chisholm 1977, pg. 10)
As with all propositions justified to a higher level of epistemic justification, any proposition that is certain for a person also meets the criteria of every lower level of positive epistemic justification. Chisholm claims that propositions that describe the way things appear to a person and some truths of logic and mathematics are, under the right conditions, certain for us. Thus, no proposition that is certain is, according to Chisholm, more reasonable than (or, for that matter, less reasonable than) any other proposition having this epistemic status. That is, certainty in Chisholm’s technical sense (like every level of justified belief) does not come in degrees.
Some philosophers have thought that the level of epistemic justification required for knowledge was certainty. Descartes equated certainty with not being subject to any possible doubt. Chisholm argues that this is too high a standard for knowledge because it would rule out, by definition, the possibility of our knowing many contingent truths which we think we can know. This would make skepticism about empirical knowledge true by definition.
Believing that the President is in Washington today because we saw him there yesterday and he spends most of his time there is beyond reasonable doubt. However, we need stronger justification for knowing that he is there today. Knowledge requires justification to a higher level than the proposition’s merely being beyond reasonable doubt. These considerations indicate to Chisholm that the minimum level of justification required for knowledge is higher than being beyond reasonable doubt and lower than certainty. Capturing this intuition, Chisholm defines evident to single out the level of justification required for knowledge as:
D1.5 h is evident for S =Df (i) h is beyond reasonable doubt for S, and (ii) for any i, if accepting i is more reasonable for S than accepting h, then i is certain. (Chisholm 1977 pg. 10).
According to Chisholm’s version of the Traditional Analysis of Knowledge, a necessary condition of knowledge is that the proposition is evident.
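Putting definitions D1.1 through D1.5 together, the hierarchy can be condensed into a single chain of entailments (a schematic summary of ours, not Chisholm’s notation): for any proposition h and person S,

h is certain ⇒ h is evident ⇒ h is beyond reasonable doubt ⇒ h is acceptable ⇒ h has some presumption in its favor.

Each arrow records the point, noted above, that a proposition justified to a given level is also justified to every lower level; knowledge requires reaching at least the second link in this chain.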
Chisholm’s ethics of belief solves the problem of the Theaetetus as follows: He identified the epistemic goals with respect to any proposition under consideration as: (1) believe it if true and (2) do not believe it if false. These goals determine the relative merit of adopting the attitudes of accepting, denying, and withholding judgment with respect to any given proposition and the person’s evidence, that is, in determining which attitude is more reasonable than another. Chisholm’s definition of knowledge identifies the level of justification required for knowing a proposition as being evident, meaning that (i) it is more reasonable for the person to believe the proposition than to withhold belief (beyond reasonable doubt) and (ii) any proposition that is more reasonable is maximally reasonable (certain). This analysis sheds light on the level of justification required for knowing a proposition. Moreover, it is not defined in terms of something which, in turn, is defined in terms of knowledge. Chisholm’s analysis of knowledge thus avoids the circularity problem of the Theaetetus.
b. The Gettier Problem
Before leaving the analysis of knowledge to explore the other important parts of Chisholm’s theory of knowledge, a critical objection to Chisholm’s analysis of knowledge, the Gettier Problem, needs to be outlined. Edmund Gettier, in a monumental paper “Is Justified True Belief Knowledge?” (Gettier 1963), proposes a set of counterexamples to Chisholm’s definition of knowledge which identify a genetic defect in the Traditional Analysis of Knowledge. Gettier argues that these counterexamples demonstrate a defect in Chisholm’s version of the Traditional Analysis of Knowledge, as well as in Plato’s and Ayer’s versions.
To illustrate the problem, let us consider one of Gettier’s examples. Suppose that Smith and Jones are the only two applicants for a job. Smith believes that the person who will get the job has ten coins in his pocket because he knows that he has ten coins in his pocket and that the boss has told him that he will be offered the job. Unbeknownst to Smith, the other applicant, Jones, also has ten coins in his pocket and ultimately gets the job. Thus, Smith believes that the person who will get the job has ten coins in his pocket, it is true, and Smith is justified in believing (it is evident to him) that the person who will get the job has ten coins in his pocket. However, Smith’s evidence is defective and, thus, Smith does not know that the man who will get the job has ten coins in his pocket.
Gettier examples, as they have become known, point to a genetic defect in the Traditional Analysis of Knowledge, that is, a person can have a justified (evident) true belief which is not knowledge. The Gettier problem became a major focus of epistemology in the 1960s and continues to occupy epistemologists today, more than a half century later. Solutions were proposed that add a fourth condition of knowledge or explain why the Gettier examples are not really problematic. Chisholm notes the Gettier Problem in the first edition of his Theory of Knowledge (Chisholm 1966), suggesting that the solution to the problem lies in adding a fourth condition of knowledge. In his characteristic style, Chisholm presented major revisions of his definitions intended, among other things, to address the Gettier Problem, in Theory of Knowledge (Second Edition) in 1977 and in the third edition in 1989.
3. Why Foundationalism?
Chisholm’s epistemology does not begin and end with the analysis of knowledge. His work on the analysis of knowledge clears the conceptual landscape for answering fundamental questions about the structure of empirical knowledge, and for providing an account of the justification of empirical beliefs. In the process, he provides an answer to the much-debated question of whether or not empirical knowledge rests on a foundation of epistemically privileged beliefs. Philosophers have tended to think the answer obvious; the problem is that some maintain it is obviously yes, while others maintain it is obviously no. Those answering the question in the affirmative defend foundationalism, and those answering in the negative defend the coherence theory of justification, or simply coherentism.
Chisholm characterizes foundationalism (or the myth of the given as its detractors refer to it) as supporting two claims:
The knowledge which a person has at any time is a structure or edifice, many parts and stages of which help support each other, but which as a whole is supported by its foundation.
The foundation of one’s knowledge consists (at least in part) of the apprehension of what has been called, variously, “sensations”, “sense impressions”, “appearances”, “sensa”, “sense-qualia”, and “phenomena”. (Chisholm 1946, pp. 262-3).
Chisholm joins philosophy luminaries including René Descartes, Bertrand Russell, and C. I. Lewis as a leading defender of foundationalism. However, his unique take on why empirical knowledge rests on a foundation of self-justified beliefs reveals much about his approach to epistemology.
Foundationalism’s historical roots are found in the work of René Descartes, the father of Modern Philosophy. In his Meditations on First Philosophy, Descartes embarks on securing a firm basis for empirical knowledge, having discovered that many of the beliefs on which he based this knowledge were false. He proposes to rectify this by purging all beliefs for which there is any possible reason for doubt. By applying this methodological doubt, he finds a set of propositions about which he cannot be mistaken. These include certainties about the contents of his mind, for example, about the way things appear to him. Applying the infallible method of deductive reasoning to this foundation of certainties, Descartes claims to build all knowledge in the way that the theorems of geometry are derived from axioms and definitions, thereby eliminating the possibility of error. Descartes argues that foundationalism is the only way to account for our knowledge of the world external to ourselves and, thereby, to refute skepticism.
Locke, Berkeley, and Hume, the British Empiricists, reject the claim that knowledge requires certainty and argue that Descartes’ deductive proof of the external world is unsound. They agree with Descartes that the foundation of empirical knowledge is sense-data but maintain that knowledge of the external world is merely justified as probable. As this probable justification is fallible and can justify false beliefs, the British Empiricists think that foundationalism is true, but compatible with skepticism.
Bertrand Russell, the 20th century founder of Anglo-American Analytic Philosophy, picks up the mantle of British Empiricism. He too advocates foundationalism, claiming that we have epistemically privileged access (what he calls knowledge by acquaintance) to a foundation of sense-data. Russell, like his British Empiricist predecessors, thought that all empirical knowledge rested on this foundation, but did not claim that external world skepticism was refuted by foundationalism.
Russell’s empiricist successors, the logical empiricists or logical positivists, assumed that empirical knowledge was possible, as they viewed science as the paradigm example of knowledge. Hans Reichenbach, a leading proponent of logical empiricism, rejects Russell’s and Descartes’ view, arguing that empirical knowledge is not justified, and need not be justified, by a set of foundational beliefs to which we have epistemically privileged access. He claimed that, like scientific claims, empirical propositions are justified by their conformance with other merely probable propositions.
C.I. Lewis, a leading figure in 20th century American Philosophy (and Chisholm’s doctoral dissertation advisor), engages in a famous debate with Reichenbach on this very issue (see: Lewis 1952, Reichenbach 1952, van Cleve 1977, Legum 1980). In “The Given Element in Empirical Knowledge” (Lewis 1952), he argues that empirical knowledge must rest on a foundation of certainty, hence, foundationalism is the only viable alternative. Lewis’s rejection of Reichenbach’s position is based on the claim that there cannot be an infinite regress of justified beliefs. While agreeing that many empirical beliefs are justified because they are probable, he argues:
Some empirical propositions are justified because they are probable;
No proposition is probable unless some proposition is certain;
Therefore, there cannot be an infinite regress of merely probable beliefs;
Therefore, empirical knowledge must rest on a foundation of epistemic certainties.
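Laid out schematically, the regress argument has a simple logical shape. The following is a minimal sketch; the predicate letters Prob and Cert are ours, introduced only to display the structure:

\[
\begin{array}{ll}
\text{P1:} & \exists p\,\mathrm{Prob}(p) \quad \text{(some empirical propositions are justified as probable)}\\
\text{P2:} & \forall p\,\bigl(\mathrm{Prob}(p) \rightarrow \exists q\,\mathrm{Cert}(q)\bigr) \quad \text{(no proposition is probable unless some proposition is certain)}\\
\text{C:} & \exists q\,\mathrm{Cert}(q) \quad \text{(some proposition is certain, a foundation)}
\end{array}
\]

On this reading, the impossibility of an infinite regress is what supports P2: if nothing were certain, each merely probable proposition would have to borrow its probability from another, without end.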
a. The Myth of the Given and the Stopping Place for Socratic Questioning
Chisholm finds Descartes’ approach to epistemology attractive but is not persuaded by Descartes’ defense of foundationalism. Chisholm also endorses Lewis’s premise that if any proposition is probable then some proposition is certain (Chisholm 1989, pg. 14), but takes a different tack to defend foundationalism. Chisholm, not one to accept any philosophical dogma on something akin to religious faith, appeals to his method of developing a theory of knowledge in support of foundationalism. He suggests adopting the hypothesis that we know, more or less, what we think we know, and then, by asking Socratic questions about what justifies our believing the things we think we know, developing an account of their justification.
In Perceiving (Chisholm 1957, pg. 54) and in Theory of Knowledge (Chisholm 1966, pg. 18) Chisholm compares justifying a belief to confirming a scientific hypothesis. A hypothesis is confirmed by the evidence the scientist has supporting the hypothesis. Similarly, a belief is justified by the evidence a person has that supports it. Scientists often seek out further evidence to confirm a hypothesis. By contrast, only evidence already ‘internally’ possessed can justify a belief.
Chisholm claims that when we consider cases of empirical propositions which we think we know, we discover through Socratic questioning what justifies our believing these propositions. This process may begin by considering a proposition that I think I know, for example, that:
There is a tree.
To identify what justifies my believing this, we need to answer the Socratic question, ‘What justifies me in believing this proposition?’. One might mistakenly think that, as this proposition is obviously true, the proper answer to the Socratic question is the proposition itself, that is, that there is a tree. However, this would be to misunderstand what the Socratic question is asking. It is not asking from what other beliefs I have inferred this proposition. The question is, “What evidence do I currently possess that supports my believing this proposition?” Sitting here in my office, looking out my window at a tree, I clearly have evidence that there is a tree, namely, that:
I see a tree.
This answer does not imply that I am currently thinking, or have ever thought, about seeing a tree, nor that I consciously believe that I see a tree and from this I infer that there is a tree. It merely implies that the proposition that I see a tree is evidence which is already available to me and which would serve to justify my belief that there is a tree.
This answer, however, is only the first part of a complete answer to the Socratic question “Why do you think that there is a tree?” or “What justifies you in thinking that there is a tree?”. The second part of the answer to the Socratic question is a rule of evidence, in this case a rule specifying conditions related to the proposition serving as evidence which are sufficient for being justified in believing the proposition in question, for example:
RE1. If S is justified in believing that S sees a tree, then S is justified in believing that there is a tree.
The answer to a Socratic question identifies two things: (1) a proposition that serves as evidence for the proposition in question, and (2) a rule of evidence specifying conditions met by the evidence which are sufficient for a person to be justified in believing the proposition in question.
This does not yet amount to a complete account of my being justified in believing that there is a tree. A proposition cannot justify a person’s belief unless the person is justified in believing it. This in turn suggests another step or level is required in the process of Socratic questioning, that is, “What justifies my believing that I see a tree?” The first part of the answer is the evidence that I have for believing this, for example:
I seem to see a tree.
This proposition asserts that I am experiencing a certain psychological or phenomenological state of the kind that I would have in cases where I am actually seeing a tree, dreaming that I am seeing a tree, or hallucinating that I am seeing a tree. The second part of the answer to this question is a rule of evidence, in this case:
RE2. If S is justified in believing that S seems to see a tree and has no evidence that S is not seeing a tree, then S is justified in believing that S sees a tree.
This in turn raises the next step or level in the process of Socratic questioning, that is, “What justifies my believing that I seem to see a tree?” The appropriate answer to this question in this case is not some other proposition that serves as evidence that I seem to see a tree. Rather, the truth of the proposition, that is, my having the psychological experience of seeming to see a tree, is my evidence for believing that I seem to see a tree. The second part of the answer is a rule of evidence like the following:
RE3. If it is true that S seems to see a tree, then S is justified in believing that S seems to see a tree.
This rule of evidence is a different kind than the others encountered in the process of Socratic questioning. This rule conditions justified belief on the truth of a proposition, in contrast to the other rules which condition justified belief on being justified in believing another proposition.
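Laid out as a chain, the three rules of evidence fit together as follows. The notation J_S(p), for ‘S is justified in believing p’, is ours, introduced only to display the structure:

\[
\begin{array}{ll}
\text{RE3:} & \text{S seems to see a tree} \;\Rightarrow\; J_S(\text{S seems to see a tree})\\
\text{RE2:} & J_S(\text{S seems to see a tree}) \wedge \text{no counter-evidence} \;\Rightarrow\; J_S(\text{S sees a tree})\\
\text{RE1:} & J_S(\text{S sees a tree}) \;\Rightarrow\; J_S(\text{there is a tree})
\end{array}
\]

Only RE3 conditions justification on the truth of a proposition; RE1 and RE2 condition it on further justified beliefs, which is why RE3 marks the stopping place.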
Chisholm asks whether our process of Socratic questioning goes on without end, ad infinitum, either justifying one claim with another or going around in a circle. He believes that “…if we are rational beings, we will do neither of these things. For we will find that our Socratic questions lead us to a proper stopping place.” (Chisholm 1977, pg. 19). We come to a final empirical proposition whose justification is that the proposition believed is true.
When we encounter an answer that the truth of the proposition justifies believing the proposition, Chisholm points out, we have reached the stopping place of Socratic questioning, that is, we have completed the account of the justification of the initial proposition. Furthermore, according to Chisholm, we typically find that the proposition reached at the end of the process of Socratic questioning describes the person’s own psychological state, that is, describes the way things appear to that person. Thus, at least hypothetically, Chisholm identifies a class of beliefs which may serve as the foundation of all empirical knowledge. This is tantamount to saying that if we know any empirical or perceptual propositions, then believing these propositions is, at least in part, justified by the relationship they have to the psychological proposition describing the way things appear to us.
Chisholm claims that when we consider cases of empirical propositions which we think we know, we discover that the process of Socratic questioning, that is, the account of the justification for believing these propositions, comes to a proper stopping place in a finite number of steps. When we reach the final stage of Socratic questioning, we have discovered, as foundationalism implies, the foundation on which empirical knowledge rests. In contrast to Descartes, Chisholm does not think that the only alternative to skepticism is foundationalism. While he may agree with C. I. Lewis that you cannot have an infinite regress of propositions which are probable, he does not claim that this proves that there is no viable alternative to foundationalism. Chisholm thinks that the fact that we find a proper stopping place to Socratic questioning makes it plausible to accept foundationalism as a postulate or an axiom for developing a theory of knowledge. Whether foundationalism is acceptable, for Chisholm, should be judged by how well it explains the justification of empirical beliefs. Thus, to defend foundationalism, Chisholm presents one of the most detailed and complete explanations answering the two fundamental questions: (i) What makes the foundational beliefs about one’s psychological states or sense-data ‘self-justified’ (or, as Chisholm calls them, ‘directly evident’)? (ii) How does the foundation of sense-data serve as the ultimate justification of all empirical knowledge? Chisholm’s answers to these two critical questions are discussed in the next section.
4. The Directly Evident—The Foundation
Descartes proposed that empirical knowledge must rest on a foundation of certainties, propositions about which one cannot be mistaken. His foundation was composed of propositions about which even an all-powerful and all-knowing evil genius could not deceive him. It included not only logical or necessary truths, but also psychological propositions about himself. These propositions are of the form I exist…, I doubt…, I understand…, I affirm…, I deny…, I imagine…, and, most importantly, I have sensory perceptions of…. Propositions of the last type, that is, propositions about one’s psychological or phenomenological states describing the raw data of the senses, are perhaps the most crucial propositions upon which all empirical knowledge is founded. These sense-data propositions, expressed by statements like ‘I seem to see a fire’, describe the inner workings of one’s mind, and do not imply the existence of anything in the external world.
Locke, Berkeley, and Hume (the British Empiricists) recognize that the data of the senses, that is, sense-data, serve as the foundation of empirical knowledge. Bertrand Russell agrees with them that these propositions constitute the foundation of empirical knowledge and claims that they have a privileged epistemic status, which he dubs knowledge by acquaintance. C. I. Lewis agrees that empirical knowledge rests on a foundation of sense-data, which are the given element in empirical knowledge.
Many 20th century empiricists were skeptical about the existence of sense-data, which led them to doubt that empirical knowledge rests on a foundation of epistemically privileged sense-data. Gilbert Ryle, for example, raised this type of objection and argued that sense-data theory is committed to an untenable view of the status of appearances. Chisholm enters the historical debate, defending foundationalism from Ryle’s objection in one of his earliest papers on epistemology, “The Problem of the Speckled Hen” (Chisholm 1942).
a. Sense-Data and the Problem of the Speckled Hen
Ryle asks you to suppose that you take a quick look at a speckled hen which has 48 speckles in your field of vision. According to the view under consideration, you have a 48 speckled sense-datum. If, as foundationalism claims, you can never be wrong about the sense-data that present themselves to the mind, it would seem that you could never be mistaken in thinking that you were presented with a 48 speckled datum. Ryle’s point is that while we might concede that one could never be mistaken in thinking that a sense-datum had two or three speckles, as the number of speckles gets sufficiently large, for example, 48, we may be mistaken about the number of speckles in the sense-datum. Chisholm points out that this is not an isolated problem in an odd situation, but that similar issues can be raised concerning most perceptual presentations, that is, most sense-data are complex like the speckled hen case.
A.J. Ayer, a leading logical positivist and a defender of foundationalism, replies that the example is mistaken, arguing that any inaccuracy introduced in counting the speckles can be accounted for because sense-data do not have any definite number of speckles (Ayer 1940). Chisholm points out that it is odd to think that the sense-data caused by looking at a hen having a definite number of speckles do not have a specific number of speckles. Thus, Ayer must adopt one of two unacceptable positions. The first is that it is neither true nor false that the hen sense-datum has 48 speckles. This amounts to saying that certain propositions expressing sense-data are neither true nor false. But this is hardly acceptable because it would commit one to denying the logical law of the excluded middle. The alternative would be that while the hen sense-datum had many speckles, it did not have any specific number of speckles. Chisholm argues that this is untenable because it is like claiming that World War II would be won in 1943, but not in any of the months that make up 1943.
Chisholm thinks that the Problem of the Speckled Hen demonstrates that not all sense-data propositions are foundational. One’s justification for believing complicated sense-data propositions, for example, that I seem to see a 48 speckled hen, is not the propositions themselves, but other sense-data propositions, for example, that I seem to see an object with red speckles. Chisholm grants that complex sense-data propositions can and often do involve judgments that go past what is presented or given in experience. Such propositions assign properties to our experience that compare the given experience to another experience. Any such judgment goes beyond what is given or presented in the experience and, as such, introduces the possibility of being mistaken in the comparison. When a sense-data judgment goes beyond what is presented in experience, its justification is not the truth of the proposition itself; it is justified by other sense-data propositions which are in turn either simpler or foundational.
Chisholm concludes that the class of sense-data propositions is larger than the class of epistemically foundational or basic propositions. Only the subset comprised of simple sense-data propositions, for example, propositions about colors and simple shapes that appear, may be foundational beliefs. The challenge for Chisholm is two-fold: (a) to provide an account of which sense-data propositions are foundational, or, as he calls them, Directly Evident, that avoids the metaphysical pitfalls Ryle identified with sense-data; and (b) to identify what enables them to serve as the foundation of perceptual knowledge, that is, to explain their privileged epistemic status.
b. The Directly Evident—Seeming, Appearing, and the Self-Presenting
In short, Chisholm claims that what justifies our believing some proposition can be determined by a process of Socratic questioning which identifies the evidence we have for believing the proposition, and then the evidence we have for believing the evidence, until we reach a proper stopping point. The stopping point is reached when the proper answer is that the evidence that justifies one in believing the proposition is the truth of the proposition itself. These propositions whose truth constitutes their own evidence are the given element in empirical knowledge, that is, the Directly Evident.
The following example of a line of Socratic questioning illustrates Chisholm’s point. Suppose I know that:
(1) There is a blue object.
In response to the question of what evidence I have for believing this, I may cite that:
(2) I perceive (see, feel, hear, and so forth) that there is a blue object.
In response to the question of what evidence I have for accepting (2), I would cite that:
(3) I seem to see something blue (or, alternatively, I have a blue sense-datum).
When we reach an answer like (3), we have reached Chisholm’s proper stopping point of Socratic questioning. On Chisholm’s view, psychological or phenomenological propositions like (3) are self-justifying or self-presenting; they are the given element in empirical knowledge, and they serve as the foundation of perceptual knowledge.
Chisholm defends Descartes’ and C. I. Lewis’ assertion that propositions which describe a person’s phenomenological experience, that is, propositions which describe the way that things seem or appear to a person, are important constituents of the foundation of perceptual knowledge. These phenomenological propositions which constitute the foundation of perceptual knowledge are expressed by statements using ‘appears’ or ‘seems’, and they do not imply that one believes, denies, has evidence supporting, or is hedging about whether there is something that actually has a certain property. Rather, ‘appears’ or ‘seems’ describes one’s sensory or phenomenological state, for example, that I seem to see something white.
Chisholm distinguishes comparative and non-comparative uses of ‘appears’ in statements describing one’s sensations or phenomenological state. The comparative use describes the way that we are appeared to by comparing it with the way that certain physical objects have appeared in the past. Thus, when I use ‘appears’ in the comparative way, I am “saying that there is a certain manner of appearing, f, which is such that: (1) something now appears f, and (2) things that are blue may normally be expected to appear f.” (Chisholm 1977 pg. 59). By contrast, if I use ‘appears’ in the noncomparative way, I am saying that there is a blue way of appearing (or seeming) and I am now in this phenomenological state or having this kind of phenomenological experience. Chisholm claims that only those propositions expressed by sentences using the noncomparative descriptive phenomenological sense of ‘appear’ or ‘seems’ are directly evident.
Chisholm’s solution to the Problem of the Speckled Hen is that sense-data compose the given element in empirical knowledge, that is, the foundation on which all perceptual knowledge stands, but not all sense-data are foundational. Only sense-data statements referring to some sensory characteristics are candidates for this special status, and these can be called basic sensory characteristics. It should be said that, at least for most of us, characteristics like the speckled hen’s appearing to have 48 speckles are not basic sensory characteristics, and propositions involving them are therefore not foundational beliefs. Rather, only appearance propositions using the basic or simple sensory characteristics, for example, basic visual characteristics (for example, blue, green, red), olfactory characteristics, gustatory characteristics, auditory characteristics, or tactile characteristics, will be candidates for the directly evident.
One might wonder what distinguishes appearing blue from appearing to have 48 speckles such that the former is a basic sensory characteristic while the latter is not. Most people can recognize a triangle at a glance and do not need to count the three sides or angles in order to recognize that the object is a triangle. Moreover, at a glance, we can distinguish it from a square, rectangle, or pentagon. Contrast that with recognizing a chiliagon (a 1000 sided polygon). Other than perhaps a few geometric savants (perhaps Thomas Hobbes, who famously attempted to square the circle), we cannot recognize a chiliagon at a glance. In fact, we would have to go through a long process of counting to discover that a given polygon in fact had 1000 sides. Clearly, appearing chiliagon shaped is not going to be a basic sensory characteristic, in contrast to appearing triangular.
Like most adults I can discern a triangle immediately, while very young children cannot. A child playing with a toy containing holes of different shapes and blocks to be inserted into the corresponding shaped hole may have difficulty matching the triangle to the triangular hole, indicating that it is difficult for the very young child to recognize a triangle. It seems reasonable to conclude that appearing triangular is a basic sensory characteristic for me but not for the very young child. Thus, one and the same characteristic may be a basic sensory characteristic for one person while not for another, depending on their visual acuity. Moreover, visual acuity may change from time to time for the same person; hence, the same characteristic may be a basic sensory characteristic for a person at one time and not at another. (Chudnoff 2021 discusses empirical evidence that training can help one develop new recognitional abilities.)
The distinction between basic sensory characteristics and non-basic ones is based on whether or not a person requires evidence to be justified in believing that the sensation has a certain characteristic. For most of us (at least those of us who are not color-blind), being justified in believing the proposition that I seem to see something green would require no evidence beyond our phenomenological state or experience. By contrast, being justified in believing that I seem to see a 48 speckled thing would require our having evidence from counting up the speckles. Thus, being 48 speckled would not be a basic sensory characteristic. By contrast, being 5 speckled (or fewer) would, for most of us, be a basic sensory characteristic. The test of whether a sensory characteristic is basic is the answer to the Socratic question of what justifies the person in believing that it is an experience of that characteristic.
Chisholm’s solution to the Problem of the Speckled Hen addresses the metaphysical concerns about sense-data. A standard view of sense-data is that if I am looking at a white wall that is illuminated by red lights, there is a red wall sense-datum, which is really red, and this object is what ‘appears before my mind’. Philosophers have objected to the sense-data theory’s dependence upon the existence of non-physical ghost-like entities serving as intermediaries between physical objects and the perceiver. Ayer, for example, proposed that these odd metaphysical entities may have seemingly contradictory properties, for example, having many speckles but no specific number of speckles. Others rejected these metaphysical claims as entailing skepticism about the external world, since we would only have access to the sense-data. Chisholm intends his theory to account for the epistemic properties of sense-data, that is, that they are directly evident, without entailing the objectionable metaphysical assumption that sense-data are ghost-like entities.
Chisholm explains that, if we are going to be precise, ‘something appears f to me’ is not directly evident because this implies that there are objects, sense-data, which appear to me and for which it is proper to seek further justification. Consider a case in which I am hallucinating a green table, so that it is true that:
I seem to see a green table.
A defender of sense-data would say that this means the same as:
There is a green sense-datum which is appearing to me.
But it is perfectly proper to seek justification for the belief that a green sense-datum exists; hence, the proposition that I seem to see a green table is not directly evident.
Such examples suggest to Chisholm that a better formulation of the statement which expresses the directly evident is “I am experiencing an f appearance.” “But,” he is concerned that, “in introducing the substantive ‘appearances’ we may seem to be multiplying entities beyond necessity.” (Chisholm 1977, pg. 29). Chisholm, therefore, wants to avoid referring to any unusual entities like sense-data in statements intended to express the directly evident. The reason cited for avoiding references to sense-data is parsimony, that is, the principle of assuming the existence of no more types of objects than required.
Chisholm shows us how to get rid of the substantives in appear-statements. We begin with a statement:
a. Something appears white to me;
which is to be rewritten more clearly as:
b. I have a white appearance.
But this sentence contains the substantive ‘appearance’. To avoid reference to any strange metaphysical entities, sense-data, the sentence must be rewritten to read as:
c. I am appeared white to.
Chisholm notes that we have not yet succeeded in avoiding referring to sense-data, for ‘white’ is an adjective and, thus, must describe some entity (at least according to the rules of English grammar). We, however, want ‘white’ to function as an adverb that describes the way that I am appeared to. Thus, the sentence becomes:
d. I am appeared whitely to;
or, to put it in a somewhat less awkward way,
e. I sense whitely.
Chisholm does not propose that we should use this terminology in our everyday discourse, nor even that we should use this terminology whenever we are discussing epistemology. Rather, he wants us to keep in mind that when we use sentences like (a), (b), and (c) to express the foundation of empirical knowledge, what we are asserting is what (d) and (e) assert.
Chisholm concludes that (d) and (e), the directly evident propositions, do not imply that there are non-corporeal entities, sense-data, that are something over and above the physical objects and their properties which we perceive. When our senses deceive us, we are not seeing, hearing, feeling, smelling, or tasting (that is, perceiving) a non-physical entity. Rather, we are misperceiving or sensing wrongly, which is why Chisholm calls this the Adverbial Theory. We are sensing in a way that, if taken at face value, gives us prima facie evidence for a false proposition, namely, that we are actually perceiving something that has this property; hence, we do not know the proposition that we are perceiving something to have this property.
Chisholm maintains that our empirical knowledge rests on a foundation of propositions expressed by true noncomparative appear statements which we are sufficiently justified in believing to know, or, to use his technical term, which are evident. He further asserts that they have the highest level of epistemic justification, that is, they are certain. However, in saying that they are certain, Chisholm is not endorsing Descartes’ view that they are incorrigible, that is, that we can never be wrong in believing these propositions. Nonetheless, Chisholm is agreeing with Descartes that they have a special level of justification, that is, they are in some sense self-evident or, as he prefers, directly evident.
To explain this special epistemic status Chisholm appeals to Gottfried Wilhelm Leibniz’s notion of primary truths, which are of two types: primary truths of reason and primary truths of fact. A paradigm primary truth of reason is a mathematical truth like a triangle has three sides. Such truths are knowable, a priori, independently of experience, because the predicate of the statement, having three sides, is contained in the subject, triangle (Leibniz 1916, Book IV Chapter IX). Knowing primary truths of reason requires no proof, rather they are immediately obvious. Leibniz claims that similarly our knowledge of our own existence and our thoughts (the contents of our own minds) are primary truths of fact immediately known through experience, a posteriori. There is no difference between our knowing them, their being true, and our understanding or being aware of them.
Leibniz likens our immediately intuiting the truth of basic logical truths to our direct awareness of our psychological or phenomenological states at the time they occur. We are directly aware of both primary truths of reason and primary truths of fact because the truth of the propositions themselves is what justifies us in believing them. We reach the proper stopping point in Socratic questioning when we have reached a primary truth of fact, a proposition describing our psychological or phenomenological state when it occurs. Its truth (or the occurrence of the state) constitutes our immediate justification in believing, hence, knowing such propositions. There is no need to appeal to any other reason to justify our believing them; hence, they are directly evident.
In explaining the epistemic status of appearances, Chisholm appeals to Alexius Meinong’s observation that psychological or phenomenological states of affairs expressed by propositions of the form ‘I think …’, ‘I believe …’, ‘I am appeared to …’, and so forth, are self-presenting in the sense that whenever these propositions are true, they are certain for the person (Chisholm 1989 pg. 19). Thus, when one is appeared to whitely (in the non-comparative sense of ‘appeared to’), one is justified in believing this and there is no proposition that the person is more justified in believing. On Chisholm’s view these self-presenting propositions, for example, about the way things (non-comparatively) appear, are paradigm examples of the directly evident, and hence serve as the foundation of empirical knowledge.
Chisholm explains that while all self-presenting propositions are directly evident, not all directly evident propositions are self-presenting. The directly evident, the foundation on which empirical knowledge rests, also contains propositions that are related to, but are not themselves, propositions that are self-presenting. Chisholm writes:
But isn’t it directly evident to me now both that I am thinking and that I do not see a dog? The answer is yes. But the concept of the directly evident is not the same as that of the self-presenting. (Chisholm 1977, pg. 23)
Thus, according to Chisholm, the foundation of empirical knowledge is composed of a broader class of propositions that includes the class of self-presenting propositions, their logical implications, and the negations of the self-presenting. He uses the term directly evident to designate this class of foundational beliefs, which he defines as:
Def 2.2 h is directly evident for S =Df. h is logically contingent; and there is an e such that (i) e is self-presenting for S, and (ii) necessarily, whoever accepts e accepts h. (Chisholm 1977 pp. 23-24)
Thus, for example, Descartes’ first foundational proposition, that I exist, is directly evident for me whenever I think, for this latter proposition is self-presenting for me, and (per Descartes’ insight) necessarily whoever accepts the proposition that I think accepts the proposition that I exist.
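Schematically, Descartes’ case instantiates Def 2.2 as follows; the lettering is ours, introduced only to display how the clauses of the definition are satisfied:

\[
\begin{array}{l}
e = \text{I am thinking} \quad (\text{self-presenting for } S)\\
h = \text{I exist} \quad (\text{logically contingent})\\
\Box\bigl(S \text{ accepts } e \rightarrow S \text{ accepts } h\bigr)\\
\therefore\; h \text{ is directly evident for } S \quad (\text{by Def 2.2})
\end{array}
\]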
The class of propositions which are directly evident for a person includes propositions concerning the person’s occurrent beliefs and thoughts and propositions describing the way that things appear to the person. These latter propositions are expressed by noncomparative appear-statements of the form ‘I am appeared F-ly to’. This class of propositions serves as the foundation of knowledge, the set of propositions in relationship to which all other propositions are justified.
5. The Truths of Reason and A Priori Knowledge
Leibniz divided true propositions into two types: truths of reason and truths of fact. Truths of reason are necessarily true and their negation is impossible, while truths of fact are contingent and their negation is possible. Leibniz’s division is also based on the source of knowledge of propositions of each kind. We find out that a necessary proposition is true by analyzing it into simpler ideas or truths until we reach what he calls ‘primary truths’. He concludes that the source of knowledge of necessary truths is reason, and that such truths can be known a priori, that is, independently of experience, while the source of knowledge of contingent truths is experience, and such truths can be known a posteriori.
The focus to this point has been on Chisholm’s views on the empirical foundation of knowledge or foundational knowledge a posteriori. Chisholm believes that some of our knowledge is based on necessary truths which are known a priori. He provides the following account of the justification of necessary truths, including logical truths, mathematical truths, and conceptual truths, explaining how some of these truths serve as evidence for empirical knowledge.
a. The Justification of A Priori Knowledge
Chisholm appeals to Leibniz, Frege (the late 19th and early 20th century German philosopher, logician, and mathematician), and Aristotle to explain the basis of a priori knowledge. Leibniz writes: “The immediate awareness of our existence and of our thoughts furnishes us with the first a posteriori truths of facts… while identical propositions”, [propositions of the form A=A], “embody the first a priori truths or truths of reason… Neither admits of proof, and each can be called immediate.” (Leibniz 1705, Book IV, Ch 9). The traditional term for Leibniz’s first truths of reason is ‘axioms’. Frege explains: “Since the time of antiquity an axiom has been taken to be a thought whose truth is known without being susceptible of proof by a logical train of reasoning” (Chisholm 1989 pg. 27). Chisholm explains the meaning of ‘incapable of proof’ by appealing to Aristotle’s suggestion in Posterior Analytics that “[a]n axiom or ‘basic truth’… is a proposition ‘which has no proposition prior to it’; there is no other proposition which is ‘better known’ than it is” (Chisholm 1989 pg. 27).
Chisholm proposes that axioms are necessary propositions known a priori serving as foundational propositions. They are similar in the following respect to the self-presenting (directly evident) propositions about how we are appeared to, for example, that we are appeared redly to. When we are appeared to in a certain way, we are justified in believing the proposition that we are appeared to in this way; in Chisholm’s terminology, they are evident to us. We ‘immediately’ know about these mental states because they present themselves to us. Analogously, there are some necessary truths which are evident to us ‘immediately’ upon thinking about them.
Chisholm defines axioms, the epistemically foundational propositions, as follows:
D1
h is an axiom =Df h is necessarily such that (i) it is true, and (ii) for every S, if S accepts h, then h is certain for S. (Chisholm 1989 pg. 28)
His examples of axioms include:
If some men are Greeks, then some Greeks are men;
The sum of 5 and 3 is 8;
All squares are rectangles.
Notice that according to this definition, if a person accepts a proposition which is an axiom, the proposition is certain for that person. But being an axiom is not sufficient for the proposition’s being evident or justified for a person. The person may never have considered the proposition or, worse, may believe or accept that it is false, and hence cannot be justified in believing it at all. For an axiom to be certain or evident (justified) for a person, the person must also accept the proposition.
Chisholm, therefore, adds the condition that the person accepts the proposition, defining a proposition’s being axiomatic for S as:
D2
h is axiomatic for S =Df (i) h is an axiom, and (ii) S accepts h. (Chisholm 1989 pg. 28).
Thus, for the proposition that all squares are rectangles to be axiomatic for a person requires not only that the proposition be an axiom, that is, necessarily true and necessarily such that if the person accepts it then it is certain for the person, but also that the person believes or accepts the proposition. Note that a proposition which is axiomatic for a person has the highest level of justification for the person, putting axiomatic propositions on a par, epistemically, with propositions that are directly evident.
Chisholm claims that the class of propositions that are axiomatic is the class of foundational propositions known a priori. There are also non-foundational propositions known a priori. For example, propositions that are implied by axioms may also be known a priori. However, it is not sufficient that the axiom implies the other proposition for the second proposition to be known a priori, as that would imply that all implications of axioms are also justified, whether the person is aware of the implications or not. Rather, it must also be axiomatic for the person that the axiom implies the other proposition. Suppose, for example, that it is axiomatic for a person that all interior angles of a rectangle are right angles, and also that it is axiomatic for that person that something’s being a square implies that it is a rectangle. In that case, the proposition that all the interior angles of a square are right angles is also known a priori for that person.
As it is axiomatic for every person that any proposition implies itself, axiomatic propositions are also known a priori. The theorems of logic or mathematics are also known a priori, as long as the person accepts the axiom and as long as it is axiomatic for that person that the axiom implies the theorem. Chisholm adds these additional propositions to the class of a priori knowledge by defining a priori knowledge as:
D3
h is known a priori by S =Df There is an e such that (i) e is axiomatic for S, (ii) the proposition, e implies h, is axiomatic for S, and (iii) S accepts h. (Chisholm 1989 pg. 29)
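The square example given above fits D3 schematically as follows; the lettering is ours, introduced only to display the clauses:

\[
\begin{array}{l}
e = \text{All interior angles of a rectangle are right angles}\\
h = \text{All interior angles of a square are right angles}\\
(i)\;\; e \text{ is axiomatic for } S\\
(ii)\; \text{the proposition that } e \text{ implies } h \text{ is axiomatic for } S \quad (\text{via the axiom that squares are rectangles})\\
(iii)\; S \text{ accepts } h\\
\therefore\; h \text{ is known a priori by } S \quad (\text{by D3})
\end{array}
\]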
Chisholm defines a proposition’s being a priori as:
D4
h is a priori =Df It is possible that there is someone for whom h is known a priori.
(Chisholm 1989 pg. 31).
b. Chisholm, Kant, and the Synthetic A Priori
Kant distinguished two types of a priori propositions, analytic and synthetic. Roughly, an analytic proposition is one in which the predicate adds nothing new to the subject, for example, ‘all squares are rectangles’. It asserts that the concept of the predicate, rectangle, is a component of the complex concept of the subject, square, that is, equilateral rectangle. The underlying idea is that the concept of the subject can be analyzed in a way that includes the concept of the predicate. By contrast, synthetic propositions are propositions in which the predicate ascribes properties to the subject over and above what is contained in the concept of the subject, for example, the square is large.
It is generally thought that analytic propositions are not only necessarily true, but also a priori. However, as analytic propositions seem to be redundant and trivial, they appear to contribute little or no content to a person’s knowledge. This led Kant to raise the much-debated question of whether there are propositions which are synthetic and known a priori.
Chisholm argues that much of the debate concerning Kant’s question is based on a much broader concept of ‘analytic’ than the one which Kant had in mind. To clarify the epistemological importance of Kant’s question, Chisholm provides definitions of ‘analytic’ and ‘synthetic’. Underlying the concept of an analytic proposition are two concepts: that of one property implying another, and that of two properties being conceptually equivalent. He defines the first as:
D5
The property of being F implies the property of being G =Df The property of being F is necessarily such that if something exemplifies it then something exemplifies the property of being G. (Chisholm 1989 pg. 33)
Thus, for example, the property of being a bachelor implies the property of being single. The property of being a bachelor is necessarily such that if something exemplifies it, then something exemplifies the property of being single. He then defines what it is for two properties to be conceptually equivalent as:
D6
P is conceptually equivalent to Q =Df Whoever conceives P conceives Q, and conversely. (Chisholm 1989 pg. 33)
For example, the property of being a bachelor is conceptually equivalent to being a single male, as anyone conceiving of or thinking of a bachelor conceives of a single male, and vice versa, anyone who conceives of a single male conceives of a bachelor.
Chisholm defines the concept of an ‘analytic proposition’ in terms of the foregoing concepts as follows:
D7
The proposition that all Fs are Gs is analytic =Df The property of being F is conceptually equivalent to a conjunction of two properties, P and Q, such that: (i) P does not imply Q, (ii) Q does not imply P, and (iii) the property of being G is conceptually equivalent to Q. (Chisholm 1989 pg. 34)
A proposition which is not analytic is synthetic, as per the following definition:
D8
The proposition that all Fs are Gs is synthetic =Df The proposition that all Fs are Gs is not analytic. (Chisholm 1989 pg. 34)
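To see how D7 works, consider the proposition that all squares are rectangles. The following instantiation is ours, using the traditional analysis of square as equilateral rectangle:

\[
\begin{array}{l}
F = \text{square} \;\equiv\; P \wedge Q, \quad \text{where } P = \text{equilateral},\; Q = \text{rectangle}\\
(i)\;\; P \text{ does not imply } Q \quad (\text{an equilateral triangle is not a rectangle})\\
(ii)\; Q \text{ does not imply } P \quad (\text{an oblong rectangle is not equilateral})\\
(iii)\; G = \text{rectangle is conceptually equivalent to } Q\\
\therefore\; \text{the proposition that all squares are rectangles is analytic} \quad (\text{by D7})
\end{array}
\]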
Chisholm’s definitions clarify the philosophical importance of Kant’s question, which is whether a synthetic proposition—a proposition in which the predicate cannot be found in the analysis of the subject, that is, a proposition that is not redundant and adds content to the subject—can be known to be true a priori. Finding such a proposition implies that “the kind of cognition that can be attributed to reason alone may be more significant” (Chisholm 1989 pg. 34).
Chisholm suggests that there are four types of examples of synthetic a priori propositions. Examples of the first type are the propositions expressed by the following sorts of sentences: “Everything that is a square is a thing that has shape” and “Everyone who hears something in C-sharp major hears a sound”. Some have claimed that the property of being a square is conceptually equivalent to the property of having a shape conjoined with some additional properties. The second possible type of synthetic a priori propositions are what Leibniz refers to as ‘disparates’. An example of such a proposition is ‘Nothing that is red is blue’. Chisholm notes that while attempts have been made to show propositions of these two types to be analytic, none has been successful.
The third possible type of synthetic a priori propositions are statements of moral judgments like the one expressed by the following sort of sentence: “All pleasures, as such, are intrinsically good, or good in themselves, whenever and wherever they may occur”. Chisholm concurs with Leibniz’s assertion that while such propositions can be known, no experience of the senses could serve as evidence in their favor.
The final possible type of synthetic a priori propositions are propositions of arithmetic. Kant asserted that propositions like 2 + 1 = 3 are not analytic; hence, they are synthetic. Some might question whether the propositions they assert are of the right form, that is, all Fs are Gs, but there may be a way to formulate them in the ‘all Fs are Gs’ form. While the principles of arithmetic have been analyzed in terms of sets, this has not been done in such a way that the predicate can be analyzed out of the subject, in which case they have yet to be shown to be analytic.
While various epistemic principles to account for the justification of certain types of propositions are discussed in this section, more of these epistemic principles are examined in the next one. Chisholm’s use of these principles raises meta-epistemological questions related to the status of these principles and the nature of epistemology itself. Are the epistemic principles necessary or contingent? Can they be known a priori or a posteriori? And are they analytic or synthetic? While Chisholm’s answers to these questions are not clear, it would not be surprising if he thought that they are synthetic a priori necessary truths.
6. The Indirectly Evident
Chisholm’s account of the Directly Evident explains and defends one of the two main theses of foundationalism. It identifies a set of propositions describing one’s psychological or phenomenological states, the way things appear or seem, as being self-presenting. These propositions and some of their logical consequences are epistemic certainties composing the directly evident foundation of empirical knowledge. The second main thesis of foundationalism is that these directly evident propositions ultimately serve, at least in part, as the justification of all empirical knowledge. To complete his theory of knowledge Chisholm undertakes to explain how empirical propositions which are indirectly evident are justified. In the process Chisholm undertakes to solve the problem of empiricism that has plagued epistemology since Descartes.
Chisholm’s account of the Indirectly Evident proposes to answer the fundamental question of how propositions about the external world are justified by propositions about one’s internal psychological experiences or states, solving what he calls the problem of empiricism. This problem finds its roots in Descartes and was inherited by his British Empiricist successors, Locke, Berkeley, and Hume.
Descartes proposed to solve this problem by employing deductive logic. Starting from his discovered foundation of certainties, composed of propositions about the contents of his mind and some necessary truths that could be proven with certainty, he sets out to derive, through deductive reasoning, the truth of propositions about the external world, following his method of Geometric proof.
Descartes argues, for example, that when he is having a certain sort of visual experience, he is certain of the proposition that I seem to see something red. From this certainty and some additional certainties, among them the proposition that God exists and is not a deceiver (for which he provides deductive proofs), he derives, using deductive reasoning, propositions about the external world, for example, that there is something red. In this manner, Descartes purports to have built knowledge about the external world from foundational certainties using only deductive reasoning, a method which cannot go wrong. This epistemological program, its methodology, and its associated philosophy of mind earned Descartes, and his European continental successors Baruch Spinoza and Gottfried Wilhelm Leibniz, the title of Continental Rationalists.
John Locke, the progenitor of British Empiricism, claims that all knowledge of the external world is based on experience. He argues that Descartes’ demonstrations are flawed, claiming that knowledge of the external world cannot be justified by applying deductive reasoning to the foundational propositions. Locke argues that a fundamental mistake in Descartes’ program is setting the standard of certainty for knowledge and epistemic justification too high. On Locke’s view, avoiding skepticism merely requires the probability of truth to account for the transfer of justified belief from the contents of the mind to propositions about the external world.
To account for the transfer of justification, Locke appeals to his empiricist philosophy of mind, according to which the mind is a blank slate, a tabula rasa, upon which sense-data are deposited. The data provided by the senses are the source of knowledge about the world. Knowledge of the external world is ultimately justified by the experience of the senses. Locke, allowing for fallible epistemic justification, claims that one can be completely justified in believing a proposition that is not entailed by the evidence or reasons that one has for believing the proposition.
Locke claims that the move from one’s sensing something to be red to the proposition that there is something that is red is justifiable because of the resemblance between the contents of one’s mind (sensations and ideas) and the objects in the external world which cause these sensations and ideas. Thus, for example, we are justified in believing the proposition that I am actually seeing something red because our idea or mental representation of the red thing resembles the object which caused it. Thus, according to Locke, any proposition about the external world is justified only if the mental representation resembles the corresponding physical object.
George Berkeley, Locke’s successor, finds Locke’s justification of beliefs about physical objects from their resemblance to the contents of one’s mind problematic. Berkeley is concerned that this inference is justifiable if, and only if, we have reason to think that the external world resembles our ideas. To be justified in believing that there is a resemblance one would have to be able to compare the ideas and the physical object. However, we cannot possibly compare the two. Noting that on Locke’s view we only have epistemic access to our ideas, Berkeley objects that this inference is problematic. We can never get “outside our minds” to observe the physical object and compare it to our mental image of it, and thus, we can have no reason to think that one resembles the other. Berkeley concludes that Locke’s view entails skepticism about the external world, that is, the belief that no empirical beliefs about the external world could ever be justified.
To avoid these skeptical consequences while maintaining that knowledge of the external world is based on sense-data, Berkeley advocates phenomenalism, the view captured by his slogan “esse est percipi” (“to be is to be perceived”). (Berkeley 1710, Part I Section 3). Berkeley’s phenomenalism claims that physical objects are made up of sense-data which are the permanent possibility of sensation. There are no physical objects over and above sense-data. Propositions about the physical world are to be reduced to propositions about mental experiences of perceivers, that is, phenomenological propositions.
Berkeley’s position can be clarified with an example. The proposition that the ball is red may be analyzed in terms of (and thus entails) propositions about perceivers’ sensations, for example, that the ball appears red, spherical shaped, and so forth. Common sense propositions about the physical objects composing the external world are justified on the basis of an inductive inference from the propositions describing how things appear to perceivers that are entailed by the external world proposition. The ball’s being red is confirmed by the phenomenal or mental sensations of spherical redness that perceivers have. One’s having certain red and spherical sense-data confirms, via induction, the proposition that the ball is red, which entails those sense-data.
The objection that Berkeley raises to Locke’s theory is a problem endemic to empiricism. It leaves open a gap that needs to be bridged to account for knowledge on the basis of the evidence of the senses. The gap to be explained is how sense-data, that is, propositions about one’s own mental states, justify one’s beliefs in propositions about objects in the external world. Berkeley avoids the problem of explaining the reason to think that one’s sensations resemble the real physical objects by adopting phenomenalism. In Berkeley’s metaphysics, Idealism, there are no non-phenomenal entities; physical objects are just the permanent possibility of sensation. The difference between sense-data that are veridical and non-veridical is that the veridical perceptions are sustained by God’s continuously having these sense-data in His mind. Thus, for Berkeley, the meaning of statements about physical objects may be captured by statements referring only to sense experience. In his theory physical objects just are reducible to sense-data. Berkeley proposes that God’s perceptions “hold” the physical universe together.
David Hume, the next luminary of British Empiricism, finds the dependence of epistemic justification on God’s perceptions to be unacceptable. Hume’s answer to the explanatory gap in Locke’s theory is that we naturally make the connection. One way of understanding Hume is by pointing to the fact that he adopts the view which in the 20th century would become known as naturalized epistemology, which relegates the explanation of the inference to science. Others have understood Hume as embracing skepticism with respect to the external world. Thomas Reid, Hume’s contemporary, invokes common sense to explain the inference. He argues that we have as good a reason to think that these common sense inferences are sufficient justification for knowledge as we have for thinking that deductive reasoning is sufficient justification for the derivation of knowledge.
Bertrand Russell rejects Berkeley’s view that there are really no physical objects, that they are just bundles of perceptions. Russell accounts for perceptual knowledge by claiming that we have direct access to sense-data, and that these sense-data serve as the basis of empirical knowledge. Russell, in The Problems of Philosophy, admits that “in fact ‘knowledge’ is not a precise conception: it merges into ‘probable opinion’.” (Russell 1912 pg. 134).
C. I. Lewis (Lewis 1946) proposes a pragmatic version of phenomenalism to bridge the explanatory epistemological gap between sense-data and the external world. Lewis agrees with the British Empiricists that sense-data are what is ‘directly’ experienced and, moreover, serve as the given element in empirical knowledge. Berkeley’s phenomenalism attempts to bridge the explanatory gap with metaphysical Idealism. Lewis agrees with Russell that this is problematic. Lewis proposes a version of phenomenalism that is compatible with metaphysical Realism. On this view, external world propositions entail an infinite number of conditional propositions stating that if one initiates a certain type of action (a test of the empirical proposition in question), then one experiences a certain type of sense-data. Thus, the explanatory gap is bridged by the rules of inductive inference.
Lewis’s example helps to explain his phenomenalistic account. Consider the external world proposition that there is a doorknob. Lewis claims that this proposition logically entails an unlimited number of conditional propositions expressing tests that could be undertaken to confirm that there really is a doorknob causing the experience. One such conditional would be “if I were to appear to reach out in a certain way, I would have a sensation of grasping a doorknob shaped object.” One’s undertaking to appear to reach out and grab the doorknob, followed by the tactile experience of doorknob sense-data, provides confirmation, hence, pragmatic justification for believing the proposition that there is a doorknob. Thus, according to Lewis, the justification of empirical beliefs is based on an inductive inference, having confirmed sufficiently many such tests.
a. Chisholm’s Solution to the Problem of Empiricism
Chisholm enters the fray, arguing (Chisholm 1948) that Lewis’s view is defective and thus fails to solve the problem of empiricism, that is, it cannot account for the inference from the mind to the external world. According to Lewis’s phenomenalism, statements about the external world, for example:
P. This is red;
entail propositions referring only to mental entities, sense-data, for example:
R. Redness will appear.
Chisholm argues that certain facts about perceptual relativity demonstrate that physical world propositions like P do not entail any propositions which refer exclusively to mental entities like R.
Chisholm explains that P, when conjoined with:
This is observed under normal conditions; and if this is red and observed under normal conditions, redness will appear;
entails R, that redness will appear. But P in conjunction with:
S. This is observed under normal conditions except for the presence of blue lights; and if this is red and observed under conditions which are normal except for the presence of blue lights, redness will not appear;
entails that R is false. If P conjoined with some other proposition that is consistent with P, in this case S, entails that R is false, then P cannot entail R. Perceptual relativity, the way that things appear in any circumstance being relative to the conditions under which the object is observed, makes it clear that S is consistent with P. However, P and S entail that redness will not appear, for red things do not appear red under blue light. Therefore, P does not entail R. Similarly, no physical object statement (like P) entails any proposition that is only about sense-data (like R). Chisholm concludes that Lewis’s phenomenalism is untenable.
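The logical core of Chisholm’s refutation can be set out compactly; the schematization is ours:

\[
\begin{array}{ll}
1. & P \wedge S \models \neg R \quad (\text{red things under blue light do not appear red})\\
2. & S \text{ is consistent with } P \quad (\text{by perceptual relativity})\\
3. & \text{Suppose } P \models R. \text{ Then } P \wedge S \models R, \text{ so } P \wedge S \models R \wedge \neg R,\\
   & \text{which is impossible, since } P \wedge S \text{ is consistent.}\\
4. & \therefore\; P \not\models R
\end{array}
\]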
The problem of empiricism, the justification of beliefs about physical objects based on perception, that Chisholm embarks on solving requires a plausible account of perceptual knowledge. The Lockean account claims that sense-data (mental ideas) justify or make evident beliefs about physical objects because sense-data resemble the physical objects. However, on Locke’s account we can never “get outside” the mind to observe the resemblance between the sense-data and the physical objects causing them. Thus, it fails to provide the requisite reason for thinking that physical objects resemble the mental images. Phenomenalism proposes to avoid this problem by claiming that propositions about physical objects entail purely psychological propositions about apparent experience, sense-data. Thus, inductive reasoning (or the hypothetico-deductive method) from the psychological propositions could provide justification for the physical propositions they entail, thereby justifying beliefs about the physical object.
But, Chisholm argues, propositions about physical objects do not entail any purely psychological propositions, that is, phenomenalism is false. Thus, if empiricism is to be salvaged, there must be another account or explanation of how propositions about appearances provide evidence that justifies beliefs about the physical world. Chisholm answers this problem with his account of the Indirectly Evident.
b. Epistemic Principles—Perception and Memory, Confirmation and Concurrence
Chisholm adopts three methodological assumptions in developing his epistemic principles or rules of evidence to account for empirical knowledge:
We know, more or less, what (upon reflection) we think we know (an anti-skeptical assumption);
On occasions when we do have knowledge, we can know what justifies us in believing what we know;
There are general principles of evidence that can be formulated to capture the conditions that need to be satisfied by the things that we know.
Carneades of Cyrene (c. 213-129 B.C.E.), the Ancient Greek Academic Skeptic who was a later head of Plato’s Academy, developed three skeptical epistemic principles to explain the justification that perception provides for beliefs about the external world. On Carneades’ view, while ‘perceiving’ something to be a cat is not sufficient for knowing that there is a cat, it nonetheless makes the proposition that there is a cat acceptable. While disagreeing with Carneades’ skeptical conclusion, Chisholm thinks that Carneades’ approach to developing epistemic principles is correct.
Chisholm formulates Carneades’ first epistemic principle related to perception as:
C1.
“Having a perception of something being F tends to make acceptable the proposition that something is an F.” (Chisholm 1977 pg. 68).
Having identified these acceptable propositions, Carneades notes that the perceptual propositions “hanging together like the links of a chain”, and which are “uncontradicted and concurring” (Chisholm 1977 pg. 69), including the perceptions of color, shape, and size, are also acceptable. The concurrence of these propositions makes all of them acceptable; hence his second principle:
C2.
Confirmed propositions that concur and do not contradict each other are more reasonable than those that do not.
Finally, the acceptable propositions that remain after close scrutiny and testing are even more reasonable than those that are merely acceptable. This is captured in a third epistemic principle:
C3.
“Concurrent propositions that survive ‘close scrutiny and test’ are more reasonable than those that do not.” (Chisholm 1977 pg. 70).
Pyrrho, Carneades’ Ancient Greek skeptic predecessor, believed that our common sense beliefs about the world are completely without rational support and thus that it is irrational to act upon them. This extreme form of skepticism is on its face unacceptable to most; for example, people in general do not try to walk off cliffs, nor think that doing so would be rational. By contrast, Carneades’ approach to epistemology, with its principles explaining how our beliefs can be rational or acceptable while falling short of knowledge, seems intuitively plausible. Thus, Chisholm adopts Carneades’ common sense approach in developing his account of the indirectly evident.
Chisholm points out that there are three possible ways in which the indirectly evident may be justified:
By the relationship they bear to what is directly evident;
By the relationship they bear to each other; and
By their own nature, independent of any relationship they bear to anything else.
He notes that theories that account for the justification of beliefs in the first way are considered foundationalist, and that those that account for the justification of beliefs in the second way are versions of coherentism or the coherence theory. Chisholm claims that the indirectly evident is justified in all three ways. This is consistent with his view that focusing on the ‘isms’ is not conducive to solving philosophical problems.
Chisholm claims that just about every indirectly evident proposition is justified, at least in part, by some relationship it bears to the foundation of empirical knowledge, the directly evident. The foundation cannot serve to justify anything else unless we are justified in believing the foundational propositions themselves. To remind us of how the directly evident is justified, Chisholm presents the following epistemic principle:
(A)
S’s being F is such that, if it occurs, then it is self-presenting to S that he is F. (Chisholm 1977 pg. 73).
The epistemic principles are formulated as schemata, that is, abbreviations for infinitely many principles. In (A), ‘F’ may be replaced by any predicate that would be picked from “a list of various predicates, each of such a sort as to yield a description of a self-presenting state of S.” (Chisholm 1977 pg. 73). This principle asserts that, for example, if I am appeared redly to, then it is self-presenting to me that I am appeared redly to. Moreover, whenever I am in such a state, for example, the state of being appeared redly to, the proposition that I am appeared redly to is evident, directly evident, for me.
Chisholm’s next two epistemic principles concern perception. These principles are intended to show how the indirectly evident is justified by the combination of their relationship to the foundation and by their own nature, that is, the nature of perception and memory. Some clarification of the terminology used in these principles will aid in understanding the theory. Chisholm uses ‘believes he/she perceives that’ to assert a relationship between a person and a proposition, for example, that Jones believes that she perceives that there is a cat. This can be true even when it is false that there is a cat in front of her, for example, even when she is hallucinating. An alternative way Chisholm sometimes expresses this is as ‘Jones takes there to be a cat’. Chisholm prefers to use ‘believes that she perceives’ in place of ‘takes’ as it makes the ‘that’-clause explicit, noting that we will assume that ‘believes that she perceives’ means simply that the person has a spontaneous nonreflective experience, one that she would normally express by saying, “I perceive that…”.
Chisholm observes that, to account for the justification of perceptual propositions, one might be inclined (as he was in the first edition of Theory of Knowledge) to formulate a principle along the lines of Carneades’ first epistemic principle: if a person believes that he perceives that there is a sheep, then the person is justified in believing that there is a sheep.
However, such a principle must be qualified because of cases like the following. Suppose that a person seems to see an animal that looks like a sheep, but also knows that (i) there are no sheep in the area, and (ii) many dogs in the area look like sheep. Contrary to what this principle implies, the person is not justified in believing that it is a sheep (but rather a dog).
Chisholm qualifies this principle to exclude cases like this one, in which the person has grounds for doubting the proposition in question. To formulate the qualification he defines the term ‘ground for doubt’: a person believes a proposition without ground for doubt when there is no conjunction of propositions, each acceptable for the person, that tends to confirm that the proposition in question is false.
Chisholm defines the requisite notion of confirmation as:
D4.1
e tends to confirm h =Df Necessarily, for every S, if e is evident for S and if everything that is evident for S is entailed by e, then h has some presumption in its favor for S.
He explains that confirmation is both a logical and an epistemic relation. If it obtains between two propositions, that is, if e confirms h, then necessarily it obtains between these two propositions (it is a matter of logic that it obtains). Furthermore, if e confirms h, and if one knew that e was true, one would also have reason for thinking that h was true. Chisholm cautions that from the fact that e confirms h, it does not follow that the conjunction e and another proposition, g, also confirms h. What we assert in saying that e confirms h may also be expressed by saying that h has a certain (high) probability in relation to e.
Armed with this notion of confirmation, he now defines what it is for something to be believed without ground for doubt as:
D4.3
S believes, without ground for doubt, that p =Df (i) S believes that p, and (ii) no conjunction of propositions that are acceptable for S tends to confirm the negation of the proposition that p. (Chisholm 1977 pg. 76)
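The two definitions can be rendered symbolically as follows (the notation is ours, not Chisholm’s): write Ev_S(e) for ‘e is evident for S’, Pres_S(h) for ‘h has some presumption in its favor for S’, B_S(p) for ‘S believes that p’, and C(e, h) for ‘e tends to confirm h’. Then:

\[
\begin{aligned}
\text{D4.1}\quad & C(e,h) \;=_{\mathrm{Df}}\; \Box\,\forall S\,\big[\,\mathrm{Ev}_S(e) \wedge \forall f\,(\mathrm{Ev}_S(f) \rightarrow e \vDash f) \;\rightarrow\; \mathrm{Pres}_S(h)\,\big]\\
\text{D4.3}\quad & S \text{ believes, without ground for doubt, that } p \;=_{\mathrm{Df}}\; \mathrm{B}_S(p) \;\wedge\; \neg\exists e\,\big[\,e \text{ is a conjunction of propositions acceptable for } S \,\wedge\, C(e,\neg p)\,\big]
\end{aligned}
\]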
Chisholm qualifies the epistemic principle under consideration as:
(B)
For any subject S, if S believes without ground for doubt that he perceives something to be F, then it is beyond reasonable doubt for S that he perceives something to be F. (Chisholm 1977 pg. 76).
While this principle justifies beliefs about the external world to a higher degree than did Carneades’ principle, it falls short of rendering them sufficiently justified for knowledge. Chisholm’s reason for this is that the property designated in the schema by F may be a property, like being a sheep, with respect to which the person has misclassified the object he is looking at.
Chisholm proposes a third epistemic principle which yields knowledge-level justification of propositions by restricting the properties that one perceives the object to have to the ‘proper objects’ of perception. These are sensible characteristics, such as visual characteristics like colors and shapes, auditory characteristics like loud and soft, tactile characteristics like smooth and rough, olfactory characteristics like spicy and burnt, gustatory characteristics like salty and sweet, and ‘common sensibles’ like movement and number. This principle states:
(C)
For any subject S and any sensible characteristic F, if S believes, without ground for doubt, that he is perceiving something to be F, then it is evident for S that he perceives something to be F. (Chisholm 1977 pg. 78)
This principle accounts for justifying the indirectly evident on the basis of the directly evident. Consider how it works in a case in which Jones is looking at a red object. Assume that she is appeared to redly (the object appears to her to be red) and that she has no evidence that would count against her actually perceiving that there is something red (that is, nothing acceptable to her tends to confirm that she is not actually perceiving something to be red). In such a case it is evident to her that (i) she actually perceives something to be red, and (ii) there is something red. Moreover, she knows these propositions, assuming that she believes them.
These epistemic principles account only for the justification of our beliefs concerning what we are perceiving at any given moment. To account for the justification of beliefs about the past, Chisholm proposes epistemic principles to account for the justification of beliefs based on memory. We should note that “‘memory’ presents us with a terminological difficulty analogous to that presented by ‘perception’.” (Chisholm 1977 pg. 79). Chisholm proposes that the expression ‘believes that he remembers’ be used in a way analogous to the way he uses ‘believes that he perceives’; that is, ‘S believes that he remembers that p’ does not imply the truth of p (nor does it imply that p is false).
Chisholm notes that “[s]ince both our memory and perception can play us false, we run a twofold risk when we appeal to the memory of a perception.” (Chisholm 1977 pg. 79). If we are justified in believing a proposition based on our seeming to remember that we perceived it to be true, we can go wrong in two ways: our perception or our memory (or possibly both) may mislead us. For this reason, Chisholm formulates principles for remembering that we perceived, principles along the same lines as those concerning perception but which take into account that the evidence provided by the memory of a perception is weaker than that provided by perception itself. The principles about memory of perceptions, therefore, justify propositions only to a lower epistemic level than do those of perception. Corresponding to (B), Chisholm proposes:
(D) For any subject S, if S believes, without ground for doubt, that he remembers perceiving something to be F, then the proposition that he does remember perceiving something to be F is one that is acceptable for S. (Chisholm 1977 pg. 80).
Restricting the range of ‘F’ to sensible predicates, he presents the following analogue to (C):
(E) For any subject S, if S believes, without ground for doubt, that he remembers perceiving something to be F, then it is beyond a reasonable doubt for S that he does remember perceiving something to be F. (Chisholm 1977 pg. 80).
Chisholm points out that if our memories of perceptions are reasonable, then so too must be our memories of self-presenting states, and thus proposes:
(F) For any subject S and any self-presenting property F, if S believes, without ground for doubt, that he remembers being F, then it is beyond a reasonable doubt for S that he remembers that he was F. (Chisholm 1977 pg. 81).
Although Chisholm explained the justification of some empirical beliefs, the epistemic principles presented thus far do not account for knowledge of ordinary common-sense propositions (for example, ‘There is a cat on the roof’). The epistemic principles proposed to this point account for these propositions being acceptable or beyond reasonable doubt, but do not account for their being evident, that is, justified to the level required for knowledge.
Chisholm appeals to the coherence of propositions about memory and perception, that is, their being mutually confirming and concurring, to explain how beliefs can be justified to the level required for knowledge. A proposition justified to a certain level is also justified to all lower levels; for example, an evident proposition is also beyond a reasonable doubt, acceptable, and has some presumption in its favor. Some propositions justified according to the epistemic principles of perception and memory are evident, others beyond reasonable doubt, and others merely acceptable, but all propositions justified by these principles are at least acceptable.
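Restricted to the levels that figure in this section (Chisholm’s full hierarchy contains further levels), the ordering can be pictured as a chain of implications:

\[
\text{evident} \;\Rightarrow\; \text{beyond reasonable doubt} \;\Rightarrow\; \text{acceptable} \;\Rightarrow\; \text{has some presumption in its favor}
\]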
Chisholm proposes that if the conjunction of all propositions that are acceptable for someone tends to confirm another proposition, then this latter proposition has some presumption in its favor, that is:
(G) If the conjunction of all those propositions e, such that e is acceptable for S at t tends to confirm h, then h has some presumption in its favor for S at t. (Chisholm 1977 pp. 82-83).
He then defines the concept of a concurrent set of propositions, which may be thought of as coherence of propositions or beliefs, as follows:
D4.4 A is a set of concurrent propositions =Df A is a set of two or more propositions each of which is such that the conjunction of all the others tends to confirm it and is logically independent of it. (Chisholm 1977 pg. 83).
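In symbols (again our notation, with C as in D4.1 above):

\[
A \text{ is concurrent} \;=_{\mathrm{Df}}\; |A| \geq 2 \;\wedge\; \forall p \in A\,\Big[\, C\big(\textstyle\bigwedge(A \setminus \{p\}),\; p\big) \;\wedge\; \textstyle\bigwedge(A \setminus \{p\}) \text{ is logically independent of } p \,\Big]
\]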
He proposes the following, somewhat bold, epistemic principle describing how concurring propositions raise their level of epistemic justification:
(H) Any set of concurring propositions, each of which has some presumption in its favor for S, is such that each of its members is beyond reasonable doubt for S.
To explain how this principle works, Chisholm asks us to consider the following propositions:
(1) There is a cat on the roof today;
(2) There was a cat on the roof yesterday;
(3) There was a cat on the roof the day before yesterday;
(4) There is a cat on the roof almost every day.
He asks us to suppose that (1) expresses a perceptual belief and, therefore, is beyond reasonable doubt; (2) and (3) express what I seem to remember perceiving, and thus, are acceptable; and (4) is confirmed by the conjunction of these acceptable statements. As this set of propositions is concurrent, each is beyond reasonable doubt (according to (H)).
Chisholm completes his theory of evidence by accounting for knowledge gained via perception with the following principle:
(I) If S believes, without ground for doubt, that he perceives something to be F, and if the proposition that there is something F is a member of a set of concurrent propositions each of which is beyond reasonable doubt for S, then it is evident for S that he perceives something to be F. (Chisholm 1977 pg. 84).
This completes the account that Chisholm promised of how the Indirectly Evident may be justified in three ways. The first is, at least partially, by its relationship to the foundation: the epistemic principles accounting for the justification of propositions based on perception and memory are partially dependent on the self-presenting states of being appeared to in a certain way, that is, seeming to perceive and seeming to remember perceiving. The second is by their own nature: perception and memory provide prima facie evidence for believing that things are as they represent them to be. The third is by the coherence, that is, the confirmation and concurrence, of propositions that are individually justified to a level lower than that required for knowledge. Chisholm’s account is thus not only a version of foundationalism but also incorporates elements of the coherence theory.
Chisholm admits that these epistemic principles provide, at best, an outline of a full-blown account of our knowledge of the external world, and a very rough one at that. They do not give an account of our knowledge of many of the very common things that we know about the past and about the complex world that we live in. However, this was all that Chisholm promised to provide. Moreover, his account remains one of the most complete and comprehensive accounts of empirical knowledge yet developed.
7. Postscript
This presentation of Chisholm’s epistemology is largely based on the version of his theory from the second edition of his book Theory of Knowledge (Chisholm 1977). Chisholm continuously revised and improved his theory based on counterexamples and objections raised by his colleagues and students. His subsequent works provided his answers to the Gettier Problem, as well as a more detailed account of the epistemic principles accounting for the Indirectly Evident. In 1989 Chisholm published his third, and what became his final, edition of his Theory of Knowledge. This article is intended to introduce Chisholm’s theory of knowledge to lay the groundwork for the reader to undertake a detailed examination of the final version of his theory. It is left as an exercise to the reader to decide whether Chisholm’s principles do what he intends them to do.
8. References and Further Reading
Ayer, A. J., 1940. The Foundations of Empirical Knowledge, New York, NY: St. Martin’s Press.
Berkeley, George, 1710. The Principles of Human Knowledge, (http://www.earlymoderntexts.com/assets/pdfs/berkeley1710.pdf)
Chisholm, Roderick, 1942. “The Problem of the Speckled Hen,” Mind 51(204): 368-373.
Chisholm, Roderick, 1946. “Theory of Knowledge” (monograph), in Roderick M. Chisholm, Herbert Feigl, William K. Frankena, John Passmore, and Manley Thompson (eds.), Philosophy: The Princeton Studies: Humanistic Scholarship in America, 233-344; reprinted in Chisholm 1982.
Chisholm, Roderick, 1948. “The Problem of Empiricism,” The Journal of Philosophy 45, 512-517.
Chisholm, Roderick, 1957. Perceiving: A Philosophical Study, Ithaca: Cornell University Press.
Chisholm, Roderick, 1965. “‘Appear’, ‘Take’, and ‘Know’”, reprinted in Robert J. Swartz (ed.), Perceiving, Sensing, and Knowing, (University of California, Berkeley, California, 1965).
Chisholm, Roderick, 1966. Theory of Knowledge, Englewood Cliffs, NJ: Prentice-Hall.
Chisholm, Roderick, 1977. Theory of Knowledge, 2nd edition. Englewood Cliffs, NJ: Prentice-Hall.
Chisholm, Roderick, 1979. “The Directly Evident,” in George Pappas (ed.), Justification and Knowledge, Dordrecht, D. Reidel Publishing Co.
Chisholm, Roderick, 1982. The Foundations of Knowing, Minneapolis: University of Minnesota Press.
Chisholm, Roderick, 1989. Theory of Knowledge, 3rd edition. Englewood Cliffs, NJ: Prentice-Hall.
Chudnoff, Elijah, 2021. Forming Impressions: Expertise in Perception and Intuition, Oxford, Oxford University Press.
Clifford, W. K., 1877. “The Ethics of Belief”, Contemporary Review (177), reprinted in Clifford’s Lectures and Essays (London, MacMillan, 1879).
Conee, Earl and Feldman, Richard, 2011. Evidentialism: Essays in Epistemology, Oxford, Oxford University Press.
Descartes, Rene, 1641. Meditations on First Philosophy (Third Edition), translated by Donald A. Cress, Indianapolis, IN: Hackett Publishing Company, 1993.
Feldman, Richard, 1974. “An Alleged Defect in Gettier-Counterexamples,” Australasian Journal of Philosophy 52(1), 68-69.
Feldman, Richard, 2003. Epistemology, Upper Saddle River, NJ, Prentice Hall.
Foley, R., 1997. “Chisholm’s Epistemic Principles,” in Hahn 1997, pp. 241–264.
Goldman, Alvin, 1967. “A Causal Theory of Knowing.” Journal of Philosophy 64, 357-372.
Hahn, Lewis Edwin (ed.), 1997. The Philosophy of Roderick M. Chisholm (The Library of Living Philosophers: Volume 25), Chicago and La Salle: Open Court.
Hume, David, A Treatise of Human Nature, Oxford: Clarendon Press, 1973.
Hume, David, Enquiries Concerning Human Understanding and Concerning the Principles of Morals (3rd Edition), Oxford: Clarendon Press, 1976.
James, William, 1896. “The Will to Believe”, in The Will to Believe, and Other Essays in Popular Philosophy, and Human Immortality, New York, Dover Publications, 1960.
Kyburg, Henry E. Jr., 1970. “On a Certain Form of Philosophical Argument,” American Philosophical Quarterly, 7, 229-237.
Legum, Richard A. 1980. “Probability and Foundationalism: Another Look at the Lewis-Reichenbach Debate.” Philosophical Studies 38(4), 419–425.
Lehrer, Keith, 1997. “The Quest for the Evident,” in Hahn 1997, pp. 387-401.
Lewis, C.I., 1929. Mind and the World Order. New York, Dover Publications.
Lewis, C.I., 1946. An Analysis of Knowledge and Valuation. La Salle, IL, Open Court.
Lewis, C.I., 1952. “The Given Element in Empirical Knowledge.” The Philosophical Review 61, 168-175.
Locke, John, 1690. An Essay Concerning Human Understanding, edited by Peter H. Nidditch, Oxford: Clarendon Press, 1979.
Mavrodes, George, 1973. “James and Clifford on ‘The Will to Believe’,” in Keith Yandell (ed.), God, Man, and Religion, New York: McGraw-Hill, 1973.
Pollock, J. and Cruz, J., 1999. Contemporary Theories of Knowledge, 2nd edition. New York: Rowman & Littlefield.
Pryor, J. 2001. “Highlights of Recent Epistemology,” The British Journal for the Philosophy of Science 52, 95-124. (Stresses that modest foundationalism looks better in 2001 than it looked ca. 1976).
Quine, W.V.O., 1951. “Two Dogmas of Empiricism.” The Philosophical Review 60, 20-43.
Reichenbach, Hans, 1952. “Are Phenomenal Reports Absolutely Certain?” The Philosophical Review 61, 168-175.
Russell, Bertrand, 1912. The Problems of Philosophy, Indianapolis: Hackett Publishing Company.
Sellars, Wilfrid, 1956. “Empiricism and the Philosophy of Mind,” in H. Feigl and M. Scriven (eds.), Minnesota Studies in the Philosophy of Science, Vol. I, Minneapolis: University of Minnesota Press, pp. 253-329.
Sosa, Ernest, 1980. “The Raft and The Pyramid: Coherence versus Foundations in the Theory of Knowledge,” in French, Uehling, and Wettstein (eds.), Midwest Studies in Philosophy, Volume V, Studies in Epistemology, (University of Minnesota Press, Minneapolis, 1980).
Sosa, Ernest, 1997. “Chisholm’s Epistemology and Epistemic Internalism,” in Hahn 1997, pp. 267-287.
van Cleve, James, 1977. “Probability and Certainty: A Reexamination of the Lewis-Reichenbach Debate,” Philosophical Studies 32(4), 323-34.
van Cleve, James. 2005. “Why Coherence is Not Enough: A Defense of Moderate Foundationalism,” in Contemporary Debates in Epistemology, edited by Matthias Steup and Ernest Sosa. Oxford, Blackwell, pp. 168-80.
Vogel, Jonathan. 1990. “Cartesian Skepticism and Inference to the Best Explanation.” The Journal of Philosophy 87, 658-666.
Communication is crucial for us as human beings. Much of what we know or believe we learn through hearing or seeing what others say or express, and part of what makes us human is our desire to communicate our thoughts and feelings to others. A core part of our communicative activity concerns linguistic communication, where we use the words and sentences of natural languages to communicate our ideas. But what exactly is going on in linguistic communication and what is the relationship between what we say and what we think? This article explores these issues.
A natural starting point is to hold that we use words and sentences to express what we intend to convey to our hearers. In this way, meaning seems to be linked to a speaker’s mental states (specifically to intentions). Given that this idea is at the heart of Paul Grice’s hugely influential theory of meaning and communication, this article begins by spelling out in detail how Grice makes the connection between communication and thought in section §1. The Intentionalist approach exemplified by Grice’s model has been endorsed by many theorists, and it has provided a very successful paradigm for empirical research; however, it is not without its problems. Section §2 surveys a number of problems faced by Grice’s specific account, and §3 considers challenges to the core Intentionalist claim itself, namely, that meaning and communication depend on the intentions of the speaker. Given these concerns, the article closes in section §4 with a sketch of two alternative approaches: one which looks to the function expressions play (teleology), and one which replaces the Intentionalist appeal to mental states with a focus on the social and normative dimensions of language and communication.
1. The Intentionalist Stance: Grice’s Theory of Meaning and Communication
a. Grice’s Theory of Meaning
Paul Grice’s seminal work has had a lasting influence on philosophy and has inspired research in a variety of other disciplines, most notably linguistics and psychology. His approach to meaning and communication exemplifies a general thesis which came to be called ‘Intentionalism’. It holds that what we mean and communicate is fixed by what we intend to convey. This idea is intuitively compelling but turns out to be hard to spell out in detail; this section thus offers a fairly detailed account of Grice’s view, divided into two subsections. §1.a summarises the core claims and concepts of Grice’s theory of meaning as well as Grice’s proposed definition of communication. §1.b. briefly summarises a different strand of Grice’s theory, namely, his theory of conversation, which concerns the role that the assumptions of cooperation and rationality play in communication. In setting out the different elements of Grice’s theory, the distinctions between his definition of communication and his theory of conversation become clear. References to other important Intentionalist accounts are provided throughout the article as well.
i. Natural and Non-Natural Meaning
Grice (1957/1989, pp. 213-215) starts by distinguishing two different kinds of meaning: natural and non-natural. The former occurs when a given state of affairs or property naturally indicates a further state of affairs or property, where “naturally indicates” entails standing in some kind of causal relation. So, for instance, we might say “Smoke means fire” or “Those spots mean measles”. This kind of natural meaning relation is very different, however, from non-natural meaning. For non-natural meaning, the relationship between the sign and what is signified is not straightforwardly causal. Examples of non-natural meaning include:
(1) “Three rings on the bell mean the bus will stop”;
(2) “By pointing at the chair, she meant you should sit down”;
(3) “In saying ‘Can you pass the salt?’ he meant Pass the salt”.
It is non-natural meaning (of which linguistic meaning forms a central case) that is the main explanatory target of Grice’s theory of meaning and it is therefore also the focus of this article.
In order to provide a systematic theory of non-natural meaning, Grice distinguishes two main types of non-natural meaning: speaker-meaning and sentence-meaning (this has come to be used as the standard terminology in the literature, though it should be noted that Grice himself often preferred a different terminology and made more fine-grained distinctions). As the terms suggest, speaker-meaning is what speakers mean when uttering a sentence (or using a type of gesture, and so forth) on a particular occasion, whereas sentence-meaning is what sentences mean, where ‘sentences’ are understood as the abstract entities that can be used by speakers on different occasions. Importantly, what a speaker means when using a sentence on a particular occasion need not correspond to its sentence-meaning. For instance, in (3), it seems the speaker means something (pass the salt) which differs slightly from the sentence-meaning (can you pass the salt?). This point is taken up again in §1.b.
Having distinguished speaker-meaning from sentence-meaning, Grice (1957/1989) advances his central claim that speaker-meaning grounds and explains sentence-meaning, so that we can derive sentence-meaning from the more basic notion of speaker-meaning. This claim is crucial to Grice’s overarching aim of providing a naturalistically respectable account of non-natural meaning. That is to say, Grice’s aim is to give an account of non-natural meaning which locates it firmly in the natural world, where it can be explained without appeal to any strange abstract entities (see the article on Naturalism for further details on naturalistic explanations). To do this, Grice was convinced that gestural and conventional meaning needed to be reduced to claims about psychological content—the things that individual gesturers and speakers think and intend. In order to explain how this reduction comes about, Grice’s analysis of speaker-meaning is considered in §1.a.ii, and Grice’s explanation of sentence-meaning in §1.a.iii.
ii. Speaker-Meaning and Intention
According to Grice, speaker-meaning is to be explained in psychological terms, more specifically in terms of a speaker’s intentions. Grice’s analysis thus lines up with the intuition that what a speaker means when producing an utterance is what she intends to get across. Starting with this intuition, one might think that speaker-meaning simply occurs when a speaker intends:
(i) to produce an effect in the addressee.
For instance, imagine that Anne says, “I know where the keys are”. We might think that Anne means that Anne knows where the keys are by this utterance because this is the belief she intends her audience to form (that is, the effect she intends to have on her audience is that they form that belief). However, Grice (1957/1989, p. 217) argues that this condition is not sufficient. To see this, imagine that Jones leaves Smith’s handkerchief at a crime scene in order to deceive the investigating detective into believing that Smith was the murderer. Jones intends the detective to form a belief (and so satisfies condition (i)), but it seems incorrect to say that Jones means that Smith was the murderer by leaving the handkerchief at the crime scene. In this situation, it does not seem right to think of Jones as non-naturally meaning or communicating anything by leaving the handkerchief (this intuition is reinforced by recognising that if the detective knew the origin of the sign—that it had been left by Jones—that would fundamentally change the belief the detective would form, suggesting that the belief the detective does form is not one that has been communicated by Jones).
To address this worry, Grice suggests adding a second condition to the definition of speaker-meaning, namely that the speaker also intends:
(ii) that the addressee recognises the speaker’s intention to produce this effect.
This condition avoids our problem with Jones and the handkerchief, requiring the intentions involved in communication to be overt and not hidden, in the sense that the speaker must intend the addressee to recognise the speaker’s communicative aim. Although Grice (1957/1989, pp. 218-219) believes that these two conditions are indeed necessary for speaker-meaning, he argues that one more condition is required for a sufficient analysis. This is because of cases such as the following. Imagine that Andy shows Bob a photograph of Clyde displaying undue familiarity with Bob’s wife. According to Grice, we would not want to say in such a case that Andy means that Bob’s wife is unfaithful, although Andy might well fulfil our two conditions (he might intend Bob to form such a belief and also intend Bob to recognise his intention that Bob forms such a belief). The reason this does not count as a genuine case of communication, Grice claims, is that Bob would have acquired the belief that his wife is unfaithful just by looking at the photo. Andy’s intentions do not then, Grice contends, stand in the right relation to Bob’s belief, because Bob coming to have the belief in question is independent of Andy’s intentions. So, according to Grice, the speaker must not only intend to produce an effect (condition (i)) and intend this intention to be recognised (condition (ii)) but the recognition of (ii) must also play a role in the production of the effect.
Grice holds that these three conditions are necessary and jointly sufficient for a speaker to mean something by an utterance. His account of speaker-meaning (for assertions) can thus be summarised as follows:
A speaker, S, speaker-means that p by some utterance u if and only if for some audience, A, S intends that:
(a) by uttering u, S induces the belief that p in A;
(b) A should recognise that (a);
(c) A’s recognition that (a) should be the reason for A forming the belief that p.
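The analysis can be compressed into a single schema (the notation is ours, not Grice’s): write Int_S for ‘S intends that’, Bel_A(p) for ‘A believes that p’, and Rec_A for ‘A recognises that’. Then:

\[
S \text{ speaker-means that } p \text{ by } u \;\leftrightarrow\; \exists A\;\mathrm{Int}_S\Big[\,\mathrm{Bel}_A(p) \;\wedge\; \mathrm{Rec}_A\big(\mathrm{Int}_S(\mathrm{Bel}_A(p))\big) \;\wedge\; \mathrm{Rec}_A\big(\mathrm{Int}_S(\mathrm{Bel}_A(p))\big) \text{ is } A\text{'s reason for } \mathrm{Bel}_A(p)\,\Big]
\]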
Grice further specifies that what the speaker means is determined, that is, established, by the intended effect. Conditions (a)-(c) define what is needed for an assertion, as the intended effect is for the audience to form a certain belief—in the example above, for instance, Anne intends her audience to form the belief that Anne knows where the keys are. On the other hand, in cases where the speaker means to direct the behaviour of the addressee, Grice claims that the intended effect specified in (a)-(c) should instead be that the audience performs a certain action—for example, a person pointing at a chair means the addressee should sit down.
iii. Speaker-Meaning as Conceptually Prior to Sentence-Meaning
As noted above, Grice held that speaker-meaning is more basic than sentence-meaning. Note that Grice’s analysis of speaker-meaning does not contain any reference to sentence-meaning. The analysis refers only to utterances and although these can be productions of a sentence or of some other abstract entity (such as a recurrently used type of gesture) this does not have to be the case. To illustrate this, consider an example from Dan Sperber and Deirdre Wilson (1995, pp. 25-26) in which a speaker responds to the question “How are you feeling?” by pulling a bottle of aspirin out of her handbag. Here, the speaker means that she is not feeling well but she does not produce anything appropriately considered an utterance of a sentence or some other type of behaviour that has an established or conventionalised meaning. This illustrates that speaker-meaning can be explained in psychological terms without reference to sentence-meaning.
Sentence-meaning, on the other hand, does (according to Grice) require appeal to speaker-meaning. The idea here is that the sentences of a language do not have their meaning in virtue of some mind-independent property but in virtue of what speakers of the language do with them. Hence, Grice claims that “what sentences mean is what (standardly) users of such sentences mean by them; that is to say what psychological attitudes toward what propositional objects such users standardly intend […] to produce by their utterance.” (Grice, 1989, p. 350). For example, the sentence “snow is white” means snow is white because this is what speakers usually intend to get across by uttering this sentence. In this way, Grice explains non-natural meaning in purely psychological terms.
One reservation one might have about this approach is that it looks like Grice gets the relationship between speaker-meaning and sentence-meaning ‘backwards’ because sentence-meaning seems to play an important role in how speakers get across what they intend to convey. For example, Anne’s intention to convey that she knows where the keys are by uttering “I know where the keys are” seems to depend on the meaning of the sentence. So how can speaker-meaning be conceptually prior to sentence-meaning? This worry can be addressed, however, by highlighting a distinction between constitutive and epistemic aspects of speaker-meaning. On Grice’s account, what a speaker means is constituted by the kind of complex intention described in (a)-(c) above. However, speakers can nevertheless exploit an established sentence-meaning when they try to get their audiences to recognise what they mean (Grice, 1969/1989, p. 101). The idea here is that an efficient way for a speaker such as Anne to get across that she knows where the keys are will simply be to say, “I know where the keys are”. Although what she means is determined by her intention, the use of the respective sentence allows her to convey this intention to her audience in a straightforward way. At the same time, an established use of a sentence will also inform (or constrain) the formation of communicative intentions on the side of the speaker. For example, a speaker who knows the meaning of the sentence “I know where the keys are” will not utter it intending to convey that ostriches are flightless birds, say, because—in the absence of a very special context—it would be irrational for the speaker to think that this intention would be correctly recognised by the addressee. Importantly, this does not entail that speaker-meaning cannot diverge from sentence-meaning at all and, as is discussed in §1.b, in fact, such divergences are not uncommon.
This section concludes with a note on communication. As has been noted by Intentionalists such as Peter Strawson, for Grice the kind of complex intention that he described seems to play two roles. First, it is supposed to provide an analysis of what it takes for a speaker to mean something. In addition, however, it is also “undoubtedly offered as an analysis of a situation in which one person is trying, in a sense of the word ‘communicate’ fundamental to any theory of meaning, to communicate with another.” (1964, p. 446). Put somewhat differently, on Grice’s account, for an agent to communicate something, she will need to mean what she tries to get across, and this meaning is analysed in terms of the complex intention that Grice provided as an analysis of speaker-meaning. The communicative attempt will be successful—that is, communication will occur—if and only if this intention is recognised by the addressee. For this reason, such intentions have come to be called “communicative intentions” in the literature following Grice (he himself preferred to speak of “m-intentions”).
b. Grice’s Theory of Conversation
The basis of Grice’s theory of conversation lies in his distinction between what is said and what is implicated by a speaker, both of which are part of the content that a speaker communicates (and therefore part of the speaker’s communicative intention). Roughly speaking, what is said lines up with sentence-meaning, with adjustments being made for context-sensitive expressions, such as “I” or “tomorrow”, and ambiguous terms (such as “bank” or “crane”; see the article on Meaning and Context-Sensitivity). What is implicated, by contrast, is what a speaker communicates without explicitly saying it. To illustrate this distinction, consider the following examples:
(4) Alice: Do you want to go to the cinema?
Bill: I have to work.
(5) Professor: Some students have passed the exam.
In an exchange such as (4), it is clear that Bill does not merely communicate what he literally says, which is that he has to work. He also communicates that he cannot go to the cinema. Similarly, in (5) the professor does not merely communicate that some students have passed the exam (which is what she explicitly says) but that not all students have passed. Such implicitly or indirectly communicated contents are what Grice calls “implicatures” and his theory of conversation attempts to explain how they are conveyed.
At the basis of his explanation lies a crucial assumption about communication, namely that communication is a rational and cooperative practice. The main idea is that communicative interactions usually serve some mutually recognised purpose—such as the exchange of information—and that participants expect everyone to work together in pursuing this purpose. This assumption is captured in Grice’s Cooperative Principle:
Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged. (1975/1989, p. 26)
Grice further specifies what communicative cooperation involves by distinguishing four central conversational categories: quantity, quality, relation, and manner. All categories come with one or more conversational maxims that are supposed to be observed by cooperative communicators. Roughly, maxims under the category of quantity require interlocutors to provide the right amount of information, whereas quality maxims require speakers to provide only information that they believe to be true (or for which they have sufficient evidence). The only maxim under the category of relation is that interlocutors must make relevant contributions. Finally, maxims of manner require that interlocutors’ contributions are perspicuous.
How do the Cooperative Principle and the associated maxims help to explain how implicatures are communicated? Consider again the exchange in (4). In order to figure out what Bill implicated when he said, “I have to work”, Alice can reason as follows: I assume that Bill is being cooperative, but what he said, on its own, is not a relevant response to my question. I should infer, then, that Bill intended to communicate something in addition to what he merely said, something that does provide a relevant response to my question. This further information is probably that he cannot come to the cinema because he will be at work. A similar explanatory method is available for (5).
More generally, Grice (1975/1989, p. 31) proposes the following as a general pattern for inferring implicatures:
S has said that p;
there is no reason to suppose that S is not observing the maxims, or at least the Cooperative Principle;
S could not be doing this unless he thought that q;
S knows […] that I can see that the supposition that S thinks that q is required;
S has done nothing to stop me thinking that q;
S intends me to think […] that q;
and so S has implicated that q.
Notice that this pattern is not supposed to be deductively valid but rather to provide a pattern for an inference to the best explanation. In the first two decades of the twenty-first century it became common for theorists to hold that inferences of this kind not only play a role in the process of interpreting implicatures but also in many other types of linguistic phenomena, such as resolving ambiguous and vague terms. Because such inferences rely not just on sentence-meaning but also on considerations of cooperation, context, and speaker intention, they are often referred to as “pragmatic inferences” or instances of “pragmatic reasoning”.
2. Problems with Grice’s Theory of Meaning
Grice’s theory of meaning has been subject to numerous challenges. This section briefly outlines the standard objections and explains how Grice and other Intentionalists have tried to deal with them. A further problem for Grice’s account, concerning its psychological plausibility, is considered separately in §3.
a. Problems with the First Condition of Grice’s Analysis
As noted, the first condition of Grice’s analysis states that in order to communicate, a speaker must have the intention to bring about an effect in her audience. There are two main types of objection to this claim: first, there are cases in which the intended effects are not the ones that Grice claims they are and, second, there are cases where speakers do not intend to have any effect on an audience but in which something nonetheless seems to be communicated.
To appreciate the first worry, it is useful to focus on assertions, where Grice claims that the intended effect is that the audience forms a belief. Reminders, however, seem to be a counterexample to this claim. For instance, imagine a scenario in which the addressee knows that a woman’s name is ‘Rose’ but cannot recall this piece of information in a particular situation. In such a case, the speaker might hold up a rose or simply say the following to remind the addressee:
(6) Her name is Rose.
It seems that the speaker does not want to make the addressee believe that the woman’s name is Rose because the addressee already believes this. It is equally clear, however, that the speaker means or communicates that the woman’s name is Rose. A second type of counterexample concerns examination cases. Imagine that a teacher asks a student when the Battle of Waterloo was fought to which the student replies:
(7) The Battle of Waterloo was fought in 1815.
Again, the student is not intending that the teacher forms the respective belief because the teacher already knows when the Battle of Waterloo occurred. But the student still clearly communicates that the Battle of Waterloo was fought in 1815.
Turning to the second objection, some cases suggest that a speaker need not have any audience-directed intention whatsoever. For instance, sometimes a speaker lacks an audience, such as when writing in a personal diary or practicing a speech in an empty room. However, we still want to say that the speaker means something in such cases. In other cases, speakers do have an audience, but they do not intend their utterances to have any effect on the audience. For example, imagine that an employee at a train station is reading out a list of departures over the loudspeaker. They utter:
(8) The next London train departs at 4 pm.
It is possible that the employee does not care in the least if—and therefore also does not intend that—any passenger thereby comes to believe that the train leaves at 4 pm. Rather the employee is just complying with the requirements of her job. Nonetheless, it seems highly counterintuitive to claim that the employee does not communicate that the train departs at 4 pm. These are just a few illustrations of the numerous examples that, theorists have argued, challenge the fundamental Intentionalist claim that communication requires intentions to cause certain effects in an audience.
Grice (1989) and other scholars who are sympathetic to his theory (for example, Neale 1992, Schiffer 1972) have responded to such examples in numerous ways. A first strategy is to try to accommodate the problem cases by modifying Grice’s original analysis. For example, Grice suggests that reminders could be dealt with by specifying that the intended effect for assertions is not that a belief is formed but that an activated belief is formed (1969/1989, p. 109). Or again, to deal with cases such as examinations Grice proposes further modifying the first condition so that it requires not that the addressee forms an activated belief, but rather that the addressee forms the activated belief that the speaker has the respective belief (1969/1989, pp. 110-111). Hence, in the examination case, the student would intend his utterance to have the effect that the teacher forms the activated belief that the student believes that the Battle of Waterloo was fought in 1815. To deal with other counterexamples of this kind, Grice (1969/1989) proposed further (and increasingly complex) refinements of his original analysis.
Another strategy to deal with alleged counterexamples is to argue that they are in fact compatible with Grice’s original analysis. For example, in cases in which no audience is directly present one might argue that the utterances are still made with the intention of having an effect on a future or possible audience (Grice, 1969/1989, pp. 112-115; Schiffer, 1972, pp. 73-79). Finally, a third strategy is to argue that the definitions of meaning provided by Grice and his followers capture speaker-meaning in its primary sense and that the counterexamples only involve cases of meaning in “an extended or attenuated sense, one derived from and dependent upon the primary sense” (Schiffer, 1972, p. 71). The idea seems to be that the counterexamples are in a sense parasitic upon more standard cases of meaning. For example, it might be argued that an utterance such as (8) is a case of meaning in an extended sense. The utterance can be taken to mean that the next London train departs at 4 pm because this is what speakers usually mean by uttering this sentence (standard meaning), and this is captured by the Gricean analysis.
However, counterexamples to Grice and his followers have been numerous and varied and not everybody has been convinced that the responses proposed can successfully deal with them all. For instance, William Alston (2000, pp. 45-50) has presented a critical examination of Intentionalist defences and pointed out—among other things—that it is far from clear that examples such as (8) can be treated as extended cases of meaning. Further, Alston asks the more general question of whether such a treatment would be attractive from a methodological point of view. In the face of these challenges, then, it remains an open question whether Grice’s fundamental claim—that communication necessarily involves intentions to cause effects on one’s audience (captured in the first clause of his analysis)—can be defended.
b. Problems with the Third Condition of Grice’s Analysis
A less fundamental but nonetheless important worry concerns the third condition of Grice’s analysis, according to which recognition of the speaker’s intention must be (at least part of) the reason that the intended effect comes about. As noted, Grice introduced this condition to deal with cases in which the first two conditions are fulfilled but where there is no relation between the two (because the speaker presents evidence that is already sufficient to bring about the intended effect, as in the photograph case discussed in §1.a.ii).
Theorists have worried that this condition might be too strict because it excludes cases that do intuitively involve communication. A first concern is that intuitions are far from clear even for Grice’s own photograph example. Is Grice right to hold that Andy failed to mean that Bob’s wife is unfaithful when showing the photograph to Bob? In addition, however, there are also counterexamples in which the third condition is not fulfilled but where there is a rather clear intuition that the speaker communicated something. Consider reminders again. It seems that reminders might pose a problem not only for Grice’s first condition but also for his third, because they usually have their effects by prompting the addressee to remember something rather than because the addressee recognises that the speaker intended this effect. For example, a speaker who utters (6) to remind the addressee that a certain woman’s name is ‘Rose’ might well intend the addressee to remember this not because the addressee recognises this intention but because the utterance itself (or the sight of a rose) is sufficient to prompt the addressee’s memory. Nonetheless, one might still want to allow that by producing the utterance the speaker meant and communicated that the woman’s name is ‘Rose’.
Such examples undermine the necessity of Grice’s third condition and, despite his insistence to the contrary (1969/1989, p. 109), several Intentionalists have suggested that it might be dropped or at least weakened (Neale, 1992, pp. 547-549; Schiffer, 1972, pp. 43-48; Sperber and Wilson, 1995, pp. 29, 50-54). One such weakening strategy is considered in §3.a.ii, when the main tenets of Relevance Theory are discussed.
Another strategy that has been proposed is to move the third clause outside the scope of the communicator’s intentions (Moore, 2017a). The idea is that, in general, communicative acts are efficacious when they are properly addressed, and the recipient produces the intended response partly because she recognises that she is being addressed. However, the communicator need not intend that the recipient recognises the overtness of a communicative act as a reason to produce the intended response. This strategy is part of a wider attempt at de-intellectualising the cognitive and motivational requirements for engaging in Gricean communication (as §3.b shows).
c. The Insufficiency of Grice’s Analysis
Although the first two objections claim that certain aspects of Grice’s analysis are not necessary, one can also object by claiming that Grice’s conditions are insufficient to account for communication. This objection, first raised by Strawson (1964, pp. 446-447) and further developed, among others, by Schiffer (1972, pp. 17-27), maintains that there are cases in which all three of Grice’s conditions are satisfied but which do not count as cases of communication. These examples are complicated, but Coady (1976, p. 104) nicely summarises the clearest one:
The most intelligible of such examples is one we owe to Dennis Stampe (although it is not cited by Schiffer) in which a man playing bridge against his boss, and anxious to curry favour, wants his boss to win and to know that the employee wants him to win. He has reason to believe that the boss will be pleased to see that the employee wants him to win but displeased at anything as crude as a signal or other explicit communication to the effect that now he has a good hand. Hence, when he gets a good hand the employee smiles in a way that is rather like but just a bit different from a spontaneous smile of pleasure. He intends the boss to detect the difference and argue (as Grice puts it): ‘That was not a genuine give-away smile, but the simulation of such a smile. That sort of simulation might be a bluff (on a weak hand), but this is bridge, not poker, and he would not want to get the better of me, his boss, by such an impropriety. So probably he has a good hand, and, wanting me to win, he hoped I would learn that he has a good hand by taking his smile as a spontaneous give-away. That being so, I shall not raise my partner’s bid.’
What cases of this kind suggest, then, is that Grice’s original analysis is insufficient to ensure the overtness required for communication, that is, that the relevant intentions of speakers are transparent to their interlocutors. As noted above, Grice’s second condition was introduced to prevent deceptive intentions (as in the case of the murderer who left Smith’s handkerchief at the crime scene). However, the example of the bridge players shows that the second condition is insufficient to exclude deception at higher levels. In response to this worry, Strawson (1964, p. 447) proposed adding a fourth condition to Grice’s analysis, but Schiffer (1972, pp. 18-23) argued that in fact five conditions would be needed. However, as Schiffer himself highlights, the problem with all such moves to add additional clauses seeking to rule out certain orders of deceptive intentions is that they will always be open to the construction of ever more complex counterexamples in which all the conditions are fulfilled but where a deceptive intention at a still higher level causes problems. Hence, there is a threat of an indefinite regress of conditions.
Grice himself discusses two main strategies for responding to the concern about deceptive intentions. The first is to insist that complexity has an upper bound, so the regress stops at some point and no further conditions are needed (1969/1989, pp. 98-99). His claim is that at some point the intention that a speaker would need to have for it to constitute a further counterexample would just be too complex to count as psychologically real for the speaker or addressee. However, Grice himself (1969/1989, p. 99) and other Intentionalists (for example, Schiffer 1972, pp. 24-26) raised doubts about this response, objecting both that it fails to specify exactly where the cut-off point is and that it fails to provide the philosophically rigorous analysis of communication that Grice set out to deliver (since it fixes the cut-off point on the basis of contingent facts about the cognitive capabilities of current interlocutors rather than on the basis of the nature of communication).
The second strategy—the one that Grice prefers—is simply to rule out deceptive or hidden intentions (1969/1989, pp. 99-100). However, as Grice (1982/1989, p. 303) realises, there seems to be a worry that the introduction of these conditions is ad hoc and, related to this, other theorists such as Kent Bach and Robert Harnish (1979, p. 153) have wondered why it would be appropriate to introduce a condition against these complex forms of deception but not against simple forms of deception such as lying. Further, Schiffer (1972, p. 26) claims that Grice’s condition might be incapable of accounting for some of the more intricate counterexamples that he constructs. Despite these worries, Grice (1982/1989, pp. 302-303) and some other theorists such as Neale (1992, p. 550) have maintained that a condition against hidden intentions provides the best remedy for deception cases.
Schiffer himself proposes dealing with deceptive cases by appealing to what he calls “mutual knowledge” (1972, p. 30). (David Lewis (1969) is generally credited with introducing the related notion of common knowledge; Schiffer’s notion of mutual knowledge is also clearly related to other so-called ‘common ground’ views, such as those of Stalnaker (2002; 2014), Bach and Harnish (1979), and Clark (1992; 1996).) Roughly, two or more people have mutual knowledge of some state of affairs if they all know that the state of affairs obtains, know that the others know that the state of affairs obtains, know that the others know that they know that the state of affairs obtains, and so on indefinitely. For example, two people who are facing each other while sitting around a table with a candle on it will have mutual knowledge that there is a candle on the table because each of them knows that there is a candle on the table, knows that the other one knows that there is a candle on the table, knows that the other one knows that she knows that there is a candle on the table, and so on. Schiffer’s (1972, p. 39) proposal is then to build into Grice’s analysis of speaker-meaning a condition that requires a speaker to be intending to bring about a state of affairs that makes it mutual knowledge that the speaker is intending to cause an effect in the audience by means of the audience’s recognition of this intention. In other words, for Schiffer, communication consists in contributing to the set of mutually known things by making one’s intention known to one’s audience. This, according to Schiffer, handles counterexamples that involve deception because in these cases the interlocutors lack mutual knowledge of the speaker’s intentions. For instance, in the example above, it is not mutual knowledge that the employee intends the boss to believe that he has a good hand by means of the boss’ recognition of this intention because the boss does not know that the employee intends his fake smile to be recognised as such.
Now, it is an essential feature of mutual knowledge that it involves an endless series of knowledge states, and one might wonder why a condition that appeals to mutual knowledge does not itself invite a problematic regress; Grice (1982/1989, p. 299) himself expressed such a worry about Schiffer’s account. Responding to this, Schiffer claims that the infinite series of knowledge states required for mutual knowledge is a “harmless regress” (1972, p. 30) because such an infinite series is also required for individual knowledge. For example, it might be thought that if a person knows that Berlin is in Germany, she will also know that she knows that Berlin is in Germany, know that she knows that she knows that Berlin is in Germany, and so on. Hence, he argues that appeals to mutual knowledge do not create any special problem.
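One way to see why the regress might be thought harmless is that an endless series of knowledge states can be generated by a single finite rule, so that only the rule, and never the completed series, needs to be represented. The following toy sketch (a hypothetical illustration in Python, not Schiffer’s own formalism) makes the point:

```python
# A minimal sketch (not Schiffer's formalism): the endless series of knowledge
# states can be produced on demand by one finite rule, so a cognitively limited
# agent need only store the rule, never the completed infinite series.

def knowledge_levels(agents, proposition):
    """Lazily yield successive levels of the mutual-knowledge hierarchy."""
    level = proposition
    while True:
        level = f"everyone in {agents} knows that ({level})"
        yield level

gen = knowledge_levels("A and B", "there is a candle on the table")
for _ in range(3):
    print(next(gen))
# Any further level is derivable on demand, but no level beyond those actually
# inferred is ever explicitly represented: a disposition, not an infinite store.
```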
The literature on mutual knowledge contains several attempts at spelling out Schiffer’s insight that the notion of mutual knowledge, understood in terms of dispositions to draw an indefinite number of inferences, is not problematic from a psychological point of view. One such attempt is discussed in §3.a.ii, where the notion of mutual manifestness is introduced. Another strategy, proposed and popularised in the first two decades of the twenty-first century, though not discussed in this article in detail, is to conceptualise mutual knowledge as a relational mental state, namely, as a mental state that two or more individuals have if there is a ternary relation that holds between them and a certain proposition (Wilby, 2010). Relational accounts of mutual knowledge, as well as of other cognate notions, have also been criticised on several fronts (see, for example, Battich and Geurts (2020)).
d. Problems with Conventional Speech Acts
Another challenge to the idea that Grice’s model is sufficient to give us a general account of communication has to do with a class of speech acts which are usually referred to as “conventional” (in the following, “speech act” is used to refer to what is usually called “illocutionary speech acts” in speech act theory; for further details, see the relevant sections of the article on John Langshaw Austin). Ever since Strawson’s (1964) introduction of Grice’s analysis to speech act theory, Intentionalists have distinguished conventional speech acts from what they deem to be communicative speech acts. According to them, communicative speech acts are speech acts that must be performed with a communicative intention and they will only be successful if the audience recognises this communicative intention. Types of speech acts that are standardly claimed to be communicative in this way are assertions and directives. Conventional speech acts, on the other hand, require neither any communicative intention on the part of the speaker nor any recognition by the audience for their successful performance. Instead, conventional speech acts depend on the existence of certain conventional or institutional background rules.
Here are two examples. First, consider checking in poker. In order to check in poker, all that one needs to do is to say “check” when it is one’s turn and certain conditions are satisfied (for example, no other player has yet made a bet in that round, and so forth). Importantly, no intention is required for a player to check, as illustrated by the fact that sometimes players manage to check by saying “check” although they intended to bet. For this and other reasons, it is also not necessary that the player’s intention to check is recognised. Second, consider pronouncing a couple husband and wife. For a priest to do so, she only needs to say “I hereby pronounce you husband and wife” at the right point during a marriage ceremony. Again, for the speech act to work (that is, for the couple to be married) it is irrelevant whether the priest has a particular intention or whether anyone recognises this intention. All that is necessary is that the speech act is performed according to the rules of the church. Of course, these are only two examples and it should be clear that there are many other conventional speech acts, including pronouncing a verdict, naming a ship, declaring war, and so on.
The problem that conventional speech acts pose for Grice’s account is that they have to be classified as non-communicative acts because for Grice speech acts are communicative only if they depend on the existence and recognition of communicative intentions. However, this result is unattractive because it seems false to claim that a speaker who checks in poker does not communicate that she is checking, or that a priest who is marrying a couple does not communicate this. Although theorists in the Gricean tradition usually recognise and accept that analysing communication in terms of certain complex intentions has this result (Bach and Harnish, 1979, p. 117), a common defence is that it does not undermine Grice’s analysis because “such speech acts as belong to highly conventionalised institutions are, from the point of view of language and communication, of marginal interest only” (Schiffer, 1972, p. 93), and so belong to the study of institutions rather than communication (Sperber and Wilson, 1995, p. 245). It is unclear, however, why their conventional or institutional nature should make conventional speech acts less significant for the study of communication or be a reason to declare them non-communicative. One might claim instead that the Gricean theory is too narrow to include conventional speech acts and therefore defective as a general theory of communication.
e. Problems with Explaining Sentence-Meaning in Terms of Speaker-Meaning
Although the objections discussed in §2.a-d challenged the necessity and sufficiency of Grice’s proposed analysis of meaning and communication, the final set of objections challenges Grice’s claim that sentence-meaning can be reduced to speaker-meaning. One of the most well-known of these objections comes from Mark Platts (1997, pp. 89-90). (Another famous objection to this claim has been presented by John Searle (1965); Searle’s objection has been addressed by Grice (1969/1989, pp. 100-105) and Schiffer (1972, pp. 27-30).) The problem with Grice’s analysis, Platts argues, is that a language allows for the formation of infinitely many sentences, most of which have never been used by any speaker, so there are no intentions that are usually associated with them. On Grice’s account, this would have the absurd consequence that most sentences lack meaning.
A possible response to this objection would be to slightly modify Grice’s account and claim that a sentence does not have meaning in virtue of the intentions that speakers actually have when using the sentence but in virtue of the intentions they would have if they were to use the sentence. However, this raises the question of why speakers would have such intentions when using the sentence, and it seems hard to explain that without making some reference to the meaning of the sentence itself. A more promising starting point in accounting for unuttered sentences is to note that sentences are complex entities that are composed of more basic elements, for example, words (see the article on Compositionality in Language). Taking this into account, a Gricean might argue as follows: there is a finite set of sentences that are regularly used by speakers and which thus have communicative intentions associated with them. Following Grice, one can claim that the meanings of this fixed set of sentences are determined by the intentions that speakers usually have when using them. But once the meaning of these sentences is fixed in this way, the meaning of the constituent words must be fixed, too (by how they are used by speakers in constructing sentences). And once the meaning of words is fixed in this way, they can be combined in novel ways to form new sentences and thereby restrict the possible intentions that speakers may have when using these new sentences and, therefore, also fix the meanings of these new sentences (such a move might be compatible with holding that speakers tacitly know a Davidsonian-style T-theory for their language, though the Intentionalist would claim that the basic meanings deployed in such a theory are generated via speaker intentions rather than via Davidson’s own notion of ‘radical interpretation’; see Davidson: Philosophy of Language). Grice (1968/1989) seems to make a similar proposal when arguing that not only sentence-meaning but also word-meaning should be explained by reference to speaker intentions (for a similar proposal, see Morris (2007, p. 262)). However, the success of this strategy will depend to a large extent on whether word-meaning can indeed be explained in terms of speaker intentions. After the first two decades of the twenty-first century, a detailed account of how exactly this might be done is yet to come.
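The shape of this compositional strategy can be pictured with a toy model. In the sketch below, the sentences, words, and meaning representations are all invented for illustration, and the ‘extraction’ of word meanings is done by hand; nothing here is Grice’s own machinery:

```python
# Toy model (all names and meaning representations invented for illustration).
# Step 1: a finite stock of sentences that speakers actually use; their
# meanings are taken to be fixed by the intentions speakers have in using them.
used_sentences = {
    ("Anna", "sleeps"): "SLEEP(anna)",
    ("Ben", "sleeps"): "SLEEP(ben)",
    ("Anna", "runs"): "RUN(anna)",
}

# Step 2: word meanings are read off the words' systematic contributions
# across the used sentences (done by hand here).
names = {"Anna": "anna", "Ben": "ben"}
verbs = {"sleeps": "SLEEP", "runs": "RUN"}

def meaning(name, verb):
    """Step 3: compose the fixed word meanings to interpret any sentence."""
    return f"{verbs[verb]}({names[name]})"

# "Ben runs" has never been uttered, so no actual speaker intention attaches
# to it, yet its meaning is now fixed compositionally:
print(meaning("Ben", "runs"))  # RUN(ben)

# Sanity check: the compositional meanings agree with the used sentences.
assert all(meaning(n, v) == m for (n, v), m in used_sentences.items())
```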
3. Are Intentionalist Approaches Psychologically Implausible?
Although §2 looked at specific objections to some of the elements of the Gricean model, an objection can also be levelled at Intentionalism more generally, concerning its psychological implausibility. This objection deserves to be taken seriously because it targets the core feature of the Intentionalist approach, namely, the overarching ambition of explaining semantic notions in psychological terms.
The objection from psychological implausibility takes two forms: first, there is a worry that the specific Intentionalist model which Grice gives us fails to fit with the cognitive processes which underlie the grasp of meaning and communication. Second, some have objected to the Intentionalist claim per se along the lines that, from a developmental and evolutionary point of view, it gets the relationship between meaning and thought the wrong way round. These challenges are explored in this section.
a. Is Grice’s Model Psychologically Implausible?
To get an intuitive sense of the concern, consider a speaker, S, who has the communicative intention of informing her hearer, H, that coffee is ready. One way of spelling out what it is for S to have an informative intention is to say that:
(i) S intends H to form a certain belief, in this case, that coffee is ready.
As already discussed, if S’s act is to count as communicative in Grice’s sense, it must be overt. So, at least (ii) must also hold:
(ii) S intends that H comes to believe that S intends to inform H.
Correlatively, understanding a communicative act seems to require the hearer to recognise the speaker’s intention as specified in (ii), the content of which comprises a complex embedding of intentions and beliefs. However, the current objection goes, in ordinary communicative exchanges speakers and hearers do not seem to go through complex inferences about each other’s mental states. Indeed, linguistic communication appears to be, by and large, an ‘automatic’ process. So, the model proposed by Grice seems not to fit with the actual psychological processes involved in language understanding and communication.
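To make the complexity at issue vivid, the attitude in (ii) can be represented as a nested data structure. The sketch below is purely illustrative (the representation and the helper function depth are invented for this article, not drawn from the literature):

```python
# Toy representation (invented for illustration) of the nested attitude in
# (ii): S intends that H believes that S intends to inform H that p.

from dataclasses import dataclass

@dataclass
class Attitude:
    agent: str       # who holds the attitude
    kind: str        # 'intends', 'believes', ...
    content: object  # a proposition or a further Attitude

p = "coffee is ready"
ii = Attitude("S", "intends",
        Attitude("H", "believes",
            Attitude("S", "intends", f"inform H that {p}")))

def depth(x):
    """Count how deeply attitudes are embedded."""
    return 1 + depth(x.content) if isinstance(x, Attitude) else 0

print(depth(ii))  # 3 levels of embedding already, before any further clauses
```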
This concern was often voiced in Grice’s lectures (see, for instance, Warner (2001, p. x)). A prominent presentation of the argument can be found in the work of Ruth Millikan (1984, Chapter 3; other authors who have voiced similar concerns include Alston, 2000; Apperly, 2011; Azzouni, 2013; Gauker, 2003; Pickering and Garrod, 2004). In the remainder of this section, this objection is considered on its own terms, together with some of the replies that have been offered. Millikan’s teleological approach is taken up again in §4.a.
i. Grice’s Response to the Challenge: Levels of Explanation
Grice’s response to the allegation of psychological implausibility was to stress that his account was meant as a rational explanation of communicative interactions, and not as an attempt to capture the psychological reality of everyday communicative exchanges (Warner, 2001). Another proposal for how to understand Grice’s claim has been offered by Bart Geurts and Paula Rubio-Fernandez (2015), drawing on David Marr’s (1982) distinction between the computational and algorithmic levels of explanation. A computational level explanation captures the task or function a system aims to perform (that is, what mapping from inputs to outputs the system is designed to implement), while an algorithmic explanation captures the actual rules or states of the system which realise that function. So, supposing that the function we want to compute is f(x) = 2x² (this is the computational level description of the system), there are then two different algorithms that a system could use to compute this function: it could square the input and then double the result, 2 · (x · x), or it could double the input and then multiply by the original input, x · (x + x).
So, two systems can have the same computational description while making use of different algorithms.
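The arithmetic example can be restated in code (Python is used here purely for illustration): the two functions below share one computational-level description, since both compute f(x) = 2x², while using different algorithms:

```python
# Two algorithms for the same function f(x) = 2x^2: identical at the
# computational level (same input-output mapping), distinct at the
# algorithmic level (different intermediate steps).

def f_square_then_double(x):
    return 2 * (x * x)   # algorithm 1: square first, then double

def f_double_then_multiply(x):
    return x * (x + x)   # algorithm 2: double first, then multiply by x

# Same computational-level description over a sample of inputs:
assert all(f_square_then_double(x) == f_double_then_multiply(x)
           for x in range(-100, 101))
```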
In the context of linguistic communication, Geurts and Rubio-Fernandez (2015, pp. 457-459) argue, Gricean pragmatics is pitched at the computational level, specifying the desired mappings between inputs and outputs. Processing theories of communication, on the other hand, are pitched at the algorithmic level. Only processing theories need to reflect the psychological or cognitive reality of the minds of speakers since they need to explain how the model provided by Gricean pragmatics is implemented. According to Geurts and Rubio-Fernandez, then, the objection from psychological implausibility rests on the tacit assumption that Gricean pragmatics is not only a computational theory of pragmatics, but also an algorithmic theory of processing. If the distinction between these two levels of explanation can be maintained, and Gricean pragmatics is understood as being only a computational theory, then the objection from psychological implausibility is undermined.
Finally, if one wants to argue that Gricean pragmatics is psychologically implausible, one needs to provide reliable empirical evidence. Importantly, allegations of psychological implausibility often seem to rely on evidence from introspection. As Geurts and Rubio-Fernandez (2015, pp. 459-466) point out, even assuming that introspection is always reliable (which it is not), ascribing propositional attitudes need not be a conscious or consciously accessible process. It might very well be a process that, by and large, occurs unconsciously (even though, on reflection, communicators can easily and consciously access the outputs of this process). Therefore, evidence from introspection alone is not enough to support the argument that the Gricean model is psychologically unrealistic.
ii. A Post-Gricean Response: Relevance Theory
The concern to capture the psychological reality of our everyday communicative exchanges is also at the heart of a highly influential post-Gricean approach known as ‘Relevance Theory’, which aims at providing a theory of communication that is not only plausible but also explanatory from a cognitive point of view. Proposed by Dan Sperber and Deirdre Wilson (1995), Relevance Theory has been highly influential not only in philosophy but also in linguistics and psychology (Noveck, 2012).
It is important to note that Relevance Theory aims at being, primarily, a theory of communication and not a theory of meaning, although it is suggestive of what is known as a ‘contextualist’ approach with respect to meaning (see Meaning and Context-Sensitivity, section 2). Furthermore, advocates of Relevance Theory stress that, even if Gricean insights have inspired much of their theorising, their approach differs in crucial respects from Grice’s own theory of communication (Sperber and Wilson, 2012, Ch. 1). To introduce the reader to this alternative approach, this section starts by presenting the notions of mutual manifestness and ostensive-inferential communication, which are meant, respectively, to replace the notion of mutual knowledge and expand Grice’s notion of communication (as anticipated in §§2.b and 2.c).
According to Sperber and Wilson (1995, pp. 15-21), the notion of mutual knowledge is not psychologically plausible because it requires interlocutors to have infinitely many knowledge states, and this is just not possible for creatures with limited cognitive resources. Although the argument against mutual knowledge is not perfectly clear, it seems that, according to Sperber and Wilson, when one knows (in the ‘occurrent’ sense of the term) that p, one must have formed a mental representation that p, and no cognitively limited being can form infinitely many representations. It is worth noticing that this further assumption might be misleading. As the early proponents (see §2.c) of the notion of mutual knowledge pointed out, the infinite series of knowledge states is to be understood as a series of inferential steps, which need not directly reflect individuals’ representational states. Therefore, there might not be anything psychologically improper about the notion of mutual knowledge. Setting this point to one side, however, it might still be that mutual manifestness proves more plausible from a cognitive perspective, and thus the notion is still worthy of exploration.
Sperber and Wilson begin their account by defining what they call ‘an assumption’. This is a thought that the individual takes to be true, and it is manifest to an individual if that individual has the cognitive resources to mentally represent it (in a certain environment at a certain time). Importantly, an assumption can be manifest without in fact being true. In this respect, the notion of manifestness is weaker than that of knowledge. Moreover, an assumption can be manifest even if the corresponding representation is neither currently entertained nor formed by the individual. Indeed, an assumption can be manifest simply if the individual has the cognitive resources to infer it from other assumptions, where the notion of inference is meant to cover deductive, inductive, and abductive inferences alike.
With the notion of manifestness in place, the definition of ostensive-inferential communication is next to be discussed, starting with the notion of informative intention. An informative intention is an intention to make a set of assumptions manifest (or more manifest) to the audience. This definition is meant to capture the fact that sometimes we intend to communicate something vague, like an impression. If one intends to make a set of assumptions more manifest to an audience, one might represent that set of assumptions under some description, without thereby representing any of the individual propositions in the set (Sperber and Wilson, 1995, pp. 58-60).
Often enough, the communicator also intends to make the fact that she has an informative intention manifest or more manifest to an audience. According to Sperber and Wilson (1995, pp. 60-62), when the communicator intends to make the informative intention mutually manifest between themselves and the audience, they thereby have a communicative intention. A communicative act in this sense is successful when the communicative intention is fulfilled, namely, when it is mutually manifest between the interlocutors that the communicator has an informative intention.
Finally, Sperber and Wilson (1995, p. 63) define ostensive-inferential communication:
The communicator produces a stimulus which makes it mutually manifest to communicator and audience that the communicator intends, by means of this stimulus, to make manifest or more manifest to the audience a set of assumptions.
Importantly, Sperber and Wilson do not take Grice’s third clause to be necessary for ostensive-inferential communication (see §2.b). In other words, they do not think that the fulfilment of the informative intention must be based on the fulfilment of the communicative intention. As a reminder: Grice’s third clause was meant to exclude, inter alia, cases of ‘showing’ from cases of genuine non-natural meaning. According to Sperber and Wilson (1995, pp. 59-60), however, it is useful to think that there is a continuum of cases from showing something to ‘meaning something’ (in Grice’s sense) in which, at one end of the spectrum, the third clause does not hold (showing), while at the other end of the spectrum the third clause does hold and the informative intention could not be retrieved without having retrieved the communicative intention.
The core of the explanation of pragmatic reasoning proposed by Sperber and Wilson hinges on the idea that any ostensive act comes with a presumption of relevance, and ultimately it is this assumption that guides the recipient in interpreting the utterance. This explanation is based on two principles, which Sperber and Wilson (1995, p. 261) call the cognitive and the communicative principles of relevance. The cognitive principle is meant to capture a general feature of human cognition, namely that it ‘tends to be geared to the maximisation of relevance’ (Sperber and Wilson, 1995, pp. 260-266); Sperber and Wilson emphasise that, in this context, the term ‘relevance’ is used in a technical sense. In this technical sense, the relevance that a representation has for an individual at a given time is a function that varies positively with cognitive benefits and negatively with cognitive costs, or effort required to access the representation (via either perception, memory or inference). This use of the term is then close to, but might not coincide exactly with, the use of the term in ordinary language.
The communicative principle of relevance (Sperber and Wilson, 1995, pp. 266-273), which applies to communication specifically, states that ‘every utterance communicates a presumption of its own optimal relevance’. The presumption of optimal relevance means that, other things being equal, the hearer will consider the utterance as worth the interpretive effort. The interpretive enterprise is thus aimed at finding the most relevant interpretation of an utterance that is compatible with what the hearer believes about the speaker’s abilities and preferences. The range of possible interpretations is constrained by the fact that the hearer will look for the most relevant interpretation that can be achieved whilst minimising cognitive effort.
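This search procedure can be caricatured in code. The sketch below is only a crude toy: Sperber and Wilson offer no numerical measure of relevance, so the benefit and effort scores, the ratio used as a proxy for relevance, and the stopping threshold are all invented for illustration:

```python
# Toy caricature of a least-effort interpretive search: test candidate
# interpretations in order of increasing processing effort and accept the
# first one that is "relevant enough". All numbers are invented.

def interpret(candidates, threshold):
    """candidates: (interpretation, effort, benefit) triples."""
    for reading, effort, benefit in sorted(candidates, key=lambda c: c[1]):
        if benefit / effort >= threshold:  # crude stand-in for relevance
            return reading                 # accept and stop searching
    return None                            # nothing worth the effort

candidates = [
    ("literal reading", 1.0, 0.5),
    ("enriched reading", 2.0, 4.0),
    ("far-fetched reading", 5.0, 6.0),
]
print(interpret(candidates, threshold=1.5))  # -> enriched reading
```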
Contra Grice, this heuristic mechanism for utterance interpretation does not presuppose that communication is a cooperative enterprise oriented toward the achievement of a common goal. Therefore, it might provide a more straightforward explanation of communicative interactions that are not prima facie cooperative (for example, examinations, adversarial communication, and so on). On the other hand, one might argue, the principles of relevance are so general and vague that they can hardly be falsified, and therefore might lack the desired explanatory power. Wilson (2017, pp. 84, 87) has responded to this objection by pointing out that the principles would be falsified if, for instance, utterance interpretation were systematically driven by considerations of informativeness, so that the content derived from utterance interpretation was informative but not relevant.
Relevance Theory sees the heuristic mechanism as explaining how speakers and hearers come to select appropriate assumptions among the many that are manifest to them, which is something that Grice left unexplained (although Levinson 1989 queries the extent to which Relevance Theory itself offers an adequate explanation in this regard). Importantly, proponents of Relevance Theory take the heuristic mechanism to be at play not only in the derivation of implicatures, but more generally in any sort of pragmatic reasoning that leads communicators to retrieve the communicated content, where this stretches from reference assignment and ambiguity resolution to the adjustment of lexical meaning (for an overview, see Wilson (2017, section 4.4); for a critique, see Borg (2016)).
Finally, according to Relevance Theory, pragmatic reasoning is fundamentally an exercise in mindreading, that is, in attributing mental states to others and reasoning about them (Sperber and Wilson, 2012, Ch. 11). In this respect, Relevance theorists tend to take a radically mentalistic stance on how best to interpret Gricean explanations of pragmatic reasoning. However, as the next section shows, this mentalistic stance has itself come under attack.
b. Is the Intentionalist Assumption of the Priority of Thought over Language Plausible?
The first objection from psychological implausibility (§3.a) was that the reasoning about mental states posited in the Gricean model was too complex to play a central explanatory role in linguistic exchanges. Even if this objection is dismissed, however, a related but distinct concern emerges from the fields of developmental and comparative psychology. Infants and (more controversially) some non-human primates appear to be proficient communicators, yet it is unclear to what extent, if at all, they can reason about mental states. This concern threatens one of the central assumptions of the Intentionalist approach, namely that reasoning about mental states comes before the grasp of linguistic meaning. In other words, this second objection from psychological implausibility holds that the Intentionalist approach gets the priority between language and mental state concepts the wrong way around.
A first reason for thinking that pre-linguistic or non-linguistic creatures lack the ability to attribute mental states is that most of what we come to know or believe about other minds is based on our understanding of what others say. It is thus not obvious that children could acquire mental state concepts without already having access to this source of information (for a rich elaboration of this point from an empirical point of view, see Astington and Baird (2005), as well as Heyes (2018, Ch. 8)). Correlatively, the youngest age at which children apparently manifest an understanding of false beliefs, which is often regarded as the hallmark of the capacity for mental state attribution, is between 3 and a half and 4 years of age, namely when they master a language that contains mental state terms (see Wellman, Cross, and Watson (2001), but see also Geurts and Rubio-Fernandez (2013); for a general overview of the field, see Theory of Mind). If one goes for the lowest threshold and assumes that the critical age for false-belief understanding is 3 and a half years of age, there is still an important gap between the time at which infants start communicating flexibly and effectively (after their first birthday) and the time at which they reason about others’ mental states.
Moreover, evidence from atypical development is suggestive of the same developmental trajectory, with linguistic competency developing prior to mental state attribution. For instance, deaf children born of hearing parents, who are delayed in their acquisition of a mental state vocabulary, also manifest a delay in their ability to reason about others’ mental states. Importantly, as soon as they learn the relevant mental state terms, their performance on mindreading tasks improves, even on tasks which do not involve significant use of language (see, for example, Pyers and Senghas (2009)). Or again, at least some individuals on the autistic spectrum successfully acquire language, even though this condition is standardly held to involve impairment to the ability to attribute mental states to others (see, for example, references in Borg (2006, fn. 6)).
The challenge to the Intentionalist approach is thus that, from an empirical point of view, linguistic competency seems to precede, and possibly to support, the development of abilities to reason about mental states (rather than the reverse picture assumed by Intentionalists). However, not all researchers working in developmental or comparative psychology accept this claim. In response to the challenge, advocates of an Intentionalist model have argued, first, that there are reasons to think pre-linguistic communication must itself be Gricean in nature (so that an ability to do Gricean reasoning must be present prior to language acquisition; see, for example, Tomasello (2008; 2019)). Second, they have argued that (contrary to the suggestion above) typically developing infants do show sensitivity to others’ mental states prior to the acquisition of language (see, for example, Onishi and Baillargeon (2005); for a useful overview and further references, see Rakoczy and Behne (2019)). While the literature on child development in this area is fascinating, it is also extensive, and full consideration of it would take us too far afield. This section closes, then, just by noting that the exact nature of infants’ mindreading abilities is still hotly debated, and thus whether infants and animals prove to be a direct challenge to Intentionalism remains to be seen. One final point is worth noting, however: following up on the considerations presented in §3.a.i, it seems that a model of communication need not, in general, determine a univocal cognitive underpinning. Therefore, in principle, there could be room for different specifications of the cognitive/motivational mechanisms that underlie communication, even if one grants that the relevant forms of communication must be Gricean in nature (this idea has been pursued by Richard Moore (2017b), who has proposed downplaying the cognitive requirements of Gricean communication in a way that might insulate the approach from developmental evidence of the kind alluded to in this section).
4. Rejecting Intentionalism
The previous two sections looked at a range of problems both for the Gricean model for understanding meaning and communication, and for the more general Intentionalist approach which Grice’s framework exemplifies. This article closes by touching briefly on two possible alternatives to the Intentionalist approach.
a. Teleological Approaches: Millikan
As noted above, Millikan’s work provides a clear statement of the objection that the Gricean model is psychologically implausible, but it is important to note that her objection to the Gricean programme emerges from her wider theory of meaning. Millikan’s ‘teleological semantics’ aims (like Grice’s approach) at giving a naturalistic explanation of meaning and communication. However, Millikan seeks to couch this in evolutionary terms which do not (contra Grice) presuppose mastery of mental state concepts. Instead, Millikan tries to show that both meaning and speakers’ intentions should be accounted for in terms of the more fundamental, teleological notion of ‘proper function’. (An alternative, equally important, variety of the teleological approach can be found in the work of Fred Dretske.) In very broad strokes, the proper function of a linguistic device, be it a word or a syntactic construction, is held to be the function that explains why that word or syntactic construction is reproduced and acted upon in certain ways. For instance, the proper function of a word like “dog” is “to communicate about or call attention to facts that concern dogs” (Millikan, 2004, p. 35) and it is the fact that speakers do use the word in this way, and hearers do come to think about such facts when they hear this word, that explains the continued use of the word. Or again, according to Millikan (1984, pp. 53-54), the indicative grammatical mood, which can take several different forms within and across languages, has the proper function of producing true beliefs, while the imperative mood has the proper function of producing compliance. When speakers utter a sentence in the imperative mood, they typically intend to produce compliance in their hearers, and this linguistic device has proliferated because often enough hearers do comply with imperatives.
Given this background, Millikan argues that communicative intentions are, by and large, not necessary for explaining linguistic communication. If you state that p, in Normal circumstances (where ‘Normal’ is a term of art that Millikan has tried repeatedly to specify) I will come to believe that p without needing to represent that you overtly intend me to believe that p. Indeed, according to Millikan (1984, pp. 68-70) language use and understanding happen, first and foremost, ‘automatically’, as ways to express (or act upon others’ expression of) beliefs and intentions. (In a later work, Millikan (2017) suggests that, although the phenomenology of the grasp of meaning may be that of an automatic process, there may nevertheless be some underlying inferential work involved.)
According to Millikan, we engage in the sort of mentalistic reasoning envisaged by Grice only when we somehow inhibit or exploit parts of the automatic processes for language production and understanding. A crucial aspect of Millikan’s argument is that only the mental states that we represent, and which are instantiated in some region of our brains, can be causally relevant to the production and understanding of linguistic utterances. In her view, mental states that we can easily and readily come to have on reflection, but that we do not use in performing a certain task, do not play any direct causal role and she argues that communicative intentions are, most of the time, dispositional in this sense.
b. Normative Social Approaches
A second major alternative to the Intentionalist approach has been offered by Robert Brandom (1994), who suggests that we explain linguistic communication, as well as semantic features of thought and talk, in terms of skilful participation in norm-governed social practices. The core idea is that it is possible to define practices that involve the use of linguistic signs (which Brandom terms ‘discursive practices’) as a species of norm-governed social practices, and that this definition can be given in non-semantic terms. Thus, Brandom argues that we can translate the normative dimension of basic discursive practices (which dictate what is permissible and what is not) into the inferential role of sentences in a language (which dictate which sentences are derivable, or compatible, or incompatible with which other sentences). For instance, a sentence like ‘New York is East of Pittsburgh’ entails, together with other background assumptions, the sentence ‘Pittsburgh is West of New York’. Brandom holds, then, that we can define the meaning of a sentence in terms of its inferential role within the discursive practices in which that sentence appears.
As Loeffler (2018, pp. 26-29) points out, Brandom does not offer a direct critique of Intentionalist accounts. However, one of the main motivations for exploring Brandom’s view is that it offers an account of how utterances can come to have conceptually structured content without presupposing that this content derives from the conceptually structured content of mental states.
Unlike Millikan’s approach, Brandom’s proposed explanation of content is non-naturalistic. Indeed, it makes crucial use of normative notions, chiefly those of commitment and entitlement. According to Brandom, the normativity of these twin notions cannot be accounted for naturalistically. Several theorists see the anti-naturalistic strand of Brandom’s proposal as highly objectionable (for further elaboration of the concerns underlying the objection, see Reductionism). However, this objection reflects general difficulties with naturalising normativity, and these difficulties are not specific to Brandom’s project. In fact, one might argue, Gricean conceptions of communication face an analogous problem, in that they explain linguistic communication as reasoning about mental states, an activity that also has an essentially normative dimension.
A key element of Brandom’s account (and of later, Brandom-inspired approaches, such as Geurts (2019); see also Drobnak (2021)) is the notion of a commitment. Commitments can be conceived of as ternary relations between two individuals and a proposition. To use one of Geurts’ examples, when Barney promises Betty that he will do the dishes, he becomes committed to Betty to act consistently with the proposition that Barney will do the dishes. On the other hand, on accepting Barney’s commitment, Betty becomes entitled to act on the proposition that Barney will do the dishes. As above, the notion of commitment is a normative one (if I commit myself to you to do the dishes, you are entitled to act on the proposition that I will do the dishes, and I can be held responsible if I fail to do the dishes). Moreover, the notion is non-mentalistic. Commitments can be undertaken either implicitly or explicitly, and one can undertake a commitment without knowing or believing that one has done so. Correlatively, one is entitled to act on someone’s commitments whether or not the agent knows that she is so entitled. Therefore, if we coordinate our actions by relying on the commitments that we undertake, we need not always attribute, or reason about, psychological states (for further elaboration and defence of this point, see Geurts (2019, pp. 2-3, 14-15)).
An important upshot of these considerations is that the commitment-sharing view of communication has the potential to account for pre-linguistic communicative interactions without presupposing much, if anything, regarding mindreading capacities, and this might constitute an explanatory advantage over Gricean approaches. In this respect, a promising line of inquiry would be to consider the notion of ‘sense of commitment’ elaborated by John Michael and colleagues (2016). Roughly, the idea is that an individual manifests a sense of commitment when, in the context of a joint action, that individual is motivated to act partly because she believes that other participants in the joint action expect her to do so. Michael and colleagues have proposed several strategies to spell out the notion of sense of commitment in detail, and to experimentally track the emergence and modulation of this aspect of human psychology from infancy. The unexplored potential of this framework for studying pre-linguistic communication becomes apparent if one considers communicative acts, like pointing gestures, as aimed at coordinating contributions to the joint action. Normative inferential approaches also hold out the promise of other advantages. For instance, theorists such as Geurts (2019) and Kukla and Lance (2009) argue that it can give us an attractive explanation of different speech act types (such as assertions, promises, directions or threats). Furthermore, Geurts (2019) emphasises that analysing speech acts in terms of commitments allows one to give a unified treatment of both conventional and non-conventional speech acts. This is an advantage over traditional Gricean pictures, in which conventional speech acts such as “I hereby pronounce you husband and wife” were considered non-communicative (see §2.d).
Finally, the approach might also be thought to yield a good account of the notion of common ground. Geurts’ (2019, pp. 15-20) proposal is to preserve the iterative structure of mutual knowledge (see §2.c), but to redefine common ground in terms of shared commitment. In a nutshell, all that is required for a commitment to be in place is that it is accepted, and when it is accepted, it thereby enters the common ground. If I commit to you to do the dishes, and you accept my commitment, we both become committed to act in a way that is consistent with the proposition that I will do the dishes. In other words, you too become committed to the proposition that I will do the dishes and as a result, for instance, you yourself will not do the dishes. Now, if I accept this commitment that you have as a result of accepting mine, I thereby undertake a further, higher-order commitment, that is, a commitment to you to the proposition that I am committed to you to do the dishes, and so on, as in the iterations that constitute mutual knowledge.
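As with mutual knowledge (§2.c), the iteration is generated by a single finite rule rather than stored as an infinite list. The sketch below (invented for illustration, not Geurts’ own formalism) spells out the first few levels for the dishes example:

```python
# Toy sketch (not Geurts' formalism): each accepted commitment generates the
# next, higher-order one, so the whole hierarchy flows from one finite rule.

def commitment_hierarchy(a, b, proposition, depth):
    """Yield the first `depth` levels of iterated commitments from a to b."""
    content = proposition
    for _ in range(depth):
        content = f"COMMIT({a}, {b}, [{content}])"
        yield content

for level in commitment_hierarchy("Barney", "Betty",
                                  "Barney will do the dishes", 3):
    print(level)
# COMMIT(Barney, Betty, [Barney will do the dishes])
# COMMIT(Barney, Betty, [COMMIT(Barney, Betty, [Barney will do the dishes])])
# ... and so on, as in the iterations that constitute mutual knowledge.
```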
If the analysis of the notion of common ground in terms of shared commitments is tenable, it seems that there are good prospects for explaining pragmatic reasoning and linguistic conventions (on the subject of conventions, see Geurts (2018)). Regarding implicatures, Geurts observes that the same pragmatic reasoning that was proposed by Grice (see §1.b) can be cast in terms of commitments rather than psychological states:
It is common ground that:
(1) the speaker has said that p;
(2) he observes the maxims;
(3) he could not be doing this unless he was committed to q;
(4) he has done nothing to prevent q from becoming common ground;
(5) he is committed to the goal that q become common ground.
And so he has implicated that q. (Geurts, 2019, p. 21)
Although the schema is rather sketchy, it seems that it has the same explanatory capability as its Gricean counterpart. Of course, such a schema will be genuinely non-mentalistic only if all the elements in it have non-mentalistic descriptions and one might wonder whether the conversational maxims themselves can be rephrased in non-psychological terms. Geurts contends that the only maxim for which the reformulation might be problematic is the first maxim of quality (‘do not assert what you believe to be false’), since it is the only maxim that is cast in explicitly psychological terms. However, even in this case, he argues that the notions of belief and intention can be replaced without loss by appeal to the notion of commitments.
Of course, normative inferential approaches face independent challenges as well (for instance, many theorists have questioned whether commitments themselves can be spelt out without prior appeal to semantic content, while Fodor and Lepore (1992; 2001) famously objected to the holistic nature of such approaches). There may also be significant challenges to be faced by commitment-based accounts concerning how hearers identify exactly which commitments speakers have undertaken by their utterances, where this might be thought to require a prior grasp of linguistic meaning (in which case, to get off the ground, the normative inferential approach would need an independent account of how children acquire knowledge of word-meanings). However, if the problems for Intentionalist approaches which have been discussed here are ultimately found to hold good, then alternative approaches such as the normative inferential model will clearly deserve much further exploration.
5. Conclusion
This article began by asking how a gesture or utterance could have a meaning and how that meaning might come to be communicated amongst interlocutors. The starting point was the intuitively appealing idea that meaning and communication are tied to thought: an utterance (u) by a speaker (S) might communicate some content (p) to an audience (H) just in case p was the content S intended H to grasp. As §1 made explicit, spelling out this simple (Intentionalist) idea turns out to be pretty complex, leading Grice, the most famous advocate of the Intentionalist approach, to a three (or more) clause definition of speaker-meaning which posited a complex set of (recursively specified) mental states. Grice’s model faces a range of objections. Opponents might query the necessity of Grice’s clauses (§§2.a-2.b) or argue that they are insufficient (§2.c); they might worry that the account fails to accommodate conventional speech acts (§2.d); or they might object to Grice’s proposed reduction of sentence-meaning to speaker-meaning (§2.e). Above and beyond these worries, however, it might also be objected that the starting premise—the idea that meaning and communication link inherently to thought—is mistaken, perhaps because such models can never be psychologically realistic (§3.a) or because they fail to cohere with developmental evidence about the relative priority of language acquisition over mental state reasoning (§3.b). Finally, two alternative approaches were surveyed, ones that seek to explain linguistic meaning and communication without assigning a constituent role to the content of thoughts: the teleological model advocated by Millikan (§4.a) and the normative-inferential model advocated by Brandom (§4.b). However, it is an open question whether these approaches can provide viable alternatives to the well-established Intentionalist account, since they might not be without their own significant problems. How we should understand meaning and communication, then, remains unsettled.
Apperly, Ian A. 2011. Mindreaders. The Cognitive Basis of “Theory of Mind”. Hove: Psychology Press.
Astington, Janet W., and Jodie A. Baird. 2005. Why Language Matters for Theory of Mind. Oxford: Oxford University Press.
Azzouni, Jody. 2013. Semantic Perception. How the Illusion of a Common Language Arises and Persists. Oxford: Oxford University Press.
Bach, Kent, and Robert Harnish. 1979. Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
Battich, Lucas, and Bart Geurts. 2020. “Joint Attention and Perceptual Experience.” Synthese.
Borg, Emma. 2006. “Intention-Based Semantics.” In The Oxford Handbook of Philosophy of Language, edited by Ernie Lepore and Barry Smith, 250-266. Oxford: Oxford University Press.
Borg, Emma. 2016. “Exploding Explicatures.” Mind & Language 31 (3): 335-355.
Brandom, Robert. 1994. Making it Explicit: Reasoning, Representing and Discursive Commitment. Cambridge, MA: Harvard University Press.
Clark, Herbert. 1992. Arenas of Language Use. Chicago, IL: University of Chicago Press.
Clark, Herbert. 1996. Using Language. Cambridge: Cambridge University Press.
Coady, Cecil. 1976. “Review of Stephen R. Schiffer, Meaning.” Philosophy 51: 102-109.
Drobnak, Matej. 2021. “Normative Inferentialism on Linguistic Understanding.” Mind & Language.
Fodor, Jerry, and Ernest Lepore. 1992. Holism: A Shopper’s Guide. Oxford: Blackwell.
Fodor, Jerry, and Ernest Lepore. 2001. “Brandom’s Burdens: Compositionality and Inferentialism.” Philosophy and Phenomenological Research 63 (2): 465-481.
Gauker, Christopher. 2003. Words Without Meaning. Cambridge, MA: MIT Press.
Geurts, Bart. 2018. “Convention and Common Ground.” Mind & Language 33: 115-129.
Geurts, Bart. 2019. “Communication as Commitment Sharing: Speech Act, Implicatures, Common Ground.” Theoretical Linguistics 45 (1-2): 1-30.
Geurts, Bart, and Paula Rubio-Fernandez. 2013. “How to Pass the False-Belief Task Before Your Fourth Birthday.” Psychological Science 24 (1): 27-33.
Geurts, Bart, and Paula Rubio-Fernandez. 2015. “Pragmatics and Processing.” Ratio 28: 446-469.
Grice, Paul. 1957/1989. “Meaning.” In Studies in the Way of Words, 213-223. Cambridge, MA: Harvard University Press; first published in The Philosophical Review 66(3).
Grice, Paul. 1968/1989. “Utterer’s Meaning, Sentence-Meaning, and Word-Meaning.” In Studies in the Way of Words, 117-137. Cambridge, MA: Harvard University Press; first published in Foundations of Language 4.
Grice, Paul. 1969/1989. “Utterer’s Meaning and Intention.” In Studies in the Way of Words, 86-116. Cambridge, MA: Harvard University Press; first published in The Philosophical Review 78(2).
Grice, Paul. 1975/1989. “Logic and Conversation.” In Studies in the Way of Words, 22-40. Cambridge, MA: Harvard University Press; first published in Syntax and Semantics, vol. 3, P. Cole and J. Morgan (eds.).
Grice, Paul. 1982/1989. “Meaning Revisited.” In Studies in the Way of Words, 283-303. Cambridge, MA: Harvard University Press; first published in Mutual Knowledge, N.V. Smith (ed.).
Grice, Paul. 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Heyes, Cecilia. 2018. Cognitive Gadgets. The Cultural Evolution of Thinking. Cambridge, MA: Harvard University Press.
Kukla, Rebecca, and Mark Lance. 2009. ‘Yo!’ and ‘Lo!’: The Pragmatic Topography of the Space of Reasons. Cambridge, MA: Harvard University Press.
Levinson, Stephen. 1989. “A Review of Relevance.” Journal of Linguistics 25 (2): 455-472.
Lewis, David. 1969. Convention. A Philosophical Study. Cambridge, MA: Harvard University Press.
Marr, David. 1982. Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. New York: W. H. Freeman and Company.
Michael, John, Natalie Sebanz and Günter Knoblich. 2016. “The Sense of Commitment: A Minimal Approach.” Frontiers in Psychology 6: 1968.
Millikan, Ruth G. 1984. Language, Thought and Other Biological Categories. New Foundations for Realism. Cambridge, MA: MIT Press.
Millikan, Ruth G. 2004. Varieties of Meaning. Cambridge, MA: MIT Press.
Millikan, Ruth G. 2017. Beyond Concepts: Unicepts, Language, and Natural Information. Oxford: Oxford University Press.
Moore, Richard. 2017a. “Convergent Minds: Ostension, Inference and Grice’s Third Clause.” Interface Focus 7(3).
Moore, Richard. 2017b. “Gricean Communication and Cognitive Development.” The Philosophical Quarterly 67: 303-326.
Morris, Michael. 2007. An Introduction to the Philosophy of Language. Cambridge: Cambridge University Press.
Neale, Stephen. 1992. “Paul Grice and the Philosophy of Language.” Linguistics and Philosophy 15: 509-559.
Noveck, Ira. 2012. Experimental Pragmatics: The Making of a Cognitive Science. Cambridge: Cambridge University Press.
Pickering, Martin J., and Simon Garrod. 2004. “Toward a Mechanistic Psychology of Dialogue.” Behavioural and Brain Sciences 27: 169-190.
Platts, Mark. 1997. Ways of Meaning: An Introduction to a Philosophy of Language. Cambridge, MA: MIT Press.
Pyers, Jennie E., and Ann Senghas. 2009. “Language Promotes False-Belief Understanding: Evidence from Learners of a New Sign Language.” Psychological Science 20 (7): 805-812.
Rakoczy, Hannes, and Tanya Behne. 2019. “Commitment Sharing as Crucial Step Toward a Developmentally Plausible Speech Act Theory?” Theoretical Linguistics 45 (1-2): 93-97.
Schiffer, Stephen. 1972. Meaning. Oxford: Oxford University Press.
Searle, John. 1965. “What Is a Speech Act?” In Philosophy in America, edited by Max Black, 221-240. London: George Allen & Unwin Ltd.
Sperber, Dan, and Deirdre Wilson. 1995. Relevance: Communication and Cognition. London: Blackwell.
Sperber, Dan, and Deirdre Wilson. 2012. Meaning and Relevance. Cambridge: Cambridge University Press.
Stalnaker, Robert. 2002. “Common Ground.” Linguistics and Philosophy 25: 701-721.
Stalnaker, Robert. 2014. Context. Oxford: Oxford University Press.
Strawson, Peter. 1964. “Intention and Convention in Speech Acts.” The Philosophical Review 73 (4): 439-460.
Tomasello, Michael. 2008. Origins of Human Communication. Cambridge, MA: MIT Press.
Tomasello, Michael. 2019. Becoming Human. A Theory of Ontogeny. Cambridge, MA: Harvard University Press.
Warner, Richard. 2001. “Introduction.” In Aspects of Reason, by Paul Grice, vii-xxviii. Oxford: Oxford University Press.
Wellman, Henry M., David R. Cross, and Julanne Watson. 2001. “Meta-Analysis of Theory-of-Mind Development: The Truth about False Belief.” Child Development 72(3): 655-684.
Wilby, Michael. 2010. “The Simplicity of Mutual Knowledge.” Philosophical Explorations 13(2): 83-100.
Wilson, Deirdre. 2017. “Relevance Theory.” In The Oxford Handbook of Pragmatics, edited by Yan Huang, 79-101. Oxford: Oxford University Press.
Giordano Bruno was an Italian philosopher of the later Renaissance whose writings encompassed the ongoing traditions, intentions, and achievements of his times and transmitted them into early modernity. Taking up the medieval practice of the art of memory and of formal logic, he focused on the creativity of the human mind. Bruno criticized and transformed a traditional Aristotelian theory of nature and helped revive atomism. His advocacy of Copernicanism and his claim that there is an infinite number of worlds were innovative. In metaphysics, he elevated the concepts of matter and form to absolutes so that God and creation coincide. Bruno also advocated for a version of pantheism, and he probed the powers that shape and develop reality, including occult forces that traditionally belong to the discipline of magic. Most of his theories were made obsolete in detail with the rise of early modern empiricism; nevertheless, modern rationalism, which explored the relation between mind and world, and the modern critique of dogmatic theology were both influenced by Bruno’s philosophy.
Bruno was born in 1548 in southern Italy. He was educated in Naples, first by free-lance teachers, then at the Dominican convent of San Domenico Maggiore. After giving early indications of a provocative and critical attitude to Church teachings, he started the life of a migrant scholar that led him to Switzerland, France, England, and Germany. Throughout his travels, he continually tried to secure a position at a university, and he was frequently supported by monarchs and princes. After a stay in Padua, he was denounced as a heretic by his host, a Venetian patrician, and was interrogated by the Inquisition, first in Venice. In Rome, he was burned as an unrepentant heretic in 1600.
Bruno’s death at the will of the Catholic Church was immediately perceived as emblematic of the struggle of free thought against dogmatic intolerance; this was especially due to an eyewitness report from the stake that spread via Protestant circles. John Toland, a freethinker himself, made Bruno a hero of anti-Christian propaganda. Bruno probably influenced Baruch Spinoza in his alleged pantheism, if not his atheism. As such, Bruno aroused the interest of defenders and critics of pantheism in the 18th and 19th centuries, until he was rediscovered as a critical thinker in his own right, one who broke with medieval traditions and paved the way to modern idealism.
Bruno was born in February 1548 in the historic town of Nola, a cultural center in Campania, Southern Italy (about 25 km northeast of Naples). The following summary of his life aims at introducing philosophical themes as they developed throughout his career. Most details of his early life are known through Bruno’s depositions at the Inquisition trial (Firpo 1993; Mercati 1988; Spampanato 2000; Canone 1992; 2000). He was given the name Filippo. His father Giovanni Bruno was in military service, and his mother was Fraulisa Savolino. Bruno began his education in his home town of Nola, and at age 15 he went to Naples where he studied with private and public teachers. One of them was Giovanni Vincenzo Colle, known as Il Sarnese (d. 1574); another known teacher was Theophilus Vairanus, a friar of the order of the Augustinians who later taught at an Augustinian school in Florence and became a professor at the University of Rome. He died in Palermo in 1578. Colle might have introduced the student to Averroism, as he defended the philosophy of Averroes against Hieronymus Balduinus (Destructio destructionum dictorum Balduini, Naples 1554). Vairanus probably taught the young student Augustinian and Platonic approaches to wisdom and might have inspired him to name some interlocutors in his Italian dialogues Teofilo.
At this early stage, Bruno also started studying the art of memory by reading Peter of Ravenna’s book Foenix, which likens the art of memory to the combination of paper and letters: images are ‘placed’ on an ideal chart so that the memorized content can be recalled in a quasi-mechanical way. Probably in order to finance his further education, Bruno entered the convent of the Dominicans in Naples, San Domenico Maggiore, a center of the order where Thomas Aquinas had once resided and an early stronghold of Thomism. Bruno acquired the name Giordano, which he maintained, with few exceptions, for the rest of his career in spite of his conflicts with the Church. After being ordained a priest and enrolling in the study of theology, Bruno concluded his studies in 1575 with two theses that confirm the scholastic/Thomist character of the curriculum: “Everything that Thomas Aquinas teaches in the Summa contra Gentiles is true” and “Everything that the Magister Sententiarum [Peter Lombard in his Sentences] says is true”. Early during his novitiate, Bruno raised suspicion by giving away images of saints, specifically one of St. Catherine of Siena and perhaps one of St. Antonino, while keeping only a crucifix. Around the same time, he scolded a fellow novice for reading a pious book in praise of the Virgin Mary. In view of his later productions, this incident can be read as indicating Bruno’s critique of the cult of saints, which makes him appear sympathetic to Protestantism. The accusation, however, was soon dropped. Nevertheless, about ten years later, in 1576, a formal process was opened that returned to the earlier incident and added new accusations regarding the authority of the Church Fathers and the possession of books by Chrysostom and Jerome that included commentaries by Erasmus of Rotterdam, which were prohibited. This amounted to his excommunication. As a student in Naples, Bruno might have learned of Erasmus through a group of local heretics, the Valdensians, adherents of the teachings of Juan de Valdés (d. 1541), who questioned the Christian doctrine of the Trinity. Bruno argued with another friar about scholastic argumentation and the possibility of expressing theological themes in other forms of argumentation. He adduced as an example Arius (an ancient heretic who denied the full divinity of Christ), and the resulting investigation touched upon the essentials of Catholic theology. Bruno traveled to Rome, probably to defend his case at the Dominican convent Santa Maria sopra Minerva (the center of the Inquisition, in charge of approving or prohibiting books), and then abandoned the order, thus starting his career as a wandering scholar.
Bruno traveled through northern Italy (Genoa, Turin, Savona, and other places) and allegedly published in Venice a book, De’ segni de’ tempi (Signs of the times), which is lost and might have dealt with astronomy and meteorology in Italian; for he reported having lectured, around that time, on the popular textbook of astronomy, the Sphaera, by the 13th-century author Johannes de Sacrobosco. In 1579 Bruno arrived in Geneva, a preferred destination of religious refugees albeit a fortress of Calvinism. After working at a printing press, Bruno enrolled at the university and published a pamphlet against Antoine de la Faye, then professor of philosophy and a protégé of Theodore Beza, the eminent theologian and successor to John Calvin. The content of the pamphlet is unknown, but as the result of a formal trial, Bruno was excommunicated from the Reformed Church and had to apologize. He left Geneva and moved to Toulouse, where he stayed from 1579 through 1581, again lecturing on the Sphaera and also on the soul according to Aristotle. In Toulouse, Bruno also met the Portuguese philosopher Francisco Sanchez (d. 1623), who dedicated to Bruno his new book Quod nihil scitur (Nothing is known). The book is a manifesto of modern skepticism that upends the traditional, scholastic reliance on logical argument. Bruno shared its critique of scholastic Aristotelian logic but trusted the potency of the human intellect; therefore, he despised Sanchez, a sentiment confirmed by a note in his copy, where he calls him a “wild ass.” France was troubled by confessional struggles between Huguenots (Reformed) and Catholics; when these tensions erupted, Bruno had to leave Toulouse and moved to Paris, where he hoped to impress King Henry III.
In Paris in 1582, Bruno published, and dedicated to the King, two of his first works, which treated in his peculiar way the art of memory, a subject that seems to have interested the monarch. De umbris idearum (The shadows of ideas) is a theory of mind and reality; the annexed Ars memoriae (Art of memory) applies it to the construction of mental contents. At the same time, Bruno dedicated to Henri d’Angoulême the Cantus Circaeus (Circe’s chant), Circe being a mythological figure who elicits humanity from animalistic appearance and who, again, practices the art of memory. In political terms, the philosopher opted with these dedications for the Catholic faction. The King had been interested in the theory of memory and offered the guest a provisional lectureship. Also in Paris, Bruno published a comedy, Il candelaio (Chandler). With letters from the French King, Bruno came to the embassy of Michel de Castelnau in London, where he stayed, close to the court of Queen Elizabeth, from 1583 through 1585.
England was in a complex political situation, given the tensions between the Protestant Queen Elizabeth of England and the Catholic King Philip II of Spain, each of whom had to deal with religious dissension in his own kingdom, with France mediating between them. Bruno, who was not the only Italian dwelling in England at the time, befriended courtiers and intellectuals like Philip Sidney and John Florio and, vying for recognition and stability in London, produced six works in the Italian language (as was fashionable at the court), commonly called the Italian Dialogues, the best-known part of his literary and philosophical legacy. When a Polish diplomat, Albert Laski, was sent to Oxford to visit the university, Bruno joined him with the intent of finding an academic position. He debated and gave lectures on various topics but eventually was ridiculed and accused of plagiarism. He thus left Oxford and returned to London, before heading to Paris when Castelnau was recalled to France.
In Paris, Bruno befriended Fabrizio Mordente and first promoted, then chastised, his geometrical work; this was Bruno’s first entry into the field of mathematics. His interest was directed towards the ontological truth and practical application of geometry, which he was to discuss in his works on physics and epistemology. He also published a summary of Aristotle’s Physics arranged according to mnemotechnical principles (Figuratio physici auditus), thus showcasing his competence in scholastic philosophy. At the Collège de Cambrai, the institution for royal lecturers sponsored by the King, Bruno arranged a public disputation in May 1586. As was customary at the time, a student of Bruno’s presented a number of the teacher’s theses, which were directed against Aristotle. These were printed at the same time and dedicated to King Henry III as Centum et viginti articuli de natura et mundo adversus Peripateticos (One hundred and twenty theses on nature and the world, against the Aristotelians) and reprinted in 1588 as Camoeracensis acrotismus (Cambrai lecture). The debate became tumultuous; Bruno left the lecture hall immediately and was next seen in Germany, at the Calvinist University of Marburg.
Still in 1586, Bruno started as a private lecturer at Wittenberg University, the center of Lutheran education, where, among others, Philip Melanchthon had taught. The lectures covered mostly Aristotelian philosophy; some of them were published in the 19th century based on transcripts by a student. Bruno also published several works that apply the Lullian method to principles of research under the heading lampas (lamp) and composed a major work, Lampas triginta statuarum (Torch of thirty statues), a cosmology according to Lullism and the art of memory, in which every part of reality is scrutinized through thirty categories.
In 1588, Bruno left Wittenberg on account of the rising influence of Calvinists. He delivered a programmatic “Farewell Lecture” (Oratio valedictoria) and subsequently sought a position in Prague, where Rudolf II of Habsburg entertained a court of scholars and scientists, later including the astronomers Tycho Brahe (d. 1601) and Johannes Kepler (d. 1630). In Prague, Bruno published a Lullian study and dedicated to the Emperor a critique of mathematics (Articuli adversus mathematicos), without professional success. He next moved on to Helmstedt, again a Lutheran university, where he stayed from January 1589 through mid-1590. Here, Bruno garnered a third excommunication, because the Lutheran pastor appears to have detected some heresy in his thought (Omodeo 2011). While in Helmstedt, Bruno worked on his trilogy, soon to be published in Frankfurt, and on several works that dealt with occult sciences and magic.
In 1590 Bruno traveled to Frankfurt, where he published a trilogy of poems in hexameter (on the model of Lucretius) with prose commentaries that encompass his philosophy: De minimo, on the infinitely small; De monade, a theory of monads that are both metaphysical and physical minimal parts or atoms; and De immenso, the theory of the infinity of magnitude and of innumerable worlds. From Frankfurt, Bruno traveled for a short time to Zurich, where he delivered lectures on the principles of metaphysics, which were later published as a compendium of metaphysical terms (Summa terminorum metaphysicorum). While in Frankfurt, Bruno received letters from a patrician in Venice, Giovanni Mocenigo, inviting him to give private lectures on the art of memory. Soon after he arrived in Venice in 1591, Bruno proceeded to the Venetian university in Padua to lecture on geometry, hoping to find a permanent academic position. This attempt failed (Galileo Galilei later obtained that position), and he returned to Venice. His sponsor Mocenigo, however, was dissatisfied with Bruno’s service (most likely, he had expected some magical practice) and denounced him to the Inquisition as a heretic in May 1592. In early 1593, Bruno was transferred to the Inquisition in Rome, where interrogations and investigations continued. His books were censured for heretical content. Among other things, the accusations concerned the eternity of creation, the equivalence of divine and created powers, the transmigration of the human soul, the soul as the form of the human body, the motion of the earth in relation to the teaching of the Bible, and the multitude of worlds. Pope Clement VIII ordered that torture not be used, as the case of heresy was proven, and Cardinal Robert Bellarmine, the author of a history of heresies (De controversiis, 1581-1593), presented a list of eight heretical propositions the defendant had to recant. Only two of the propositions are known: one questioning the sacrament of reconciliation, the other the theory of the soul as the helmsman of the body. Both tenets challenge the Christian doctrine of the individual soul and its afterlife. Bruno was declared a heretic, formally “unrepentant, pertinacious, and obstinate”, and thus delivered to the secular authorities, who burned him at the stake on Campo de’ Fiori in Rome on 17 February 1600.
b. Works
The standard editions of Bruno’s works are the 19th-century collection of his Latin writings initiated by Francesco Fiorentino (Bruno 1962) and the Italian dialogues edited by Giovanni Gentile and Giovanni Aquilecchia (Bruno 1958). These texts are also available online (see References below). Besides many separate text editions, there is a collection of Latin works in progress, commented with Italian translations, under the direction of Michele Ciliberto (Bruno 2000a; 2001; 2009; 2012). Bilingual editions with extensive commentaries of the Italian works were published with French translation under the direction of Giovanni Aquilecchia (Bruno 1993-2003) and with German translation under the direction of Thomas Leinkauf (Bruno 2007-2019).
Bruno’s works have unusual but meaningful titles. They are listed here in chronological order of publication or – for works published posthumously – of composition. No original manuscripts are extant. The list will help readers find them as they are mentioned in the following text.
De umbris idearum (The shadows of ideas) 1582
Cantus Circaeus ad memoriae praxim ordinatus (Circe’s chant applied to the practice of memory) 1582
De compendiosa architectura et complemento artis Lullii (Comprehensive construction and complement to the art of Lull) 1582
Candelaio (Chandler) 1582
Ars reminiscendi (Art of memory; reprint of dialogue II of Cantus Circaeus) 1583
Explicatio triginta sigillorum et Sigilli sigillorum (Unfolding of the thirty sigils and the sigil of sigils) 1583
London dialogues in Italian:
La cena de le ceneri (The Ash Wednesday supper) 1584
De la causa, principio e uno (Cause, principle, and one) 1584
De l’infinito, universo e mondi (The infinite, universe, and worlds) 1584
Spaccio de la bestia trionfante (Expulsion of the triumphant beast) 1584
Cabala del cavallo Pegaseo con l’aggiunta dell’Asino Cillenico (Cabal of the Pegasus horse with the ass of Cyllene) 1585
De gli eroici furori (Heroic frenzies) 1585
Figuratio aristotelici physici auditus (Arrangement of the Physics of Aristotle) 1586
Dialogi duo de Fabricii Mordentis Salernitani prope divina adinventione (Two dialogues on Fabrizio Mordente’s almost divine invention) 1586
Dialogi. Idiota triumphans. De somnii interpretatione. Mordentius. De Mordentii circino (Dialogues: The triumphant idiot; Interpretation of a dream; Mordente; Mordente’s compass) 1586
Centum et viginti articuli de natura et mundo adversus Peripateticos (One hundred and twenty theses on nature and world, against the Aristotelians) 1586
De lampade combinatoria lulliana (Torch of Lullian combinatorics) 1587
De progressu et lampade venatoria logicorum (Procedure and searching torch of logicians) 1587
Artificium perorandi (The art of persuasion) 1587
Animadversiones circa lampadem lullianam (Advice regarding the Lullian torch) 1587
Lampas triginta statuarum (Torch of thirty statues) 1587
Oratio valedictoria (Farewell speech) 1588
Camoeracensis Acrotismus seu rationes articulorum physicorum adversus Peripateticos (Cambrai lecture, or arguments for the theses in physics against the Aristotelians) 1588
De specierum scrutinio et lampade combinatoria Raymundi Lullii (Investigation of species and Lullian combinatory torch) 1588
Articuli centum et sexaginta adversus huius tempestatis mathematicos atque philosophos (One hundred and sixty theses against mathematicians and philosophers of our times) 1588
Libri physicorum Aristotelis explanati (Aristotle’s Physics explained) 1588
Oratio consolatoria (Funeral speech) 1589
De rerum principiis, elementis et causis (Principles, elements, and causes of things) 1589-1590
De magia (Magic) 1589-1590
De magia mathematica (Mathematical magic) 1589-1590
Medicina lulliana (Lullian medicine) 1590
Frankfurt Trilogy
De triplici minimo et mensura (The threefold minimum and measure) 1591
De monade, numero et figura (Monad, number, and shape) 1591
De innumerabilibus, immenso et infigurabili (The innumerable, immense, and shapeless) 1591
De imaginum, signorum et idearum compositione (The composition of images, signs, and ideas) 1591
Summa terminorum metaphysicorum (Compendium of metaphysical terms) 1591
Theses de magia (Theses on magic) 1591
De vinculis in genere (Bonds in general) 1591
Praelectiones geometricae (Lectures in geometry) 1591
Ars deformationum (Art of geometrical forms) 1591
2. Major Philosophical Themes
Giordano Bruno’s philosophical production spanned only ten years. One is therefore not likely to detect major phases or turns in his development. While it is certainly possible to differentiate certain subdisciplines of philosophy in his work (epistemology, physics, metaphysics, mathematics, natural theology, politics), it is typical of his many writings that almost all themes are present in all the treatises, dialogues, and poems. Therefore, it is feasible to outline his theories through select works in which one aspect is prevalent, provided we remain aware of the interconnection with the rest.
a. Epistemology, Art of Memory, Theory of Spirit
Bruno entered the professional stage with his De umbris idearum (The Shadows of Ideas), which contains his lectures on mnemotechnic, introduced by a dialogue between Hermes and a philosopher that explains its underlying philosophy. Hermes Trismegistus was a legendary Egyptian sage, the alleged source of Plato and others, whose spurious writings became fashionable in the Renaissance for seemingly reconciling Christian with pagan wisdom. Bruno was one of the promoters of Hermeticism. The book explains the purpose of mnemotechnic, or the art of memory. Throughout history, memory was not only a subjective experience of remembering things but a specific faculty of the human mind that can be trained and directed. Memory consists in creating an internal writing that represents ideas as though the internal cognition were the shadow of the absolute ideas. Ideas are only worth pursuing if they are real. Bruno endorses strains of the Neoplatonic tradition according to which the Platonic Forms are not more real than the visible world but equally present in that world. In that vein, Bruno explains that the metaphor of shadow is not made of darkness (as one would think) but is the vestige of light and, vice versa, the trace of reality in the ideal. Shadow is open to light and makes light accessible. Consequently, while human understanding is only possible by way of ‘shadows,’ that is, wavering between truth and falsehood, all knowledge is apt to be either deformed or improved towards the good and true. In an analogy from physics: in the same way as matter is always informed by some form, and form can take on various kinds of matter, so the intellect, including memory, can be informed by any knowledge. If it is true that whatever is known is known by way of ideas, then any knowledge is known through the mediation of something else. This mediation is the task of the art of memory. Bruno elaborates on these principles in repeated sequences of thirty chapters, headed as ‘intentions’ and ‘concepts’, which reiterate the pattern that everything is, in a way, a shadow of everything else. Based upon the universal and unbreakable concordance of things (horizontally and across layers), memory does nothing other than connect contents that on the surface are distinct. To make this approach plausible, Bruno invokes the ancient and Renaissance Platonists, as well as Kabbalah, which have in common that they see any one thing as referencing everything else and truth as such. One example, borrowed from Nicholas of Cusa, is a perpendicular line erected upon a base: when the line inclines towards the base, it not only makes the angle acute, it creates at the same time an obtuse angle, so that both imply each other mutually, and in that sense the different is equal. In arguing this way, Bruno expressly abandons the traditional philosophical standards of determination and classification of the Aristotelian school; he dismisses them as ‘merely logical’ while postulating, and claiming to construct, a harmonious unity of plurality that is at once conceptually correct and in control of reality.
All this gives a philosophical explanation of the technique of memorizing that had dominated Bruno’s reputation throughout his career, from his early lectures up to his invitation to Venice. Memory is the constructive power of the soul inherited from the life-giving principle of the world, and thus it orders, understands, and senses reality. Artifice and nature coincide in their mutual principles; the internal forces and the external perception correlate. Traditional mnemotechnic constructed concentric wheels with letters of the alphabet, which represented images and contents; such wheels provided means to memorize contents quasi-mechanically. Bruno endorses this technique with the philosophical argument that it applies the universal harmony in diversity that structures the world and recreates its intelligibility. The psychological experience of memorizing content consists of concepts, triggers, passive and active evocation of images, and ways of judgment. This art of memory claims to conform with both metaphysical truth and the creative power of the mind. At the same time, the concentric circles carry tokens that actualize the conversion of anything intended into anything else, because, as stated, imagination is not plainly abstract but vivid and alive. Creating schemata (‘figuring out’) for memorization is an art and yet not artificial in the negative sense; it is the practical execution or performance of reality. Here is an example from De umbris idearum: suppose we need to memorize the word numeratore (numberer); we take from the concentric wheels the pairs NU ME RA TO RE. Each pair is represented by an image, and together they form a sentence: NU=bee, ME=on a carpet, RA=miserable, and so on. This produces a memorizable statement describing an image: ‘A bee weaves a carpet, dressed in rags, feet in chains; in the background a woman holding out her hands, astride a hydra with many heads.’ (Bruno 1991, LXV). This appears far-fetched, but less so if we consider that, with any single change of one of the pairs, we arrive at a different statement supported by the same kind of imagined picture. Such a picture, though artificial, allows for a smooth transition from concept to concept; and such concepts capture the complexity of reality.
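To make the quasi-mechanical character of the wheel concrete, here is a minimal sketch in Python of the lookup-and-compose procedure described above. The syllable-to-image table is a hypothetical stand-in, not Bruno’s actual wheel; only the splitting of a word into two-letter pairs and the composition of the resulting fragments into one scene follow the example from De umbris idearum.

```python
# A toy model of Bruno's combinatorial memory wheel: syllable pairs are
# looked up on a wheel and rendered as fragments of one memorable scene.
# The image assignments below are illustrative placeholders, not Bruno's tables.
wheel = {
    "NU": "a bee",
    "ME": "weaving a carpet",
    "RA": "dressed in rags",
    "TO": "feet in chains",
    "RE": "a woman astride a many-headed hydra",
}

def encode(word: str) -> str:
    """Split a word into two-letter pairs and compose the mnemonic scene."""
    pairs = [word[i:i + 2].upper() for i in range(0, len(word), 2)]
    return "; ".join(wheel[p] for p in pairs if p in wheel)

print(encode("numeratore"))
# -> a bee; weaving a carpet; dressed in rags; feet in chains;
#    a woman astride a many-headed hydra
```

Changing any single pair swaps exactly one fragment while the rest of the scene stays fixed, which is the property the text points to: one articulated picture can support a whole family of memorized words.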
Without making any claims in this direction, Bruno practices what today is called semiotics, namely, the discipline that approaches the understanding of reality from the perspective of designating and intellectually processing it. That is clear from the title of his last book on memory, De imaginum, signorum et idearum compositione (The composition of images, signs, and ideas). Although this discipline deals with signification and its methods, it still relies on depicting reality as though it were ‘digesting’ it. A key concept in Bruno’s epistemology and metaphysics of memory is conversion (convertere, conversio). The purposefully arranged images that support remembrance are effective because one sign must be converted into a referent, and images convert according to schemata into new images or signs and representations. This is exercised in all of Bruno’s mnemotechnic works. Such transformations might appear arbitrary but are based on the constant transformation of psychic states and physical reality and on the intellectual activity of turning attention to an object. Love is an example of this conversion: drawing on Marsilio Ficino’s theory of love, Bruno claims that love not only dominates the relations between humans, and between God and humans, but is also the force that organizes the living and non-living world. This is only possible because love means that any of these can take on the essence of any other, so that the loving and the beloved convert mutually into each other (Sigillus Sigillorum n. 158; Bruno 2009, 2:256, 472). Bruno’s interest in the art of memory was fueled by his Platonizing metaphysics, which seeks the convergence of the universe in one principle and the convertibility of truth into understanding. For this purpose, he also invoked the (late ancient) tradition of Hermeticism, as visible in Hermes as the messenger of the philosophical principles and in making the sorceress Circe the speaker of a treatise on memory. During his time in Germany, Bruno produced texts and excerpts on magic, on spellbinding (De vinculis), on universal principles, on mathematical and medical applications of Lullism, and on the cosmology of the soul (Lampas triginta statuarum). All of them have the idea of transformation at their basis. Before modern empiricism and science based on mathematical calculus and projections, natural magic was a reputable discipline, which investigated the invisible ‘occult’ forces that drive events in nature, including spiritual powers. Bruno contributed to magical theories by raising the question of how those forces are related to empirical and metaphysical certainties. In his notes on magic, Bruno likens the interconnectedness of things to the ‘conversation’ in human languages by way of translation as transformation: in an analogous way, magical signs and tokens are hints that make understanding divinity possible (De magia, Bruno 2000a, 192–94 [1962 vol. III, 412]).
Bruno became best known for his Copernicanism and his end as a heretic. However, Bruno’s epistemology of memory, his cosmology, and his interest in magic all converge on the project of a universal theory of everything that, by definition, posits the unity of thought, existence, and objectivity. This can be seen in two literary descriptions of subjectivity and objectivity. In the Heroic Frenzies, Bruno narrates the myth of Actaeon, who, in search of the goddess Diana, is turned into a deer and eaten by his own dogs. The dogs allegorize the human intellect and the convergence of knowledge and object, a dissolution in which knowledge achieves its end. The other example can be found in Bruno’s De immenso: Bruno remembers his childhood experience that, from Mount Cicala near his home town, Vesuvius looked menacing, but from Vesuvius the familiar mountain looked alien. From this, he concluded that the center of the world is wherever one stands and that the world has no physical boundaries (Bruno 1962, vol. I 1, 313-317). His cosmology is based on an epistemology that aims at overcoming the divide between theory, practice, and objective truth.
b. Physics and Cosmology
Bruno’s fame as a heretic was closely linked to his opposition to Aristotelian physics and his endorsement of Copernicanism, which was particularly pronounced in three of the Italian dialogues and in the Frankfurt trilogy. Nicolaus Copernicus had introduced a planetary system in which the earth orbits the sun, as opposed to the astronomy of Ptolemy, which explained the movement of the planets with circles and epicycles around the earth; Bruno discusses the question of whether this is only a mathematical model or reality. He points out that both models are mutually convertible, with the sun taking the place of the earth. However, what is preferable is not what is easier to calculate, more plausible, or more traditional (indeed, all of this holds for the Copernican model, including its reliance on ancient philosophy) but what is compelling on all fronts. If it is true that the earth moves, it must be so because it is “possible, reasonable, true, and necessary” (Ash Wednesday Supper III, Bruno 2018, 153; 1958, 131). The emphasis lies on the combination of possibility and truth; for it does not suffice that a theory be plausible and, in a way, true; it also has to be necessary, so that what is possible is also real. If the planetary orbs are not mere hypothetical objects of mathematical calculation, philosophy has to accept them as real movements and explain how this motion comes about. Whatever changes is driven by an effective principle, which is necessarily internal; and that maxim applies to the stars as well as to magnetism and animal sex (Ash Wednesday Supper III, Bruno 2018, 123; 1958, 109). Copernicanism, for Bruno, is not only one segment of knowledge about reality; it is symptomatic of how human understanding of the powers of the world works. Astronomy exemplifies that mathematics is more than calculus; it is the real structure of the world (as the Pythagoreans had taught), and in being intellectual it has to be a reality that transcends the material and is inherent in everything.
In this context, Bruno likens the world to a machine and to an animal: as any living being is composed of distinct parts, which would not exist without the whole, so is the world one diverse organism composed of distinct parts. When he returns to Copernicus and discusses astronomy in great detail and with some modifications in his De immenso, Bruno reiterates that the universe is one, to the effect that there cannot be any first mover (beyond the ultimate sphere of Aristotelian astronomy); rather, the earth and everything else is animated, with the soul as the center of every part (De immenso III 6, Bruno 1962, vol. I 1, p. 365). In Aristotelian natural philosophy, the soul was the incorporeal principle of movement of living bodies. Bruno transfers and applies this notion to the universe. Hence it follows for him that there is a world soul, that the heavenly spheres are animated, that all planets are alike (that is, the earth is as much a planet as any other), that the number of spheres and suns is innumerable or even infinite, and that nature and God are identical insofar as God is omnipresent. This is the reason why Bruno famously extended the Copernican ‘closed world’ to an open and infinite universe. Copernicus had made the sun the center of the universe and assigned the earth and the planets their orbits accordingly, but he did not expressly deny the Aristotelian theory that the world is finite in extension. Bruno went a step further and inferred from the equivalence of all planets and the infinite power of God that the universe must be infinite. God, causation, principles, elements, active and passive potencies, matter, substance, form, and so on are all parts of the One and distinguished only by logical conceptualization, as is inevitable in human discourse (De immenso VIII 10, Bruno 1962, vol. I 2, p. 312). These cosmological ideas have led later readers to the interpretation that Bruno was a pantheist, identifying God and nature, both being the whole of the universe; they could also speak of atheism if he meant to say that God is nothing but a name for natural mechanisms. The terms ‘atheism’ and ‘pantheism’ were coined later, but as a matter of fact, these possible interpretations dominated the reception of Bruno from the mid-18th century onward in relation to Baruch Spinoza, while others insisted that God’s absolute immanence admits of some sort of transcendence and distinction from the finite world (see section 3 below).
c. Metaphysics and Mathematics
To consolidate this novel approach to traditional themes, Bruno had to rearrange philosophical terminology and concepts. In his De la causa, he addressed the scholastic philosophy of cause and principle, matter and form, substance and accident, and also one and many. In Aristotelian causality, finality was the dominating force, and, in Christian thought, it had been identified with God who governs the world. Bruno correlated universal finality with the internal living power and controlling reason in all things. Accordingly, if God is usually understood as beyond the world but is now identified as the internal principle, the distinction between internal and external causation vanishes. Bruno uncovers the conceptual problems of Aristotelian causality, which includes matter and form as two of the principles: if they are only descriptors of things, they are not real; but if they are supposed to be real, they must match one another to the extent that there is no matter without form, no form without matter, and both are co-extensive. Prime matter in school philosophy is either nothing (prope nihil, for lack of form) or everything, the receptacle of all forms. What logically must be kept distinct, such as form and matter or the whole and its parts, is metaphysically one and also as infinite as all potentialities. Bruno closes his dialogue on Cause, Principle, and the One with an encomium of the One. Being, act, potency, maximum, minimum, matter and body, form and soul – all are one, which harkens back to Neoplatonist themes. However, in the fifth dialogue, Bruno challenges this praise of unity by raising the question of how it is at all possible to have individual or particular items under the canopy of oneness. He pronounces the adage “It is profound magic to draw contraries from the point of union”; in other words: how is plurality at all possible if all is one?
In his Frankfurt trilogy, Bruno unfolds the interconnection of nature, understanding, metaphysics, and mathematics. In his dedication to Duke Henry Julius, Bruno announces its contents: De minimo explains the principles of understanding as the foundational project while relying on sensation; it belongs to mathematics and deals with minimum and maximum in geometrical figurations. De monade, numero, et figura traces imagination and experience at the basis of research, which conforms to language and is quasi-divine; its theme is the monad as the substance of things and the basis of unity, diversity, and relationality. De immenso, innumerabilibus et infigurabili shows the order of the worlds with proofs that nature is in fact visible, changing, and composed of elements, heaven, and earth, and yet an infinite whole (Bruno 1962, vol. I 1, 193-199; 2000b, 231–39). The theory of monads encompasses three elements: the geometrical properties of points, the physical reality of minimal bodies (atoms), and the cognitive method of starting with the ultimately simple when understanding complexity. In this function, the monad is the link between the intellectual and the physical realms and provides the metaphysical foundation of natural and geometrical investigations. Thus the monad is what makes everything particular as something singular; at the same time it constitutes the wholeness of the universe, which is made up of infinitely many singular things and is necessarily infinite. With his monadology, and the research into the minimal constituents of thought and being, Bruno revived ancient atomism, as adopted from Lucretius. There is neither birth nor death in the world; nothing is truly new, and nothing can be annihilated, since all change is but a reconfiguration of minimal parts, monads, or atoms (depending on the context of the discourse). The concept of mathematical, geometrical, and physical atoms is the methodical channel to relate things and ideas with one another and to explain the existence of distinctions out of the One, thus turning geometrical and physical theories into metaphysics.
Mathematics, for Bruno, is first of all geometry, because arithmetic performs quantitative calculations on measurements of figures that are defined geometrically. Therefore, in his Articuli adversus mathematicos of 1588, he establishes the methodical and ontological chain of concepts that leads from the mind – via ideas, order, definitions, and more – down to geometrical figures, the minimum, and the relation of larger and smaller. Bruno’s geometry, inspired by Euclid and Nicholas of Cusa, becomes the paradigmatic discipline that encompasses the intelligibility and reality of the world. Precisely for the sake of intelligibility, Bruno does admit infinitude in the extension of the world, but not in the particulars: the infinitely small would be unintelligible, and therefore there is a minimum or atom that terminates the process of ever smaller division. Not only is the earth one star among many; the sphere in a geometrical sense becomes the infinite as such if taken to be universal, ‘universal’ now meaning ubiquitous, logically referring to everything, and physically encompassing the universe. Since such infinity escapes human measurement, if not intelligence, any quantity has to originate from the minimal size, which is not just below sensibility; rather, it is the foundational quality that constitutes any finite measurement. Without the minimum, there is no quantity, by analogy to the thought that unity is the substance of number and the essence of anything existing. It is the minimum that constitutes and drives everything.
One consequence of this line of thinking is Bruno’s understanding of geometry in the technical sense. In his De minimo and also in his Padua lectures of 1591, he explains that all geometrical figures, from lines to planes and bodies, are composed of minima. A line, then, is the continuation of a point, which is defined as first part or as end, and the line is the fringe of a plane. This entails that angles are also composed of minimal points, which then build up the diverging lines. Bruno suggests angles are generated by gnomons (figures that, when added to a given figure, yield a similar figure), namely minimal circles that surround a minimal circle; the gnomons thus create a new circle, and through tangents at the points of contact they create lines that spread out. Invoking the ancient atomist Democritus, Bruno created an atomistic geometry and claimed to have found a mathematical science that is not merely arithmetic but conforms with the essence of the world (Bruno 1962, vol. I 3, 284 and 183-186; 1964).
Bruno’s scholarly language abounds with analogies, parallels, repetitions, metaphors, and specifically with serial elaborations of properties; after all, the mnemotechnic and Lullist systems were permutations of terms, too. Bruno also never misses a chance to play with words. For instance, he spells the Italian word for reason as ‘raggione’, with the letter g duplicated: in this spelling, reason and ray (ragione and raggio) become associated and suggest that thought is a searchlight (notice the recurrence of the term lampas – torchlight) into reality, illuminated by truth. This poetic and ingenious handling of language parallels his understanding of the geometry that makes up reality: visible reality is structured by measurable, construable, and retraceable proportions and is intelligible only with recourse to the perfection of geometrical relations. Thus, all understanding consists in the appropriation of the unfolding of the minimum, which is a creative process that reconstructs what there is. This also applies to human life. In his Heroic Frenzies, he uses the motto Amor instat ut instans (Love is an insistent instant): the momentary experience of love lasts and keeps moving; it is like the minimal point in time that lasts forever and keeps pushing (Bruno 1958, 1066–69).
d. Religion and Politics
The question Bruno had to face throughout his life, until his condemnation as a heretic, was in what sense he agreed with basic tenets of Christianity. He advocated some kind of natural theology, that is, an understanding of God that is revealed not only in sacred texts but first and foremost in what theology calls creation and philosophy calls the contingent and finite world. Reality and human understanding contain clues to the ultimate meaning and demand means of explanation (such as the absolute, infinity, minimum, creativity, power). Revelation, such as that of the Bible, competes with literary and mythological sources and with the hidden powers investigated in magic (in that regard, Bruno is an heir to Renaissance humanism). Much of this can be found in his depositions during the Inquisition trials. His major work on the question of religion was the Expulsion of the Triumphant Beast, a staged conversation among the Greek gods about virtues and religion. Bruno sets religion into a historical and relativist context: Christianity and the Egyptian, Greek, and Jewish traditions have in common that they try to represent the divine in the world and to edify the souls of humans. Thus, religious symbolism works like mnemotechnic imagery, which may and must be permutated according to the circumstances of the community. The altar (i.e., the place of the eucharistic mystery) represents the core of religious meaning and is, in Bruno’s age of Protestantism and Counter-Reformation, the relic (as he terms it) of the “sunken ship of religion and divine cult” (Bruno 1958, 602). Since “nature is God in things” (Bruno 1958, 776), the divine is as accessible through nature as the natural is through the divine. Consequently, worship means the continuous reformation of the individual towards a better understanding of the world and the self. Established religious cults have two valid purposes: inquiry into the ultimate principle of reality and communal order for all members of society who lack the means of true philosophy. Inspired by the debates among humanists and reformers (for instance, Thomas More, Erasmus of Rotterdam, Martin Luther, John Calvin, Juan de Valdés), Bruno inaugurated two branches of religion, namely, natural religion and political theology. It was probably the discovery of this dialogue of the Triumphant Beast by the inquisitors in 1599 that sealed Bruno’s fate as a heretic (Firpo 1993, 100, 341).
3. Reception
Bruno’s afterlife as a character and a philosopher (Canone 1998; Ricci 1990; 2007) shows a great variety of interpretations of his philosophy, not all of them mutually exclusive. During Bruno’s trial in Rome, the recent Catholic convert Kaspar Schoppe wrote an eyewitness report of the proceedings, which was first published in print by the Hungarian Calvinist Peter Alvinczi to vilify the Catholics (Alvinczi 1621, 30–35; Schoppe 1998). With his fate, Bruno began to symbolize the struggle between religious persecution and freedom of thought, thus inaugurating philosophical modernity. Bruno was heralded as the individualist “knight errant of philosophy” (Bayle 1697, 679) and associated with Baruch Spinoza’s alleged atheism, according to which all things are attributes of God. Already in 1624, Marin Mersenne included Bruno in his polemic against deists, libertines, and atheists. He reported that Bruno taught that circle and straight line, point and line, surface and body are all the same thing; advocated the transmigration of souls; denied God’s freedom for the sake of infinite worlds, while also holding that God could have made a different world; and reduced miracles to natural data (Mersenne 1624, pts. I, 229–235; II, passim). To this Charles Sorel responded by emphasizing the literary style of Bruno’s Latin poems. Both interpretations inspired an anonymous treatise with the title J. Brunus redivivus (Bruno revived, 1771) that discussed whether Bruno’s cosmology was atheistic (Sorel 1655, 238–43; Del Prete 1995; Schröder 2012, 500–504). The historian Jacob Brucker (Brucker 1744, IV 2:12–62) presented Bruno as the first thinker to promote modern eclecticism while also pointing to the Neoplatonic elements of his thought. Around 1700, John Toland identified the Spaccio as a manifesto against the Church and published Schoppe’s report and an English translation of the introductory letter to De l’infinito; in correspondence with Gottfried Wilhelm Leibniz, he branded Bruno a pantheist (Begley 2014). A debate in German Protestant circles in the late 18th century on Spinozism as the epitome of atheism prompted Friedrich Heinrich Jacobi to publish letters to Moses Mendelssohn, to which, in 1789, he added excerpts from Bruno’s De la causa with the intent of proving thereby that Spinoza was an atheist. These excerpts were actually free paraphrases of dialogues II to V that culminated in the praise of the infinite essence that is cause and principle, one and all. As a consequence of the fascination with Bruno, Nicholas of Cusa, who was among his philosophical sources, was rediscovered. Friedrich Schelling, contrary to Jacobi’s intentions, incorporated Bruno into the program of German Idealism in a book of 1802 with the title Bruno, presenting Bruno’s purported pantheism as a step towards the idealist philosophy of unity and identity. Using excerpts from Jacobi’s paraphrase, he termed the above-quoted adage about “drawing contraries from the point of union” the symbol of true philosophy. From there Georg Wilhelm Friedrich Hegel discovered Bruno’s Lullism and art of memory as a “system of universal determinations of thought”, or the coincidence of nature and creative mind (Hegel 1896, II 3 B 3, p. 130), which prompted the Catholic philosopher Franz Jakob Clemens to present Bruno as a precursor of Hegel.
In England, Bruno’s brief sojourn left an impression that was less academically philosophical and more literary and legendary, which fits his disastrous appearance in Oxford. There is an ongoing debate about the extent to which he might have influenced Christopher Marlowe and William Shakespeare. Henry Percy, Earl of Northumberland, collected Bruno’s works and wrote essays that appear to be inspired by Bruno’s literary style and themes, while members of his circle appreciated Bruno in the context of the rise of modern science. It was the philosopher-poet Samuel Taylor Coleridge who later, in the wake of the Spinoza debate, brought Bruno back from Germany to England. Rita Sturlese showed that the reception of Bruno can be retraced through the fate of single copies of his early editions (Sturlese 1987). The 19th century inaugurated philological and historical research on Bruno and his Renaissance context. Italian scholars and intellectuals made Bruno a hero of national pride and anticlericalism (Rubini 2014; Samonà 2009): Vincenzo Gioberti and Bertrando Spaventa claimed that German idealism, from Spinoza to Hegel, had its antecedent in Bruno, the Italian renegade (Molineri 1889; Spaventa 1867, 1:137–267). The edition of his Latin works was started in 1879 as a national project, and a monument was erected on Campo de’ Fiori in Rome. In the second half of the 20th century, Bruno was perceived both as the turning point into modernity and as the heir of ancient occultism (Blumenberg 1983 [originally 1966], pt. 4; Yates 2010 [originally 1964]).
4. References and Further Reading
a. Bruno Editions
Bruno, Giordano. 1950. “On the Infinite Universe and Worlds [De l’infinito Universo et Mondi, English].” In Giordano Bruno, His Life and Thought, by Dorothea Waley Singer, 225–380. New York: Schuman.
Bruno, Giordano. 1957. Due dialoghi sconosciuti e due dialoghi noti: Idiota triumphans – De somnii interpretatione – Mordentius – De Mordentii Circino. Edited by Giovanni Aquilecchia. Roma: Ed. di Storia e Letteratura.
Bruno, Giordano. 1958. Dialoghi italiani. Edited by Giovanni Gentile and Giovanni Aquilecchia. Firenze: Sansoni.
Bruno, Giordano. 1962. Jordani Bruni Nolani opera latine conscripta. Edited by Francesco Fiorentino and Felice Tocco. [Napoli / Firenze, 1879-1891]. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Bruno, Giordano. 1964. Praelectiones geometricae, e Ars deformationum. Edited by Giovanni Aquilecchia. Roma: Edizioni di Storia e letteratura.
Bruno, Giordano. 1991. De umbris idearum. Edited by Maria Rita Pagnoni-Sturlese. Firenze: Olschki.
Bruno, Giordano. 1993-2003. Oeuvres complètes = Opere complete. Edited by Giovanni Aquilecchia. 7 vols. Paris: Les Belles Lettres.
Bruno, Giordano. 1998. Cause, Principle, and Unity and Essays on Magic. Translated by Robert De Lucca and Richard J. Blackwell. Cambridge, U.K.: Cambridge University Press.
Bruno, Giordano. 2000a. Opere magiche. Edited by Simonetta Bassi, Elisabetta Scapparone, and Nicoletta Tirinnanzi. Milano: Adelphi.
Bruno, Giordano. 2000b. Poemi filosofici latini. Ristampa anastatica delle cinquecentine. Edited by Eugenio Canone. La Spezia: Agorà.
Bruno, Giordano. 2001. Corpus iconographicum: le incisioni nelle opere a stampa. Edited by Mino Gabriele. Milano: Adelphi.
Bruno, Giordano. 2002. The Cabala of Pegasus. Translated by Sidney L. Sondergard and Madison U. Sowell. New Haven: Yale University Press.
Bruno, Giordano. 2004. Opere mnemotecniche. Edited by Marco Matteoli, Rita Sturlese, and Nicoletta Tirinnanzi. Vol. 1. 2 vols. Milano: Adelphi.
Bruno, Giordano. 2007-2019. Werke [Italienisch – Deutsch]. Edited by Thomas Leinkauf. 7 vols. Hamburg: Meiner.
Bruno, Giordano. 2009. Opere mnemotecniche. Edited by Marco Matteoli, Rita Sturlese, and Nicoletta Tirinnanzi. Vol. 2. 2 vols. Milano: Adelphi.
Bruno, Giordano. 2012. Opere lulliane. Edited by Marco Matteoli. Milano: Adelphi.
Bruno, Giordano. 2013. On the Heroic Frenzies. Degli Eroici Furori. Translated by Ingrid D. Rowland. Toronto: University of Toronto Press.
Bruno, Giordano. 2018. The Ash Wednesday Supper. Translated by Hilary Gatti. (Italian/English). Toronto: University of Toronto Press.
b. Other Primary Sources
[Alvinczi, Péter]. 1621. Machiavellizatio, qua unitorum animos Iesuaster quidam dissociare nititur. Saragossa [Kassa]: Ibarra.
Bayle, Pierre. 1697. Dictionaire historique et critique: par Monsieur Bayle. Tome premier premiere partie. A-B. Rotterdam: Leers.
Brucker, Jakob. 1744. Historia critica philosophiae: A tempore resuscitatarum in occidente litterarum ad nostra tempora. Vol. IV 2. Leipzig: Breitkopf.
Clemens, Franz Jakob. 2000. Giordano Bruno und Nicolaus von Cusa. Eine philosophische Abhandlung [1847]. Edited by Paul Richard Blum. Bristol: Thoemmes Press.
Firpo, Luigi. 1993. Il processo di Giordano Bruno. Edited by Diego Quaglioni. Roma: Salerno.
Hegel, Georg Wilhelm Friedrich. 1896. Lectures on the History of Philosophy: Medieval and Modern Philosophy. Translated by E. S. Haldane and Frances H. Simson. Vol. 3. London: Paul, Trench, Trübner.
J. Brunus redivivus, ou traité des erreurs populaires, ouvrage critique, historique & philosophique, imité de Pomponace: Premiere partie. 1771.
Jacobi, Friedrich Heinrich. 2000. Über die Lehre des Spinoza in Briefen an den Herrn Moses Mendelssohn. Edited by Irmgard Maria Piske and Marion Lauschke. Hamburg: Meiner.
Mercati, Angelo, ed. 1988. Il sommario del processo di Giordano Bruno; con appendice di documenti sull’ eresia e l’Inquisizione a Modena nel secolo XVI. Reprint of the 1942 edition. Modena: Dini.
Mersenne, Marin. 1624. L’ impiété des deistes, et des plus subtils libertins découverte, et refutee par raisons de theologie, et de philosophie. Paris: Billaine.
Molineri, G. C., ed. 1889. Vincenzo Gioberti e Giordano Bruno. Due lettere inedite. Torino: Roux.
Schelling, Friedrich Wilhelm Joseph. 1984. Bruno, or, On the Natural and the Divine Principle of Things. Translated by Michael G. Vater. [Originally 1802]. Albany: State University of New York Press.
Schelling, Friedrich Wilhelm Joseph. 2018. Bruno oder über das göttliche und natürliche Princip der Dinge: Ein Gespräch. [2nd. ed., Berlin: Reimer, 1842]. Berlin: De Gruyter.
Schoppe, Kaspar. 1998. “Brief an Konrad Rittershausen.” Edited by Frank-Rutger Hausmann. Zeitsprünge. Forschungen zur frühen Neuzeit 2 (3-4: Kaspar Schoppe): 459–64.
Sorel, Charles. 1655. De la perfection de l’homme. Paris: de Nain.
Spaventa, Bertrando. 1867. Saggi di critica filosofica, politica e religiosa. Vol. 1. Napoli: Ghio.
Toland, John. 1726. A Collection of Several Pieces of Mr. John Toland: Now Just Published from His Original Manuscripts, with Some Memoirs of His Life and Writings. 2 vols. London: Peele.
c. Secondary Sources
Begley, Bartholomew. 2014. “John Toland’s On the Manner, Place and Time of the Death of Giordano Bruno of Nola.” Journal of Early Modern Studies, Bucharest 3 (2): 103–15.
Blum, Elisabeth. 2018. Perspectives on Giordano Bruno. Nordhausen: Bautz.
On religion, politics and language.
Blum, Paul Richard. 2012. Giordano Bruno: An Introduction. Translated by Peter Henneveld. Amsterdam: Rodopi.
Blum, Paul Richard. 2016. Giordano Bruno Teaches Aristotle. Translated by Peter Henneveld. Nordhausen: Bautz.
The specific reception of Aristotle’s philosophy.
Blumenberg, Hans. 1983. The Legitimacy of the Modern Age. Translated by Robert M. Wallace. [Originally 1966]. Cambridge, Mass.: MIT Press.
Cusanus and Bruno as landmarks of modernity.
Bönker-Vallon, Angelika. 1995. Metaphysik und Mathematik bei Giordano Bruno. Berlin: Akademie Verlag.
The importance of mathematics.
Bruniana & Campanelliana (Journal since 1995)
Canone, Eugenio, ed. 1992. Giordano Bruno: Gli anni napoletani e la “peregrinatio” europea: immagini, testi, documenti. Cassino: Università degli studi. [Excerpts at “Archivio Giordano Bruno – Studi e Materiali”].
An edition of biographic documents.
Canone, Eugenio, ed. 1998. Brunus Redivivus: Momenti della fortuna di Giordano Bruno nel XIX secolo. Pisa: IEPI.
Canone, Eugenio, and Germana Ernst, eds. 2006. Enciclopedia bruniana e campanelliana. 3 vols. Pisa – Roma: IEPI – Serra.
A dictionary of concepts.
Carannante, Salvatore. 2018. Unigenita Natura. Roma: Edizioni di Storia e Letteratura.
Nature and God in Bruno.
Catana, Leo. 2008. The Historiographical Concept “System of Philosophy”: Its Origin, Nature, Influence and Legitimacy. Leiden: Brill.
Catana, Leo. 2017. The Concept of Contraction in Giordano Bruno’s Philosophy. London: Routledge.
Ciliberto, Michele. 2002. L’occhio di Atteone: Nuovi studi su Giordano Bruno. Roma: Edizioni di Storia e Letteratura.
Epistemological questions in Bruno.
De Bernart, Luciana. 2002. Numerus quodammodo infinitus: Per un approccio storico-teorico al dilemma matematico nella filosofia di Giordano Bruno. Roma: Edizioni di storia e letteratura.
The role of mathematics.
Del Prete, Antonella. 1995. “L’univers infini: les interventions de Marin Mersenne et de Charles Sorel.” Revue Philosophique de la France et de l’Étranger 185 (2): 145–64.
Bruno’s reception in France.
Eusterschulte, Anne, and Henning S. Hufnagel. 2012. Turning Traditions Upside Down: Rethinking Giordano Bruno’s Enlightenment. Budapest: Central European University Press.
Faracovi, Ornella Pompeo, ed. 2012. Aspetti della geometria nell’opera di Giordano Bruno. Lugano: Agorà.
Geometry in Bruno’s philosophy.
Gatti, Hilary. 1999. Giordano Bruno and Renaissance Science. Ithaca: Cornell University Press.
Gatti, Hilary, ed. 2002. Giordano Bruno: Philosopher of the Renaissance. Aldershot: Ashgate.
Gatti, Hilary. 2011. Essays on Giordano Bruno. Princeton, N.J.: Princeton University Press.
Gatti, Hilary. 2013. The Renaissance Drama of Knowledge: Giordano Bruno in England. London: Routledge.
Granada, Miguel Ángel, and Dario Tessicini, eds. 2020. Giordano Bruno, De immenso. Letture critiche. Pisa – Roma: Serra.
Detailed interpretations of the single books of De immenso.
Granada, Miguel Angel, Patrick J. Boner, and Dario Tessicini, eds. 2016. Unifying Heaven and Earth: Essays in the History of Early Modern Cosmology. Barcelona: Edicions de la Universitat de Barcelona.
Kodera, Sergius. 2020. “The Mastermind and the Fool. Self-Representation and the Shadowy Worlds of Truth in Giordano Bruno’s Candelaio (1582).” Aither – Journal for the Study of Greek and Latin Philosophical Traditions 23 (7): 86–111.
Matteoli, Marco. 2019. Nel tempio di Mnemosine: L’arte della memoria di Giordano Bruno. Pisa: Edizioni della Normale.
The study of memory in Bruno.
Mendoza, Ramon G. 1995. The Acentric Labyrinth: Giordano Bruno’s Prelude to Contemporary Cosmology. Shaftesbury: Element Books.
Mertens, Manuel. 2018. Magic and Memory in Giordano Bruno. Leiden: Brill.
Omodeo, Pietro Daniel. 2011. “Helmstedt 1589: Wer exkommunizierte Giordano Bruno?” Zeitschrift für Ideengeschichte 5 (3): 103–14.
On the excommunication of Bruno by Lutherans.
Omodeo, Pietro Daniel. 2014. Copernicus in the Cultural Debates of the Renaissance: Reception, Legacy, Transformation. Leiden: Brill.
Ordine, Nuccio. 1996. Giordano Bruno and the Philosophy of the Ass. New Haven: Yale University Press.
On the metaphor of ‘asininity’ and its history.
Ricci, Saverio. 1990. La fortuna del pensiero di Giordano Bruno, 1600-1750. Firenze: Le Lettere.
Bruno’s reception in the 17th and 18th centuries.
Ricci, Saverio. 2007. Giordano Bruno nell’Europa del Cinquecento. Milano: Il Giornale.
Historic contexts of Bruno’s activities.
Rowland, Ingrid D. 2008. Giordano Bruno: Philosopher/Heretic. New York: Farrar, Straus and Giroux.
Rubini, Rocco. 2014. The Other Renaissance: Italian Humanism between Hegel and Heidegger. Chicago: University of Chicago Press.
Saiber, Arielle. 2005. Giordano Bruno and the Geometry of Language. Aldershot: Ashgate.
Language and mathematics.
Samonà, Alberto, ed. 2009. Giordano Bruno nella cultura mediterranea e siciliana dal ’600 al nostro tempo: Atti della Giornata nazionale di studi, Villa Zito, Palermo, 1 marzo 2008. Palermo: Officina di Studi Medievali.
Schröder, Winfried. 2012. Ursprünge des Atheismus: Untersuchungen zur Metaphysik- und Religionskritik des 17. und 18. Jahrhunderts. Stuttgart: Frommann-Holzboog.
A history of atheism from 1600 through 1900.
Spampanato, Vincenzo. 2000. Vita di Giordano Bruno con documenti editi ed inediti. Edited by Nuccio Ordine. [First ed. 1921]. Paris/Torino: Les Belles Lettres, Aragno.
Most complete biography of Bruno.
Sturlese, Rita. 1987. Bibliografia censimento e storia delle antiche stampe di Giordano Bruno. Firenze: Olschki.
Bibliography of every single copy of the first prints of Bruno’s works.
Tessicini, Dario. 2007. I dintorni dell’infinito: Giordano Bruno e l’astronomia del Cinquecento. Pisa: Serra.
Bruno’s Copernicanism in 16th-century context.
Traversino, Massimiliano. 2015. Diritto e teologia alle soglie dell’età moderna: Il problema della potentia Dei absoluta in Giordano Bruno. Napoli: Editoriale scientifica.
Bruno in the context of juridical and theological debates.
Yates, Frances. 2010. Giordano Bruno and the Hermetic Tradition. [First ed. 1964]. London: Routledge.
Bruno’s reading of occultist sources related to Pseudo-Hermes Trismegistus.
d. Online Resources
“Archivio Giordano Bruno – Studi e Materiali.” http://www.iliesi.cnr.it/AGB/.
“Bibliotheca Bruniana Electronica: The Complete Works of Giordano Bruno.” The Warburg Institute. https://warburg.sas.ac.uk/research/research-projects/giordano-bruno/download-page.
“Enciclopedia Bruniana e Campanelliana.” http://www.iliesi.cnr.it/EBC/entrate.php?en=EB.
“La biblioteca ideale di Giordano Bruno. L’opera e le fonti.” http://bibliotecaideale.filosofia.sns.it.
Author Information
Paul Richard Blum
Email: prblum@loyola.edu
Palacký University Olomouc
Czech Republic
and
Loyola University Maryland
U.S.A.
The Bhagavad Gītā
The Bhagavad Gītā occurs at the start of the sixth book of the Mahābhārata—one of South Asia’s two main epics, composed at the start of the Common Era (C.E.). It is a dialogue on moral philosophy. The lead characters are the warrior Arjuna and his royal cousin, Kṛṣṇa, who has offered to be his charioteer and who is also an avatāra of the god Viṣṇu. The dialogue amounts to a lecture by Kṛṣṇa, delivered on their chariot, in response to the fratricidal war that Arjuna is facing. The symbolism employed in the dialogue—a lecture delivered on a chariot—ties the Gītā to developments in moral theory in the Upaniṣads. The work begins with Arjuna articulating three objections to fighting the impending battle, drawing on two teleological theories of ethics, Virtue Ethics and Consequentialism, but also on Deontology. In response, Kṛṣṇa motivates Arjuna to engage in battle by arguments from procedural ethical theories—specifically his own form of Deontology, which he calls karma yoga, and a radically procedural theory unique to the Indian tradition, Yoga, which he calls bhakti yoga. This is supported by a theoretical and metaethical framework called jñānayoga. While originally part of a work of literature, the Bhagavad Gītā was influential among medieval Vedānta philosophers. Since the formation of a Hindu identity under British colonialism, the Bhagavad Gītā has increasingly been seen as a separate, stand-alone religious book, which some Hindus treat as their analog to the Christian Bible for ritual, oath-swearing, and religious purposes. The focus of this article is historical and pre-colonial.
1. Introduction
The Bhagavad Gītā (Gītā) occurs at the start of the sixth book of the Mahābhārata—one of South Asia’s two main epics. Like the Rāmāyaṇa, it depicts the god Viṣṇu in avatāra form. In the Rāmāyaṇa, he was Rāma; in the Mahābhārata he is Kṛṣṇa. This time, Viṣṇu is not the protagonist of the whole epic, but unlike in the Rāmāyaṇa, here he shows awareness of his own identity as Īśvara or Bhagavān: Sovereignty. While moral theory is a topic of discussion in both epics, the Bhagavad Gītā is a protracted discourse and dialogue on moral philosophy. The text itself, as an excerpt from an epic, was received variously in South Asian traditions. For philosophers who grounded their theorizing on the latter part of the Vedas—a position known as Vedānta—the Bhagavad Gītā, though a smṛti (a historical document) and not a śruti (a revealed text like the Vedas, or scripture), nevertheless plays a prominent role as a source of argument and theory. The major Vedānta philosophers, Śaṅkara, Rāmānuja, and Madhva, all wrote commentaries on the Gītā. Importantly, the Bhagavad Gītā is very much part of South Asia’s history of popular philosophy explored in literature, which, unlike the Vedas, was widely accessible. It informs South Asian understandings of Kṛṣṇa, the warrior philosopher, who is a prominent incarnation of Viṣṇu. What is unique about this exploration of philosophy is that it happens on a battlefield, prior to a fratricidal war, and it addresses the question of how we can and should make tough decisions as the infrastructure of conventions falls apart.
2. The Eighteen Chapters of the Gītā
The Bhagavad Gītā contains eighteen chapters (books), which were originally untitled. Hence, editions and translations frequently include title headings created for publication. The Śrī Vaiṣṇava philosopher Yāmunācārya (916-1041 C.E.), in his Summary of the Import of the Gītā (Gītārtha-saṅgraha), divides the Gītā into three parts, each with six chapters. The first hexad, on his account, emphasizes karma yoga (a deontological perfection of duty) and jñānayoga (the Gītā’s metaethics, or elucidation of the conditions of ethical reasoning). The middle hexad emphasizes bhakti yoga, the Gītā’s label for the position also called Yoga in the Yoga Sūtra and other philosophical texts: The right is action in devotion to the procedural ideal of choice (Sovereignty), and the good is simply the perfection of this practice. This engagement in bhakti yoga, according to Yāmunācārya’s gloss on the Gītā, is brought about by karma yoga and jñānayoga (v.4). The last hexad, “which subserves the two preceding hexads,” concerns metaphysical questions related to the elaboration of Yoga. Specifically, it explores and contrasts nature (prakṛti), or explanation by causality, and the self (puruṣa), or explanation by way of responsibility. Īśvara, or Sovereignty, is the proper procedural ruler of both concerns. The last hexad also summarizes earlier arguments for karma yoga, bhakti yoga, and jñānayoga. What follows below summarizes the chapters.
Chapter 1 concerns Arjuna’s lament. Here we hear Arjuna’s three arguments against fighting the impending war, each based on one of the three theories of conventional morality: Virtue Ethics, Consequentialism, and Deontology.
Chapter 2 initiates Kṛṣṇa’s response. Kṛṣṇa extols a basic premise of Yoga: Selves (persons) are eternal abstractions from their lives, and hence cannot be confused with the empirical contingencies that befall them. This is meant to offset Arjuna’s concern for the welfare of those who would be hurt as a result of the war. Here we encounter the first formulations of karma yoga and bhakti yoga.
Kṛṣṇa here articulates the idea that blameless action is done without concern for further outcome: We have a right to do what we ought to do, but not to the further outcomes of activity (Gītā 2.46-47). Kṛṣṇa defines this radically procedural frame for moral reasoning as “yoga” (Gītā 2.48), which is skill in action (2.50).
Chapter 3 introduces karma yoga in further detail. The chapter begins with Arjuna concerned about a contradiction: Kṛṣṇa apparently prefers knowledge and wisdom, and yet advocates fighting, which produces anxiety and undermines clarity. Kṛṣṇa’s response is that action is unavoidable: No matter what, we are choosing and doing (even if we choose to sit out a fight). Hence, the only way to come to terms with the inevitability of choice is to choose well, which is minimally to choose to do what one ought to do, without further concern for outcome. This is karma yoga. Here we learn the famous formula of karma yoga: better one’s own duty poorly performed than someone else’s performed well (Gītā 3.35). Kṛṣṇa, the ideal of action (Sovereignty), is not exempt from this requirement either. Rather, the basic duty of Kṛṣṇa is to act to support a diversity of beings (Gītā 3.20-24). This too is the philosophical content of all duty: Our duty constitutes our contribution to a diverse world and a pedagogic example for others to follow suit. Chapter 4 focuses on bhakti yoga, or the practice of devotion. As Kṛṣṇa is the ideal of right action, whose activity is the maintenance of a diverse world of sovereign individuals responsible for their own actions, the very essence of right action is devotion to this ideal of Sovereignty. Chapter 5 introduces jñānayoga, or the metaethical practice of moral clarity as a function of the practice of karma yoga. Chapter 6 picks up threads from previous comments on yoga, bringing attention to practices of self-regulation that support the yogi, or one engaging in skillful action.
Chapter 7 shifts to a first-person account of Sovereignty by Kṛṣṇa and the concealment of this procedural ideal in a world that is apparently structured by nonnormative, causal relations. Chapter 8 distinguishes between three classes of devotees. Chapter 9 explores the primacy of the ideal of Sovereignty and its eminence, while Chapter 10 describes the auspicious attributes of this ideal. Chapter 11 recounts Arjuna’s dramatic vision of these excellences, one that shows that the moral excellence of the procedural Ideal of the Right is not reducible to the Good, and is logically consistent with both the Good and the Bad. Chapter 12 returns to the theme of bhakti yoga and its superiority.
Chapter 13 turns to the body, treating it as a tool, and to the seat of responsibility: the self. Chapter 14 explores a cosmological theory closely associated with Yoga, namely the idea that nature (prakṛti) is comprised of three empirical properties—sattva (the cognitive), rajas (the active), and tamas (the inert)—and that these empirical characteristics of nature can conceal the self. In Chapter 15, the supreme Self (Sovereignty) is distinguished from the contents of the natural world. Chapter 16 contrasts praiseworthy and vicious personality traits. Chapter 17 focuses on the application and misapplication of devotion: Outcomes of devotion are a direct function of the procedural excellence of what one is devoted to. Devotion to Sovereignty, the ultimate Self, is superior to devotion to functionaries. Chapter 18 concludes with the excellence of renouncing a concern for outcomes via Yoga. Kṛṣṇa, speaking as the ideal, exhorts Arjuna not to worry about the content of ethics (dharma): He should focus instead on approximating the procedural ideal as the means of avoiding all fault.
3. Just War and the Suppression of the Good
The Gītā and the Mahābhārata have garnered attention for their contribution to discussions of Just War Theory (compare Allen 2006). Yet, as most accounts of South Asian thought are fuelled by an interpretive approach that attempts to understand the South Asian contribution by way of familiar examples from the Western tradition, the clarity of such accounts leaves much to be desired (for a review of this phenomenon in scholarship, see Ranganathan 2021). Explicated, with a focus on the logic of the arguments and theories explored as a contribution to philosophical disagreement—and not by way of substantive beliefs about plausible positions—we see that the Mahābhārata teaches us that the prospects of just war arise when moral parasites inflict conventional morality on the conventionally moral as a means of hostility. Parasites effect this hostility by acting belligerently against the conventionally moral, while relying on the goodness of the conventionally moral to protect them from retaliation in response to their belligerence. Any such retaliation would be contrary to the goodness of conventional morality and hence out of character for the conventionally moral. The paradox here is that, from the perspective of the conventionally moral, this imposition of conventional moral standards is not wrong and is good. However, it is the means by which moral parasites exercise their hostility to the disadvantage of the conventionally moral. Prima facie, it would be just for the conventionally moral to retaliate, since moral parasites act outside the bounds of morality. However, the moment the conventionally moral engage such parasites in war, they have departed from action as set out by conventional morality, and it would appear that they thereby lack justification: Their standing relative to conventional moral expectations is now the same as the parasites’. This was Arjuna’s problem at the start of the Gītā. Arjuna indeed explicitly laments that fighting moral parasites would render him no better than them (Gītā 1.38-39).
A procedural approach to ethics, such as we find in the Gītā, transcends conventional morality, especially as it deprioritizes the importance of the good (karma yoga). Indeed, it rejects the good as a primitive moral notion in favour of the right (bhakti yoga) and thereby provides an account of the justice of those who wage war on moral parasites: The justice of the war of Arjuna and other devotees of Sovereignty should be measured by their fidelity to procedural considerations of the right, and not by considerations of the good. Arjuna and other just combatants fight as part of their devotion to Sovereignty and hence conform their behavior to an ultimate ideal of justice: that all concerned should be sovereign and thus made whole. Hence, just war (jus ad bellum) and just conduct in war (jus in bello) come to the same thing: The just cause is devotion to the ideal, and right action is the same. In contrast, those who are not devoted to the regulative ideal have neither a just cause nor just action in war. Jeff McMahan’s conclusion in his Killing in War (2009), that those who fight for an unjust cause do wrong by fighting those whose cause is just, is entailed by bhakti yoga. However, McMahan appears to claim that the justice of a war is accounted for not by a special set of moral considerations that come into effect during war, but by the same considerations we endorse during times of peace. Yet in times of peace it appears that conventional morality wins the day and militates against war, and all parties depart from it when they wage war—or at least, this seems to be the analysis of the Mahābhārata. It is because there are two competing moral frames—the conventional morality of the good and the proceduralism of the right, or Yoga/Bhakti—that we can continue to monitor the justice of war past the breakdown of conventional moral standards (for more on the just war theory here, see Ranganathan 2019). It is because of the two standards that Yoga/Bhakti can constitute an ultimate standard of moral criticism of the right even as the conventional moral standards of the good that characterize peace deteriorate under the malfeasance of parasites.
With respect to success, the Gītā also has a story to tell about which side wins the war. As the bhakti yogi is committed to a process of devotion to Sovereignty, their behavior becomes sovereign in the long run, and hence their success is assured. Moral parasites, in contrast, are not engaged in an activity of self-improvement. Their only means of survival is taking advantage of the conventionally moral, and once the conventionally moral renounce conventional morality to become devotees of Sovereignty, parasites are deprived of their victims and source of sustenance, rendering them vulnerable to defeat.
4. Historical Reception and the Gītā’s Significance
The relationship of the Gītā to what is known as Hinduism, and to what we understand as religion, is more complicated and problematic than a straightforward philosophical study of the Gītā. In a world dominated by Western imperialism, it is common to take religious designations at face value, as though they are dispositive of the “religious” traditions and not an artifact of colonialism. An historical claim commonly made, as we find in the Encyclopedia of World Religions, is that the “Bhagavad-Gītā” is “perhaps the most widely revered of the Hindu scriptures.” The expectation that the Gītā is a religious work leads to the notion that there is some type of thematic religious development in the text that is distinct from the philosophy it explores. So, for instance, the same entry suggests that the religious theme of the opening lines of the Gītā is to be found when Arjuna (the protagonist) is faced with a fratricidal war: “The problem for Arjuna is that many other revered figures, such as Arjuna’s teacher, are fighting for his cousins. Seeing in the ranks of the enemy those to whom he owes the utmost respect, Arjuna throws down his bow and refuses to fight” (Ellwood and Alles 2008: 49-50). That is not at all how events unfold, however. Arjuna, upon arriving at the battlefield, provides three distinct arguments based on three prominent ethical theories that comprise what we might call conventional morality (Virtue Ethics, Consequentialism, and Deontology) and then concludes, on the strength of these objections, that he should not fight. Expecting to be able to distinguish the thematic development from the philosophy in the Gītā is like attempting to distinguish the thematic development in a Platonic dialogue from its philosophy: It cannot be done without great violence—and the fact that we might expect this to be possible in the case of South Asian philosophy but not in the case of Plato is inconsistent.
Moreover, the gloss that the Gītā is scripture is mistaken on points of history. Historically, and in the South Asian tradition, the Gītā was not thought of as scripture. Indeed, “scripture” is often reserved to designate texts that are thought to have a revelatory character, like the Vedas, and are called śruti (what is heard). The Gītā, in contrast, was thought to be an historical or commemorative document, or smṛti (what is remembered), as the Mahābhārata, of which it is a part, was regarded as such historical literature. Calling it scripture is ahistorical. The motivation to regard the Gītā as a religious text is no doubt derivable from the uncritical acceptance of the Gītā as a basic text of Hinduism. By analogy to other religions with central texts, the Gītā would apparently be like a Bible of sorts. In this case, the confusion arises because of the ahistorical projection of the category “Hinduism” onto the tradition.
As Western powers increased their colonial hold on South Asia, there was pressure to understand the South Asian traditions in terms of a category of understanding crucial to the West’s history and methodology of alterity: religion (Cabezón 2006). Historical research shows that it was under the British rule of South Asia that “Hindu”—originally a Persian term meaning “Indus” or “India”—was drafted to identify the indigenous religion of South Asia, in contrast to Islam (Gottschalk 2012). By default, then, anything South Asian that is not Islam is Hinduism. Given its baptismal context that fixes its referent (compare Kripke 1980), “Hinduism” is a class category, the definition of which (something like “South Asian, no common founder”) need not be instantiated in its members, and the function of which is rather to describe Hindu things at the collective level. “Hinduism” as a class category is much like the category “fruit salad”: Fruit salad is a collection of differing pieces of fruit, but the members of that collection need not be, and would not be, collections of differing pieces of fruit themselves. Indeed, it would be a fallacy of composition to infer from the collective definition of “fruit salad” that there is something essentially fruit salad about the pieces that make it up. Similarly, at the collective level, we might include the Gītā among Hindu texts because the collection is definable as being South Asian with no common founder. It would be a fallacy of composition, though, to infer that the Gītā bears defining traits of being Hindu, or even religious, for that matter, as these characterize the collection, not the members. If, as history shows, the only thing that world religious traditions share is that they have non-European origins, if the philosophical diversity across all things religious is equivalent to philosophical diversity as such, and if religious identity was manufactured as a function of the Western tradition’s inability to explain and ground non-Western philosophical positions in its own terms (Ranganathan 2018b), then Hindu texts would be treated as essentially religious, and not primarily philosophical, because of their South Asian origins. This depiction of texts such as the Gītā as religious, however, like the historical event of defining Hinduism, is a straightforward artifact of Western colonialism, and not a trait of the texts studied under the heading of Hinduism.
Historically, to be Hindu is apparently to share nothing except philosophical disagreements on every topic: One can be an evolutionary materialist and atheist, as we find in the Sāṅkhya Kārikā, or take a deflationary view about the reality of the gods while endorsing Vedic texts, as we find in Pūrva Mīmāṃsā works, and be a most orthodox Hindu merely because one’s philosophical views are South Asian and can be grouped in the category of South Asian with no common founder (Ranganathan 2016a, 2018b). Yet, the common expectation is that religions are kinds, not classes: Kinds specify criteria of inclusion that are instantiated by their members, and this is true of virtually every other religion. Under this particular set of expectations—that examples of Hindu things must exemplify something distinctly Hindu—the Bhagavad Gītā has come to be valued not merely as a popular contribution to moral philosophy, but as the Hindu equivalent to the Christian Bible, something one can swear oaths on and look to for religious advice (compare Davis 2015). Attempting to project this colonial development back onto the tradition, though commonplace, is mistaken. It generates the perception that what we have in the Gītā is not primarily philosophy, because we have decided to ignore its philosophy. The depiction of the Gītā as essentially religious, rather than contingently religious given the colonial artifact of religious identity, is a self-fulfilling prophecy: It arises when we do not treat the history of South Asian philosophy as relevant to understanding its texts, because we have assumed, as a function of the colonial history that makes up religious identity, that such texts are religious.
5. Vedic Pre-History to the Gītā
While it is tempting to read the Gītā in a vacuum, knowing something about the development of moral theory in South Asian traditions sheds light on many aspects of the Gītā. It constitutes a response to the Jain (Virtue Ethics), Buddhist (Consequentialism), and Pūrva Mīmāṃsā (Deontology) options (Ranganathan 2017a), as well as a measured criticism of Deontology, which it provisionally endorses (explored at greater length in section 7, Basic Moral Theory and Conventional Morality). The Jain and Buddhist options, as options of Virtue Ethics and Consequentialism, are versions of teleology: They prioritize the Good over the Right in moral explanation. Deontology and Yoga/Bhakti are versions of proceduralism: They prioritize the Right over the Good in moral explanation. The critical move away from teleology to proceduralism constitutes the history of moral reasoning in the Vedic tradition.
The very earliest portions of the Vedic tradition begin with the Mantra (chants) and Brāhmaṇa (sacrificial instruction manuals) sections, along with forest books (Āraṇyaka) that provide theoretical explanations of the sacrifices. All, and especially the Mantra section, speak of and to an Indo-European, nomadic culture. As in all early Indo-European cultures, whether in ancient Persia or Greece, there is evidence of the worship of nature gods as a means of procuring benefits. The logic of this paradigm is teleological: The good ends of life, such as victory over enemies, safety for one’s kin and self, as well as the material requirements for thriving (food and land), are the goal, and the worship of the gods of nature is hypothesized as the means. One section of the Aitareya Āraṇyaka offers a revealing proto-empirical hypothesis: that the need for eating is generated by fire, and it is fire that is the consumer of food (I.1.2.ii). The sacrificial offering just is food (I.1.4.vii). If it is ultimately fire that is hungry, and the sacrifice is how we enact feeding our debt to fire, then the sacrifice is the ritualization of metabolism: the burning of calories.
The key to actualizing this flourishing, according to the Aitareya Brāhmaṇa, is a distinction between sacrifice and victim. This distinction requires a certain moral sensitivity. Hence, the presiding priests at the end of an animal sacrifice mutter, “O Slaughterers! may all good you might do abide by us! and all mischief you might do go elsewhere.” This knowledge allows the presiding priest to enjoy the flourishing made possible by the sacrifice (Aitareya Brāhmaṇa 2.1.7, p. 61).
One of the curious features of this worldview, which acknowledges that it is forces of nature that create such requirements, is that, in feeding them, we are really transferring an evil that would otherwise befall us onto something else. Hence, in order to avoid being undermined by the forces of nature ourselves, we need to find a sacrificial victim, such as an animal, and visit that evil upon it: That allows us to pay our debt to the forces of nature and thrive. It is no longer the forces of nature and their propitiation that lead us to the good life: It is rather the ritual of feeding natural requirements that secures the good life. In this equation, one element that is not reducible is evil itself. Indeed, the very rationale for the ritual is to avoid an evil of scarcity. The Brāhmaṇa already quoted notes that, during the course of a sacrifice, the blood of the victim should be offered to evil demons (rākṣasas). This is because, by offering blood to the demons, we keep the nourishing portion of the sacrifice for ourselves (Aitareya Brāhmaṇa 2.1.7, pp. 59-60). This is an admission that appeasing the gods of nature is part of a system of ressentiment, where we must understand the goods in life as definable in relation to evils we want to avoid (for further exploration of these texts and themes, see Ranganathan 2018c).
The total picture, comprising the goods of life that we wish to achieve, the pressure to achieve them by way of natural forces, and the desire to appease such forces in order to gain the goods, leaves much to be desired. The system creates a crisis that is managed by feeding it. Furthermore, as the system is teleological, it organizes moral action around the good, which, unlike the right (what we can do), is outside of our control.
What we find in the Upaniṣads (dialogues)—the latest installment in the corpus of Vedic texts—is a radical reorientation of practical explanation. Whereas the earlier parts were concerned primarily with the good as a source for justifying right procedure, we find a switch of focus to the center of agency and practical rationality, the Self or ātmā, but also to a related substance with which it is often identified: Development, Growth, Expansion (Brahman). Interpreted from a Eurocentric backdrop, Brahman is like a theistic God, for Brahman appears to play a role similar to a theistic God in the belief system of theists. Explicated—that is, if we understand this theoretical entity as a contribution to philosophical disagreement—its identification with the Self entails a theory where the primary explanation of reality is not by way of a good, but by way of a procedure (of Development) that is identifiable with the paradigm Self, or what it is to be an agent. While the Upaniṣads do not all agree or say exactly the same thing about the self and Brahman—often they speak of many selves related to Brahman, sometimes of only one paradigm self and Brahman—the topic is often raised in relation to ideas we find central to yoga, or meditation, such as the concept of breath, itself a procedure internal to animal agency.
One of the more revealing dialogues in the Upaniṣads, which sheds light on this procedural shift, is the Kaṭha Upaniṣad, specifically the dialogue concerning the young boy Nachiketa.
Nachiketa is condemned to death by his father, who is conducting a solemn sacrifice to the gods, in response to the boy’s pestering question: “To whom will you sacrifice me?” “To death,” his father utters in irritation. Because the utterance occurs in an official, ritual context, it takes effect: The boy is sacrificed and travels to the abode of the God of Death, Yama, who is absent. Upon returning after three days, Yama offers the young boy three boons to make up for his lack of hospitality. Two boons are readily granted: The first is returning to his father, and the second is knowledge of a sacrifice that leads to the high life. Last, Nachiketa wants to know: What happens to a person after they die—do they cease to exist, or do they exist? Yama tries to avoid answering this question by offering wealth—money, progeny, and the diversions of privilege. Nachiketa rejects this on the grounds that “no one can be made happy in the long run by wealth” and “no one can take it with them when they come to you [that is, Death].” He objects that such gifts are short-lived. Death is inevitable, so he wants the answer. The boy is persistent, and Yama relents. He begins his response by praising the boy for understanding the difference between the śreya (control) and pre-ya (literally “advance-movement,” that is, utility, the offering for or gain of the sacrifice): The foolish are concerned with the preya (what Yama tried to give the boy), but the wise with control.
Yama continues with his allegory of the chariot. According to Yama, the body is like a chariot in which the Self sits. The intellect (buddhi) is like the charioteer. The senses (indriya) are like horses, and the mind (mānasa) is the reins. The Enjoyer is the union of the self, senses, mind, and intellect. The objects of the senses are like the roads that the chariot travels. People of poor understanding do not take control of their horses (the senses) with their minds (the reins). Rather, they let their senses draw them to objects of desire, leading them to ruin. According to Yama, the person with understanding reins in the senses with the mind and intellect (Kaṭha Upaniṣad I.2). This is explicitly called Yoga (Kaṭha Upaniṣad II.6). Those who practice yoga reach their Self in a final place of security—Viṣṇu’s abode. This is the place of the Great Self (Kaṭha Upaniṣad I.3). There is no evil here.
What we have in the Kaṭha Upaniṣad is a very early articulation of the philosophy of Yoga as we find it in Patañjali’s Yoga Sūtra and the Gītā’s defense of bhakti yoga. In Patañjali’s Yoga Sūtra (a central, systematic formulation of Yoga philosophy), we find no mention of Viṣṇu. However, we do find that Patañjali defines Yoga in two ways. First, he defines it as an end: the (normative, or moral) stilling of external mental waves of influence (YS I.2). This involves bringing one’s senses and mind under one’s rational control. Second, Patañjali identifies yoga as accomplished by an approximation to Sovereignty, which is analyzable into unconservativism and self-governance (compare Ranganathan 2017b). This fits the pattern of the theory of Bhakti/Yoga, which identifies and defines right action as the approximation to a procedural ideal. When Patañjali moves to describe yoga not as an end (the stilling of external waves of influence) but as a practice, he further analyzes the project of Yoga into three procedural ideals: Īśvara praṇidhāna (approximating Sovereignty, unconservativism, and self-governance), tapas (unconservativism), and svā-dhyāya (self-governance) (YS II.1). Rarely noted, the three procedural ideals are celebrated in a popular South Asian model of Ādi Śeṣa (the cosmic snake) floating over a sea of external waves of influence depicted as the Milk Ocean (the ends of yoga). Ādi Śeṣa is devoted not only to Viṣṇu, a deity depicted as objectifying himself as harmful manifestations, such as the disk and mace, which do not constrain him—thereby showing himself to be untouched by his own choices and hence unconservative—but also to Viṣṇu’s partner Lakṣmī: the goddess of intrinsic value and wealth, shown as a lotus, sitting on herself and holding herself, and thereby self-governing. Thus devotion to Sovereignty (Ādi Śeṣa) analyzes Sovereignty into two further procedural ideals, unconservativism (Viṣṇu) and its partner, self-governance (Lakṣmī), all the while floating over receding waves of influence. What this common tableau of South Asian devotional practice literally depicts is the absolute priority of right procedure (the three procedural ideals floating) over the good outcome (the stilling of waves of external influence).
In the model we find from Death in the Kaṭha Upaniṣad, there is no explicit reference to Lakṣmī on her own, but much is made of self-governance as something geared toward a realm controlled by Viṣṇu. Hence, already in the Vedas, we have a theory of radical procedural ethics, governed by an approximation to a procedural ideal of Viṣṇu (tapas, self-challenge, unconservativism), and such a model is implicit in the other great work of Yoga of South Asian traditions, the Yoga Sūtra.
One of the outcomes of Death’s argument, as he explicitly states, is that life lived wisely dispenses with teleological reasoning and replaces it with a procedural emphasis on self-governance and control. Looking back on the very beginnings of the Vedic literature, a dialectic becomes apparent, which takes us from teleological reasoning to procedural reasoning. The motivation for moving to a procedural approach is to remove luck from the moral equation and replace it with the ideals of unconservativism and self-governance. The Kaṭha Upaniṣad thus represents a trend in the Vedic tradition to treat teleological considerations—practical arguments focused on the good—as a foil for a procedural approach to practical rationality. Ranganathan has called this dialectic the Moral Transition Argument (MTA): the motivation of a procedural approach to practical rationality on the basis of dissatisfaction with a teleological approach. Freedom, mokṣa, is an ambiguous condition of this process, but a certain outcome of perfecting a procedural approach to life. Brahman, Development, is the metaphysical formalization of the idea that reality is not an outcome or a good, but a process to be perfected (Ranganathan 2017c).
There are, of course, further problems that arise from the MTA, such as the paradox of development. We need to be free to engage in a procedural approach to life, for such practice is a matter of self-determination; and yet, as people who have not mastered a procedural ethic, we are less than free to do as we choose. By analogy, consider the challenge of learning an art, such as playing the guitar. We need some degree of freedom to approximate the procedural ideal of playing the guitar, and this approximation constitutes practicing the guitar; but in a state of imperfection, we cannot play any tune or composition we wish: The freedom to engage in this craft and art is the outcome of much practice. Such practice is, all things considered, a state with a low expected utility: Even if we do practice regularly, there is no guarantee that we will become as proficient as Jimi Hendrix or Pat Metheny. The movement to a procedural metaphysics—to understanding reality not as a good outcome, but as a work in progress (Brahman)—gives some reason for optimism: It is in the very nature of reality to be dynamic, and so we should not assume that our current state of incapacity is a metaphysical necessity. However, and more practically, Yoga provides an additional response: It is commitment to the regulative ideal of a practice—the Lord—that makes our freedom to do as we choose possible; but this freedom is not a good that we can organize our practice around: It is rather a side effect of our commitment to the ideal.
The authors of the Mahābhārata, and especially of the Gītā, which appears to be an interpolation in the wider epic, must have been quite conscious of the Kaṭha Upaniṣad (Jezic 2009); hence, the deliberate use of a chariot as the scene for the discourse of the Gītā, where Kṛṣṇa (Viṣṇu) delivers arguments reminiscent of Death’s lecture to Nachiketa, is no accident. Yet, whereas the Kaṭha Upaniṣad depicts Viṣṇu as a ruler of a distant realm, which we attain when we have mastered the rigors of yoga, here in the Gītā itself, Viṣṇu not only delivers the lecture but also the advice that he ought to be sought after as the ideal to be approximated and emulated. Also, whereas in the Kaṭha Upaniṣad the charioteer is the intellect, here Kṛṣṇa’s assumption of the role of the charioteer furthers the role he plays in the Gītā as the voice of reason in the face of adversity and peril. In using the Kaṭha Upaniṣad as the metaphorical backdrop of the dialogue, the authors of the Gītā script Kṛṣṇa to elaborate Death’s lesson to the boy Nachiketa. Death’s argument was that, in facing the possibility of danger as something to be avoided, we survive death not as a personal misfortune but as a potential public mishap that we avoid by taking a procedural approach to life. Life after death is not brought about by avoiding struggle or danger, but by mastering oneself. Just as in the Kaṭha Upaniṣad there is a criticism of the earlier teleological goals of the Vedas, so too in the Gītā do we find Kṛṣṇa persistently criticizing the language and goading rationality of the Vedas, which motivate by way of selfish desires. In the case of the Gītā, the authors use these elements to bring into the picture the teleological considerations of the earlier Vedic tradition, but also Buddhist and Jain arguments, not to mention a refined Pūrva Mīmāṃsā Deontology—as seen in Arjuna’s three arguments for not fighting. What these arguments have in common is that they appeal to the good in some form, and together they mark out the scope of conventional morality—morality that can be conventionalized insofar as it is founded on a moral outcome, the good. What follows Arjuna’s recitation of these arguments is a sustained argument from Kṛṣṇa to the effect that moral considerations that appeal to outcomes and ends are mistaken, and that one should adopt a procedural—yogic—approach to practical rationality. Hence, the Bhagavad Gītā from start to finish is the MTA as a dialectic that moves from teleological considerations, through Deontology (karma yoga), to the extreme proceduralism of Bhakti (Yoga) via a metaethical bridge it calls jñānayoga. It hence serves as both a summary of the teleological precursors to a procedural approach to morality and a refutation of those precursors. It serves also as a historical summary of the dialectic of the Vedic tradition, but in argument form, with the radical proceduralism of bhakti yoga as the conclusion.
6. Mahābhārata: Narrative Context
The Bhagavad Gītā is itself a dialogue, but one of a philosophical character. That is, there are no plot or thematic developments of the text apart from the dialectic it explores, couched in argument. This is quite easy to miss if one does not begin a reading of the text with attention to the arguments provided by its protagonists, Arjuna and Kṛṣṇa, and if one expects that there is some uniquely religious content to the text that is distinct from the philosophy. Ignoring the philosophy certainly generates a reading of the text that is mysterious, not founded in reason, and opaque, which could be taken as evidence of its religious significance—but that would be an artifact of ignoring the philosophy and not anything intrinsic to the text. The text begins at the battlefield of the fratricidal war that is itself the climax of the Mahābhārata. Hence, to understand the motivation for the arguments explored in the Gītā, one needs to understand the events that unfold in the epic prior to the fateful conversation between Kṛṣṇa and Arjuna.
The Mahābhārata (the “Great” war of the “Bhāratas”) focuses on the fratricidal tensions and all-out war of two groups of cousins: the Pāṇḍavas, numbering five, the most famous of these brothers being Arjuna, all sons of Pāṇḍu; and the Kauravas, numerous, led by the oldest brother, Duryodhana, all sons of Dhṛtarāṣṭra. Dhṛtarāṣṭra, though older than Pāṇḍu and hence first in line for the throne, was born blind and hence sidelined in royal succession, as it was reasoned that blindness would prevent him from ruling. Pāṇḍu, it so happens, was the first to have a son, Yudhiṣṭhira, rendering it all but certain that the throne would be passed down via Pāṇḍu’s descendants. Yet Pāṇḍu dies prematurely, and Dhṛtarāṣṭra becomes king as the only appropriate heir to the throne, as the next generation are still children.
As the sons of Pāṇḍu and Dhṛtarāṣṭra grow up, Pāṇḍu’s sons distinguish themselves as excellent warriors and as virtuous individuals, though not without their flaws. The Kauravas, in contrast, are less able in battle and mostly without moral virtues or graces. The rivalry between the two sets of cousins is ameliorated only by the Pāṇḍavas’ inclination to compromise and be deferential to their cousins—this despite attempts on the Pāṇḍavas’ lives by the Kauravas. Matters turn for the worse when the Pāṇḍavas accept a challenge to wager their freedom in a game of dice, rigged by the Kauravas. The Pāṇḍavas seem unable to restrain themselves from participating in this foolish exercise, as it is consistent with conventional pastimes of royalty. After they lose everything (even wagering their common wife, Draupadī, who is thereby publicly sexually harassed), their freedom is granted back by Dhṛtarāṣṭra, who caves in to Draupadī’s lament. Once the challenge of the wager—taking a chance—is brought up again, the Pāṇḍavas again lose everything and must subsequently spend thirteen years in exile, the final year incognito; if exposed in that final year, they must repeat the exile. They complete it successfully and return to reclaim their portion of the kingdom, at which point the Kauravas refuse to allow the Pāṇḍavas any territory from which they might eke out a livelihood as rulers. Despite repeated attempts by the Pāṇḍavas at conciliation, mediated by their mutual cousin Kṛṣṇa, the Kauravas adopt a position of hostility, forcing the Pāṇḍavas into a corner where they have no choice but to fight. Alliances, loyalties, and obligations are publicly reckoned and distinguished, and the two sides agree to fight it out on a battlefield with their armies.
What is noteworthy about the scenario described in the Mahābhārata is that the Pāṇḍavas, but for imprudent decisions, conform their actions to the standards of conventional moral expectations for people of their station and caste—including rising to the occasion of risky public challenges, as is the lot of warriors. They engage in activities that follow from good character traits (including courage—a Virtue Theoretic concern), engage in activities with a promise of a good outcome (such as winning at dice—a Consequentialist concern), and agree to be bound by good rules of procedure (such as those that condition the game of dice—a Deontological concern). Spelled out, even the imprudence of the Pāṇḍavas is an outcome of their conventional moral practice (of Virtue Ethics, Consequentialism, and Deontology). This self-constraint by the Pāṇḍavas, characteristic of conventional moral practice, renders them vulnerable to the Kauravas, who are moral parasites: people who wish others to be restrained by conventional moral expectations so that they may be abused, but have no expectations of holding themselves to those standards. Given the Pāṇḍavas’ ever-present attempts at compromise and conciliation, their imprudent decisions are not the reason for their predicament; rather, the hostility of the Kauravas is the explanation. But for this hostility, exemplified by the rigged game of dice and the high-stakes challenge they set, the Pāṇḍavas would have lived a peaceful existence and would never have been the authors of their own misfortune.
With all attempts at conciliation dashed by the Kauravas’ greed and hostility, war is a fait accompli. Kṛṣṇa agrees to be Arjuna’s charioteer in the fateful battle. What makes the impending war especially tragic is that the Pāṇḍavas are faced with the challenge of fighting not only tyrannical relatives that they could not care less for, but also loved ones and well-wishers who, through obligations that arise out of patronage and professional loyalty to the throne, must fight with the tyrants. Bhīṣma, the granduncle of the Pāṇḍavas and the Kauravas, and an invincible warrior (gifted, or cursed, with the freedom to choose when he will die), is an example of one such well-wisher. He repudiated the motives of the Kauravas and sympathized with the Pāṇḍavas, but due to an oath that preceded the birth of his tyrannical grandnephews (the Kauravas), he remained loyal to the throne on which the Kaurava father, Dhṛtarāṣṭra, presided. Arjuna, who looked upon Bhīṣma and others like him as loving elders, subsequently had to fight them. The conflict and tender feelings between these parties were on display when, prior to the war, Arjuna’s eldest brother, Yudhiṣṭhira, sought the blessings of Bhīṣma on the battlefield to commence the war, and Bhīṣma, his enemy and the leader of the opposing army, blessed him with victory (Mahābhārata 6.43).
What follows prior to the battle are two important philosophical moves. First, Arjuna provides three arguments against fighting, each based on one of the three basic ethical theories that comprise conventional morality: Virtue Ethics, Consequentialism, and Deontology. The essence of this package is the importance and centrality of the good (outcome) to an articulation and definition of the right (procedure). This is then followed by Kṛṣṇa’s prolonged response, which consists in making a case for three philosophical alternatives: karma yoga (a form of Deontology), bhakti yoga (a fourth ethical theory, more commonly called Yoga, which does not define the Right by the Good), and jñānayoga (a metaethical theory that provides a justification for the previous two). Indeed, Kṛṣṇa’s exploration of the three options constitutes the dominant content of the eighteen chapters of the Gītā. The preoccupation with the good, characteristic of conventional morality, allows moral parasites to take advantage of conventionally good people, for conventionally good people will not transcend the bounds of the good to retaliate against moral parasites. Kṛṣṇa’s arguments, in contrast, are focused not on the Good, which characterizes conventional moral expectation, but on the Right. With an alternate moral framework of Yoga that does not define the Right in terms of the Good, Kṛṣṇa is able to counsel Arjuna and the Pāṇḍavas to victory against the Kauravas: For as Arjuna and the Pāṇḍava brothers abandon the good of conventional morality, they are no longer sitting targets for the malevolence of the Kauravas. Moreover, this frees the Pāṇḍavas to take preemptive action against the Kauravas, resorting to deception and treachery to win the war. At the end of the war, the Pāṇḍavas are accused by the surviving Kauravas of immorality in battle at Kṛṣṇa’s instigation (Mahābhārata 9.60.30–34)—and indeed, the Pāṇḍavas do resort to deception and what might be thought of as treachery, given conventional moral practice. Kṛṣṇa responds that there would have been no prospect of winning the war had they been constrained by conventional moral expectations (Mahābhārata 9.60.59). This seems like a shocking admission until we remember that war is the very dissolution of such conventions, and it is the Pāṇḍavas’ capacity to pivot to an alternate moral paradigm (Yoga), which defines the Right without respect to the Good, that allows for their victory both with respect to the battle and with respect to Just War. A new dharma, or a new ethical order, is the perfection of the practice, not the means. Unlike the Kauravas, who had no moral code and were parasites, the Pāṇḍavas have an alternate moral code to conventional morality, which allows them to re-establish a moral order when the old one is undermined by moral parasitism. The kernel of the Gītā’s Just War Theory is hence indistinguishable from its arguments for Yoga.
7. Basic Moral Theory and Conventional Morality
To understand the Gītā is to understand its contribution to South Asian and world philosophy. This contribution consists in a criticism of conventional morality, which prioritizes the Good in a definition of the Right. Conventional morality is comprised of three familiar theories: Virtue Ethics, Consequentialism, and a conventional version of Deontology. The Gītā’s unique contribution is completed by the defense of two procedural ethical theories that prioritize the Right choice over the Good outcome. The first of the two normative theories is the Gītā’s version of Deontology, called karma yoga, a practice of one’s natural duty that contributes to a world of diversity. The second of the two normative theories, and the fourth in addition to the three theories of conventional ethics, is a radically procedural option unique to the South Asian tradition, namely Yoga (compare Ranganathan 2017b), which the Gītā calls bhakti yoga. Yoga/Bhakti is distinguished by defining the Right without reference to the Good: The right thing to do is defined as devotion to the ideal of the Right—Īśvara, Bhagavān—Sovereignty (played by Kṛṣṇa in the dialogue and epic), and the Good is the incidental perfection of this devotion. The Gītā also includes a metaethical theory, jñānayoga, that renders the epistemic and metaphysical aspects of the two normative ethical theories clear.
The Gītā’s main aim with these procedural ethical theories is to provide an alternate moral framework for action and choice, one that liberates the conventionally moral Arjuna—the other protagonist of the Gītā, in addition to Kṛṣṇa—from being manipulated and harassed by moral parasites. Moral parasites have no expectation of holding themselves accountable to conventional moral expectations of good behavior and choice but wish others, like Arjuna, to abide by such expectations so that they are easy to take advantage of. At the precipice of moral conventions, undermined by moral parasites, the Bhagavad Gītā recommends bhakti yoga, devotion to Sovereignty—played by Viṣṇu’s avatāra, Kṛṣṇa—as the means of generating a new moral order free of parasites (compare Gītā 4.8), supported by attention to the practice of a duty that allows one to contribute to a world of diversity. The key transition to this radical proceduralism of bhakti yoga is an abandonment of concern for Good outcomes in favour of the ideal of the Right. By abandoning a concern for the Good, one is no longer self-constrained to act in good ways and will hence no longer be an easy target for parasites, who take advantage of the conventionally good because of their goodness. Unlike moral parasites, the devotee of Sovereignty is not amoral or merely in life for themselves. As devotees of Sovereignty, they act in a manner that is in the interest of those devoted to Sovereignty and are hence able to engage in just relationships of friendship and loyalty and to cut away relationships of manipulation and parasitism. This shift to the Right and away from the Good constitutes the kernel of the Gītā’s important contribution to Just War Theory. The just cause is the cause waged as part of a devotion to Sovereignty. The unjust cause is steeped in moral parasitism. As devotion to Sovereignty constitutes a practice of transforming one’s behavior into sovereign behavior, success is assured in the long run.
Much confusion about the Gītā and its argument for bhakti yoga persists because the basic normative ethical options are not spelled out as theories about the Right or the Good: Hence the expectation that there is some type of essentially religious theme afoot, and the failure to see bhakti yoga as a response to the prior moral and political concerns brought about by social conflict.
The question of the Right or the Good is central to the rigged game of dice, which included engaging in activities that follow from good character traits (including courage), engaging in activities with a promise of a good outcome (such as winning at dice), and agreeing to be bound by good rules of procedure (such as those that condition the game of dice). Spelled out, even the imprudence of the Pāṇḍavas is an outcome of their conventional moral practice, which was motivated by a concern for the good. These three aspects of the game of dice exemplify the concerns of three prominent ethical theories: Virtue Ethics, which prioritizes a concern for good character; Consequentialism, which prioritizes good outcomes; and Deontology, which prioritizes good rules. Arjuna, at the start of the Gītā, provides three arguments against fighting, each based on one of these three basic ethical theories that comprise conventional morality. His philosophical intuitions are indistinguishable from those that motivated him and his brothers to participate in the rigged game of dice. Kṛṣṇa’s response involves a sustained criticism of teleology, which includes Virtue Ethics and Consequentialism, and a rehabilitation of Deontology. Properly spelled out, we also see why the common view that bhakti yoga is a case for theism is mistaken. Clarity is had by defining and spelling out these theories as positions on the Right or the Good.
The Good causes the Right: Rosalind Hursthouse identifies Virtue Ethics as the view that the virtues, or states of goodness, are the basic elements in moral theory, and that they support and give rise to right action (Hursthouse 1996, 2013). On this account, Virtue Ethics exemplifies this first moral option. A reason for objecting to this characterization of Virtue Ethics is that the prioritization of virtue does not entail that right action is what follows from the virtues: An appropriate omission or non-action may be the proper outcome of virtue. This is consistent with the idea of Virtue Ethics that treats the virtues as the primary element in moral explanation. Yet, Virtue Ethical theories credit states of goodness (the virtues) with living well, which is in a broad sense right, so any theory that prioritizes virtue in an account of the life well lived endorses some version of the notion that the right procedure follows from good character. In South Asian traditions, the paradigm example of Virtue Ethics—one that denies that right action follows from the virtues while holding that a well-lived life does—is the ancient tradition of Jainism. According to the Jains, an essential feature of each sentient being (jīva) is virtue (vīrya), and this is clouded by action (karma). We ought to understand ourselves in terms of virtue, which is benign and unharmful, and not in terms of action, which intrudes on the rights of others. Jains are historically the staunchest South Asian advocates of strict vegetarianism and veganism as a means of implementing non-harm. As Jains identify all action as harmful, they idealize a state of non-action, sallekhanā, which (accidentally) results in death: This is the fruit of Jain moral observance (Soni 2017).
The Good justifies the Right: This category captures what is essential to Consequentialist theories. Accordingly, the right action, or omission of action, has only an instrumental value relative to some end, the good, and hence the good serves the function of justifying the right. An action and an omission can thus be morally equivalent insofar as they are equally justified by some end. The right can be a rule or a specific action, but either way, it is justified by the ends. When the end is agent-neutral, we could call the theory Utilitarianism. The most famous example of this type of theory in South Asian traditions is Buddhism, which takes the welfare of sentient beings as the source of obligation (Goodman 2009). In its classical formulation in the Four Noble Truths, it is duḥkha—discomfort or disutility—that is to be minimized by the elimination of agent-relative evaluation and desire, by way of a scripted program, the Eight-Fold Path: a project justified by agent-neutral utility. Yet, interpreted by common beliefs, the classical Buddhist doctrine seems like a hodgepodge of ethical commitments. For instance, the Buddha is recorded in the Aṅguttara Nikāya (I 189–190) as distinguishing between two kinds of dharmas, or ethical ends: those that are wholesome, such as moral rules, and those that are not, such as pathological emotions. It thus seems that dharma has more than one unrelated theoretical sense here. By explicating the reasons that comprise Buddhist theory and entail its controversial claims, we see how this is part of the project of Consequentialism: Basic to all dharmas is the end of harm reduction, or welfare; but whereas some dharmas, such as agent-neutral moral teachings, justify themselves as means to harm reduction, other dharmas, such as pathological emotions, which appear agent-relative, instead justify the meditational practice of mindfulness, thereby relieving us of having to treat them as possessing emulative or motivational force.
The Right justifies the Good: This is the inverse of the previous option, and while it may not be a popular way to think about the issue, it sheds light on the role of Deontological theories in moral disagreement. The goods of moral theory, on this account, may be actions or omissions: the former are often called duties, the latter, rights—these are moral choices. Whatever counts as a moral choice is something good and worth preserving in one's moral theory. Yet, the reason that moral choices are theoretically worth endorsing has to do with procedural criteria that are distinct from the goods of moral theory. Hence, this category makes use of a distinction between the definition of moral choices (by definition, good) and their justification: "Deontological theories judge the morality of choices by criteria different from the states of affairs those choices bring about" (Alexander and Moore 2012). The right (procedure) is hence prior not only to the goods of moral theory (moral choices) but also to their further consequences. This way of thinking about moral theory lays to rest a confusion: that if Deontologists consider good outcomes in identifying duties or rights, they are thereby Consequentialists. This mistake rests on a failure to distinguish between the substance of moral choice and the prior criteria that justify it.
In South Asian traditions, famous deontologists abound, including the Pūrva Mīmāṃsā tradition and the Vedānta tradition. Pūrva Mīmāṃsā is a version of Deontological particularism (Clooney 2017), but also of moral non-naturalism: it claims that moral precepts are defined in terms of their beneficial properties but are justified by intuition (śruti), which, on its account, is the ancient body of texts called the Vedas (Ranganathan 2016b). Authors in the Vedānta tradition also often endorse Deontology and a procedural approach to ethics by way of criticizing problems with teleology, namely, that it apparently makes moral luck an irreducible element of the moral life (Ranganathan 2017c). Pūrva Mīmāṃsā is also the tradition in which the ancient practice of animal sacrifice, largely abandoned and criticized, is defended as part of the content of good actions that we engage in for procedural considerations, though here too there is often an appreciation of the superiority of nonharmful interactions with nonhuman animals.
The Bhagavad Gītā is a prominent source of deontological theorizing, especially for the Vedānta tradition that draws upon it. As we shall see, in the Gītā, a deontological approach to ethical reasoning is formulated as karma yoga—good practice, endorsed for procedural reasons, namely that its perfection is a good thing. A distinguishing feature of karma yoga as a form of deontology is that it understands good action as something suited to one's nature, which one can perfect, that contributes to a world of diversity (Gītā 3.20-24). One's reason for endorsing the duty is not its further outcome, but rather that it is appropriate for one to engage in. The rationale for why something counts as one's duty, though, has everything to do with its place within a tapestry of diverse action, of diverse beings, contributing to a world of diversity.
The Right causes the Good: This is a fourth moral option that is radically procedural. Whereas the previous options in moral theory define the things to be done, or valued, by the good, this fourth option defines them by the Right. It is also the mirror opposite of the first option, Virtue Ethics. The salient example of this theory is Yoga, as articulated in Patañjali's Yoga Sūtra, or Kṛṣṇa's account of bhakti yoga in the Gītā. Accordingly, the right thing to do is defined by a procedural ideal—the Lord of the practice—and approximating the ideal brings about the good of the practice, namely its perfection. Yoga acknowledges a primary Lord of practices as such: Sovereignty. This is Īśvara (the Lord, Sovereignty) or Bhagavān (especially in the Gītā), and it is defined by two features: It is untouched by past choice (karma) and is hence unconservative, and it is externally unhindered and is hence self-governing. In the Yoga Sūtra, these two features are further analyzed into two procedural ideals of disciplined practice—tapas (heat-producing, going against the grain, unconservativism) and svā-dhyāya (literally "self-study," which, in the context of the Yoga Sūtra's claim that to know is to control objects of knowledge, amounts to self-control or self-governance). Devotion to Sovereignty—what Patañjali calls Īśvara praṇidhāna, or approximating Sovereignty, called "bhakti yoga" in the Gītā—hence takes on the two further procedural ideals of unconservativism and self-governance (Ranganathan 2017b). According to the Yoga Sūtra, the outcome of such devotion is our own autonomy (kaivalya) in a public world. In the language of the Gītā, the outcome of such devotion is freedom from evil (Gītā 18.66).
Failing to be transparent about the four possible basic ethical theories causes two problems. First, the possibilities of the Gītā are then understood in terms of the three ethical theories familiar to the Western tradition (Virtue Ethics, Consequentialism, and Deontology), which are ironically the positions that are, together, constitutive of the conventional morality that the Gītā criticizes. One outcome of this interpretive orientation, which treats the Gītā as explainable by familiar ethical beliefs in the Western tradition, is that arguments for Yoga are understood as Consequentialist arguments (compare Sreekumar 2012), as though Yoga/Bhakti is an exhortation to be devoted for the sake of a good outcome. This is ironic, as the argument for Yoga is largely predicated on a criticism of Consequentialism in the Gītā itself, where action done for the sake of consequences is repeatedly criticized. The second problem that follows from reading the Gītā along familiar ethical theories is the re-presentation of the argument as one rooted in theism—itself playing a role in the depiction of the text as religious, if by religion one means theism.
While Yoga/Bhakti seems superficially like theism, it is not. Theists regard God as the paradigmatic virtuous agent, which is to say Good. Right action and teaching emanate from God the Good. Theism is hence a version of Virtue Ethics, with God playing the paradigm role of the virtuous agent. For Yoga/Bhakti, the Lord is Right: goodness follows from our devotion to Sovereignty. Hence, our role is not to obey the instructions of God on this account, but to approximate Sovereignty as our own procedural ideal. The outcome is ourselves as successful, autonomous individuals who do not need to take direction from others. The Good of life is hence an outcome of the perfection of the practice of devotion to the procedural ideal of being a person: Sovereignty.
With the four basic normative ethical options transparent, we are in a position to examine the beginning of the Gītā, and specifically Arjuna’s three arguments against fighting in the impending war, and Kṛṣṇa’s response.
8. Arjuna’s Three Arguments Against Fighting
Prior to the commencement of the battle, on the very battlefield where armies are lined up in opposition, and with Kṛṣṇa as his charioteer, Arjuna entertains three arguments against fighting.
First, if he were to fight the war, it would result in death and destruction on both sides, including the death of loved ones. Even if he succeeds, there would be no joy in victory, for his family will largely have been decimated as a function of the war (Gītā 1.34-36). This is a Consequentialist and more specifically Utilitarian argument. In the South Asian context, this would be a prima facie classical Buddhist argument, in so far as Buddhist theory seeks the minimization of duḥkha (suffering) and the maximization of nirvāṇa—freedom from historical constraints that lead to discomfort. According to such arguments, the right thing to do is justified by some good (harm reduction, or the maximization of happiness), and here Arjuna's reasoning is that he should refrain from fighting so as to secure the good of avoiding harm.
Second, if the battle is between good and evil, his character is not that of the evil ones (the Kauravas), and yet, fighting a war would make him no better than his adversaries (Gītā 1.38-39). This is a Virtue Ethical argument. According to such arguments, the right thing to do is the result of a good: the virtues, or strength of character. Not only is this a Virtue Ethical argument, it is a classical Jain position: The correct response to sub-optimal outcomes is not more action, but restraint in conformity to the virtues.
Third, war results in lawlessness, which undermines the virtue and safety of women and children (Gītā 1.41). This might be understood as an elaboration of the first, Consequentialist argument: Not only does war end in suffering, which should be avoided, but it also undermines the personal safety of women and children, and as their safety is a good, we ought to avoid war so as to protect it. The argument can also be understood as a version of Kantian-style Deontology.
An essential feature of Deontology is the identification of goods, whether these are actions (duties) or freedoms (rights), as requiring justification on procedural grounds. A duty is hence not only something that is good to do, and a right not only something good to have, but something we have reason to do or allow. Such goods, duties, and rights constitute the social fabric and are justified, as Kant reasoned, in so far as they help us relate to each other in a Kingdom of Ends. Deontology is hence the inverse of Consequentialism: Whereas Consequentialism holds that the good outcome justifies the procedure, the Deontologist holds that some good state of affairs (actions, freedoms) is justified by a procedural consideration. This way of clarifying duty is wholly in keeping with the Pūrva Mīmāṃsā position that ethics (dharma) is a command distinguished by its value (Mīmāṃsā Sūtra I.2).
What Consequentialism, Virtue Ethics and Deontology have in common is the idea that the good—the valuable outcome—is an essential feature of making sense of the right thing to do. Morality defined or explained by way of the good is something that can be established as an outcome of reality, and can hence be conventionalized. Thinking about morality by way of the good helps us identify an area of moral reasoning we might call conventional morality: consisting of actions that are justified in so far as they promise to maximize the good (Consequentialism), lifestyle choices motivated by good character (Virtue Ethics), and good actions that we have reason to engage in (Deontology). War disrupts conventional morality as it disrupts the good. This is indeed tragic, in so far as conventional morality is organized around the Good.
But there is indeed another side to the story rendered clear by the narrative of the Mahābhārata. It was conventional morality that made it possible for the Kauravas to exercise their hostility against the Pāṇḍavas by restricting and constraining the Pāṇḍavas. The Pāṇḍavas could have rid themselves of the Kauravas by killing them at any number of earlier times when they had the chance in times of peace, and everyone who survived would be better off for having been rid of moral parasites as rulers and having the benevolent Pāṇḍavas instead. They could have accomplished this most easily by assassinating the Kauravas in secret or perhaps openly when they were not expecting it, for the Kauravas never worried about or protected themselves from such a threat because they counted on the virtue of the Pāṇḍavas. Yet, the Pāṇḍava fidelity to conventional morality created a context for the Kauravas to ply their trade of deceit and hostility. The game of dice that snared the Pāṇḍavas is a metaphor for conventional morality itself: a social practice that promises a good outcome (Consequentialism), constituted by good rules that all participants have reason to endorse (Deontology), and laudable actions that follow from the courage and strength of its participants to meet challenges head-on (Virtue Ethics).
The lesson of the Mahābhārata generalizes: Conventional morality places constraints on people who are conventionally moral, and this enables the maleficence of those who act so as to undermine conventional morality by undermining those who bind themselves with it. The only way to end this relationship of parasitism is for the conventionally moral to give up on conventional morality and engage moral parasites in war. This would be a just war—dharmyaṃ yuddham—in its essence, for the cause would be to rid the world of moral parasites (Ranganathan 2019). Yet, from the perspective of conventional morality, which encourages mutually accommodating behavior, this departure is wrong and bad. Indeed, relying purely on conventional standards, which encourage social interaction on the promise of a good, it is easier to construct an argument for pacifism, such as the Jain argument, than an argument for war. Hence, Arjuna's three arguments against war.
9. Kṛṣṇa’s Response
Prior to the serious arguments, which Kṛṣṇa pursues to the end of the Gītā, he begins with considerations that are by contrast less decisive, and on which he does not dwell, except sporadically, throughout the dialogue. Kṛṣṇa responds immediately by mocking Arjuna for his loss of courage. Indeed, if maintaining his virtue is a worry, appealing to Arjuna's sense of honor is to motivate him via Virtue Ethical concern (Gītā 2.2-3, 2.33-7), intimating that Virtue Ethics is not uniquely determinative (justifying both the pacifist and the activist approach to war). He also makes the claim that paradise ensues for those who fight valiantly and die in battle (Gītā 2.36-7). This is a Consequentialist consideration, intimating that Consequentialist considerations are not uniquely determinative (justifying both arguments to fight and not to fight). He also appeals to a Yogic metaphysical view: As we are all eternal, no one kills anyone, and so there are no real bad consequences to avoid by avoiding a war (Gītā 2.11-32). The last thesis counters the third and last of Arjuna's arguments: If good practice that entrenches the welfare of women and children is in order, then the eternality of all of us should put an end to any serious concern about war on these grounds.
These three considerations turn the very theories that Arjuna relies on in arguing against war into motivations to fight, or at least deflate the force of the original three arguments. The last claim, that we are eternal, is perhaps the most serious of the considerations. It is a very basic thesis of a procedural approach to ethics for the following reason: People cannot be judged as outcomes; they are rather procedural ideals themselves—ideals of their own lives—and as such they are not reducible to any particular event in time. Hence, moving to a procedural approach to ethics involves thinking about people as centers of practical rationality that transcend and traverse the time and space of particular practical challenges. Buddhists are famous for arguing, as we find in the Questions of King Milinda, that introspection provides no evidence for the self: All one finds are subjective experiences and nothing that is the self. Indeed, the self as a thing seems to be reducible out of the picture—it seems like a mere grouping for causally related and shifting bodily and psychological elements. Such a Buddhist argument is aligned with a Consequentialist ethic, geared toward minimizing discomfort. However, if we understand the self as charged with the challenges of practical rationality, and the challenge of morality as reining in the procedural aspects of our life, we have no reason to expect that the self is an object of our own experiences: It is rather an ideal relative to which we judge our practical challenges. It is like the charioteer, who is conscious not of him- or herself, but is engaged in driving. For the charioteer to look for evidence of him- or herself in experience would be to look in the wrong direction; as the one responsible for the experiences that follow, the charioteer is in some sense always outside the contents of his or her experience, transcending times and places.
Kṛṣṇa, as the driver of Arjuna's battle-ready chariot, has the job of supporting Arjuna in battle, and so his arguments that aim at motivating Arjuna to fight are an extension of his literal role as charioteer in the battle, but also of his metaphorical role as the intellect of the chariot, as set out in the Kaṭha Upaniṣad. One of the problems with the frame of conventional moral expectations that Arjuna brings to the battlefield is that it frames the prospects of war in terms of the good, but war is not good: It is bad. Even participants in a war do not desire the continuity of the war: They desire victory, which is the cessation of the war. So thinking about war in terms of the good gives us no reason to fight. Moreover, war is a dynamic, multiparty game that no one person can uniquely determine. The outcome depends upon the choices of many players and many factors that are out of the control of any single player. Game theoretically, this is debilitating, especially if we are to choose a course of action with the consequences in view. However, if the practical challenge can be flipped, then ethical action can be identified on procedural grounds, and one has a way by which to take charge of a low-expected-utility challenge via a procedural simplification: The criterion of moral choice is not the outcome; rather, it is the procedure. This might seem unconvincing: If I resort to procedure, it would seem imprudent, because then I am letting go of winning (the outcome). However, there are two problems with this response. First, the teleological approach in the face of a dynamic circumstance results in frustration and nihilism—or at least, this is what Arjuna's monologue of despondency shows. Thus, focusing upon a goal in the face of challenge is not a winning strategy. Indeed, when one thinks about any worthwhile pursuit of distinction (whether it is the long road to becoming an award-winning scientist or recovering from an illness), the a priori likelihood of success is low, and for teleological reasons, this gives one reason to downgrade one's optimism, which in turn depletes one's resolve. Focusing on outcomes ultimately curtails actions that can result in success in cases where the prospects of success are low. Call this the paradox of teleology: Exceptional and unusual outcomes that are desirable are all things considered unlikely, and hence we have little reason to work toward such goals given their low likelihood. Rather, we would be better off working toward usual, mundane outcomes with a high prospect of success, though such outcomes have a lower utility than the unusual outcomes. Second, if we can distinguish between the criterion of choice and the definition of duty—Deontology—then we have a way to choose duties that result in success, defined by procedural reasons. This insulates the individual from judging the moral worth of their action in terms of the outcome and hence avoids the paradox of teleology while pursuing a winning strategy (Gītā 2.40). The essence of the strategy, called yoga (discipline), is to discard teleology as a motivation (Gītā 2.50). Indeed, to be disciplined is to abandon the very idea of good (śubha) and bad (aśubha) (Gītā 12.17). In effect, practical rationality moves away from assessing outcomes.
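The decision-theoretic structure of the paradox of teleology can be put as a simple expected-utility comparison (the figures here are hypothetical, chosen only to exhibit the form of the reasoning): suppose an exceptional outcome carries utility 100 but probability 0.05, while a mundane outcome carries utility 10 but probability 0.9. Then

$$E[U_{\text{exceptional}}] = 0.05 \times 100 = 5 \;<\; 9 = 0.9 \times 10 = E[U_{\text{mundane}}],$$

so expected-utility reasoning recommends the mundane pursuit even though the exceptional outcome is the one worth having. A procedural criterion of choice is insulated from this comparison, as it does not score the choice by its likely outcome.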
To this end, Kṛṣṇa distinguishes between two differing normative moral theories and recommends both: karma yoga and bhakti yoga. Karma yoga is Deontology formulated as doing duty without the motive of consequence. Duty so defined might have beneficial effects, and Kṛṣṇa never tires of pointing this out (Gītā 2.32). However, the criterion of moral choice on karma yoga is not the outcome; rather, it is the fittingness of the duty as the thing to be done that justifies its performance: Hence, better one's own duty poorly performed than someone else's well performed (Gītā 2.38, 47, 18.47). Yet, one's duty properly done is good, so one can have confidence in the outcomes of one's struggles if one focuses on perfecting one's duty. Bhakti yoga in turn is Bhakti ethics: the performance of everything as a means of devotion to the regulative ideal, resulting in one's subsumption by the regulative ideal (Gītā 9.27-33). Metaphorically, this is described as a sacrifice of the outcomes to the ideal. Ordinary practice geared toward an ideal of practice, such as the practice of music organized around the ideal of music, provides a fitting example: The propriety of the practice is not to be measured by the quality of one's performance on any given day, but rather by fidelity to the ideal, which motivates a continued commitment to the practice and ensures improvement over the long run. In the very long run, one begins to instantiate the regulative ideal of the practice: music. Measuring practice in terms of the outcome, especially at the start of ventures like learning an instrument, is unrewarding, as one's performance is suboptimal. At the start, and at many points in a practice, one fails to instantiate the ideal. Given Bhakti, one finds meaning and purpose through the continuity of one's practice, through difficulty and reward, and one's enthusiasm and commitment to the practice remain constant, as practice is measured not by outcomes but by fidelity to the procedural ideal—a commitment that is required to bring about a successful outcome.
Kṛṣṇa also famously entertains a third yoga: jñāna yoga. This is the background moral framework of bhakti yoga and karma yoga: what we could call the metaethics of the Gītā. Jñāna yoga, for instance, includes knowledge of Kṛṣṇa himself as the moral ideal, whose task is to reset the moral compass (Gītā 4.7-8, 7.7). It involves asceticism as an ancillary to ethical engagement—asceticism here is code, quite literally, for the rejection of teleological considerations in practical rationality. The proceduralist is not motivated by outcomes and hence attends to their duty as an ascetic would if they took up the challenge of action. What this procedural asceticism reveals is that the procedural ideal (Sovereignty) subsumes all of us, and hence, jñāna yoga yields an insight into the radical equality of all persons (Gītā 5.18).
Kṛṣṇa, Sovereignty, sets himself up as the regulative ideal of morality in the Gītā in two respects. First, he (Kṛṣṇa) describes his duty as lokasaṃgraha, the maintenance of the welfare of the world, and all truly ethical action as participating in this function (Gītā 3.20-24). To this extent, he must get involved in life to re-establish the moral order, if it diminishes (Gītā 4.7-8). Second, he acts as the regulative ideal of Arjuna, who is confused about what to do. The outcome of devotion (bhakti) to the moral ideal—Kṛṣṇa here—is freedom from trouble and participation in the divine (Gītā 10.12), which is to say, the regulative ideal of ethical practice—the Lord of Yoga (Gītā 11.4). This, according to Kṛṣṇa, is mokṣa—freedom for the individual. Liberation so understood is intrinsically ethical, as it is about participation in the cosmic regulative ideal of practice—what the ancient Vedas called Ṛta.
In identifying his own Sovereignty with his function as protector of the world, Kṛṣṇa allows for a way of thinking about Deontological action, karma yoga, as not disconnected from the fourth ethical theory: Bhakti.
Given such considerations, it is not surprising that for some commentators, such as M. K. Gandhi, a central concept of the Gītā is niṣkāma karma—acting without desire. This in turn is closely related to sthitaprajña—literally "still knowing" (Gandhi 1969: vol. 37, 126). Gandhi goes so far as to claim that these doctrines imply that we should not even be attached to good works (Gandhi 1969: vol. 37, 105). While this sounds dramatic and perhaps paradoxical, it is a rather straightforward outcome of procedural ethical thinking. Even in the case of Deontology, where duty is defined as a good thing to be done, one endorses such actions for procedural reasons, and not merely because they are good. Hence, clarity with respect to the procedural justification for duty deprives us of the possibility of being motivated by a desire for the duty in question, for such a desire treats the goodness of the action as a motivating consideration, and this is exactly what Deontology denies: There may be many competing good things to do, but not all count as our duty, and our duty is what has a procedural stamp of approval. In the case of Bhakti, however, the distance between the desire for good works and moral action is even sharper, for goodness is an outcome of right procedure and not an independent moral primitive that one could use to motivate action.
Given the initial challenge of motivating Arjuna to embrace war, Kṛṣṇa's move to a radically procedural moral framework, as we find in jñāna yoga, undermines the motivational significance of the various arguments from conventional morality against fighting, which give pride of place to the good. Yet, in shifting the moral framework, Kṛṣṇa has not abandoned dharma as such, but has rather proceduralized it.
Hence, the morality of engaging in the war, and engaging in any action, can be judged relative to such procedural considerations. To this end, he leaves Arjuna with two competing procedural normative theories to draw from: karma yoga (Deontology) and bhakti yoga (Yoga/Bhakti).
10. Gītā’s Metaethical Theory
The previous section reviewed the obvious normative implications of the Gītā, as it provides two competing theories that Kṛṣṇa endorses: karma yoga (Deontology) and bhakti yoga (Bhakti/Yoga). Both are procedural theories, but Deontology identifies what is to be done as a good, while bhakti dispenses with the good in understanding the right: It is devotion to the procedural ideal that defines the right. They are united in a common metaethical theory. Metaethics concerns the assumptions and conditions of normative ethics, and the metaethics of the Gītā is jñāna yoga: the discipline of knowledge (jñāna). One of the entailments of the Gītā, as one can find in nearly all South Asian philosophy, is that morality is not a fiction: Rather, there are facts of morality that are quite independent of our perspective, wishes, hopes, and desires. This is known as Moral Realism.
a. Moral Realism
In Chapter 4, Kṛṣṇa specifies himself as the moral ideal whose task is to reset the moral compass (Gītā 4.7-8). This section discusses the characteristics of the moral ideal, but also its relationship to ordinary practice. What follows in the fifth chapter and beyond seems increasingly esoteric but morally significant. The fifth chapter discusses the issue of renunciation, which in the Gītā amounts to a criticism of Consequentialism and teleology on the whole. The various ascetic metaphors in the Gītā are, in short, ethical criticisms of teleology. One of the implications of this criticism of teleology is the equality of persons properly understood: All of us are equal in moral potential if we renounce identifying morality with virtue (Gītā 5.18). The sixth chapter ties the broader practice of Yoga into the argument. The continuity between Devotion (Bhakti) in the Gītā and Discipline (Yoga) as a separate philosophy is seamless, and if we were to study the Yoga Sūtra, we would find virtually the same theory as the bhakti yoga account in the Gītā. Here in chapter six, we learn about the equanimity that arises from the practice of yoga (Gītā 6.9). Indeed, as we abandon the paradox of teleology, we ought to expect that ethical practice results in stability: No longer allowing practical rationality to be destabilized by a desire for unlikely exceptional outcomes, or likely disappointing outcomes, we settle into moral practice geared toward the stability of devotion to our practice and the regulative ideal. Chapter seven returns to the issue of Bhakti in full, with greater detail given to the ideal (Gītā 7.7). Chapter eight brings in reference to Brahman (literally "Development") and ties it to the practice of yoga. Moral Realism has many expressions (Brink 1989, 1995; Shafer-Landau 2003; Sayre-McCord 2015; Copp 1991), but one dominant approach is that moral value is real. Chapter nine introduces the element of Moral Realism: All things that are good and virtuous are subsumed by the regulative ideal (Gītā 9.5). The ideal is accessible to anyone (Gītā 9.32).
i. Good and Evil
Chapter ten contends with the outward instantiation of the virtues of the ideal. It is claimed that the vices too are negative manifestations of the ideal (Gītā 10.4-5). This is an acknowledgment of what we might call the moral responsibility principle. This is the opposite of the moral symmetry principle, which claims that two actions are of the same moral worth if they have the same outcome. The moral responsibility principle claims that different outcomes share a procedural moral value if they arise from devotion to the same procedural ideal. As outcomes, vices are a consequence of a failure to instantiate the moral ideal; hence the moral ideal is responsible for them. This only shows that devotion to the ideal is preferable (Gītā 10.7).
It may seem counterintuitive that we should not understand the moral ideal only in terms of good outcomes. But to define the procedural ideal by way of good outcomes, as though the good outcome is a sign of the procedural ideal, reverses the explanatory order of Bhakti by treating the Good as a primitive mark of the right and of morality as such. Hence, to avoid this rejection of Bhakti, we have to accept that doing the right thing can result in further consequences that are bad. For instance, practicing the violin every day will likely yield lots of bad violin performances, especially in the beginning, and this is not an accident, but a function of one's devotion to the procedural ideal of practicing the violin. In the Western tradition, the notion that dutiful action can result in further consequences that may be bad is often known as the Doctrine of Double Effect, and it has been used as a defense of scripted action in the face of suboptimal outcomes—according to this doctrine, one is only responsible for the primary effect, and not the secondary negative effects. Yet, as noted, this doctrine can be used to justify any choice, for one could always relegate the negative effects of a choice to the secondary category and identify the primary effect with one's intended effect (Foot 2007). The moral responsibility principle is in part a response to such a concern: One cannot disown the effects of one's actions as though they were secondary to one's intention. The good and the bad are a function of devotion to the ideal and must be affirmed as right, though not necessarily good. This parsing deprives us of the possibility of analyzing choice into a primary effect and a secondary one: There is the primary choice of the action, which is right, and not an outcome, and then there is the outcome, good or bad.
With the karma yoga of the Gītā too, however, double effect is reduced out of the picture. Good action that we endorse for procedural reasons (karma yoga) might result in bad secondary outcomes, for which we are responsible; yet in this case we have not perfected our duty, and when it is perfected, there are no deleterious secondary outcomes. In so far as double effect is a sign of a failure to execute one's duty properly, one cannot take credit for a primary effect while disavowing a secondary effect. Double effect enters the picture precisely when we fail in some way to do our duty. In so far as such failure is a function of one's devotion to one's duty, as something to be perfected, one must be responsible for all the effects of one's actions.
Kṛṣṇa in the Gītā recommends treating outcomes as such as something to be renounced, and this may seem to militate against the notion that we are responsible for the outcomes of our choices. On a procedural approach to action, though, we renounce the outcomes of actions precisely because they are not, on the whole, anything to calibrate moral action to, as they may be good or bad.
Chapter eleven refers to the empirical appreciation of the relation of all things to the regulative ideal. Here, in the dialogue, Kṛṣṇa gives Arjuna special eyes to behold the full outcome of the regulative ideal—his cosmic form, which Arjuna describes as awe-inspiring and terrifying. Good and bad outcomes of reality are straightforwardly acknowledged as outcomes of the regulative ideal.
Chapter twelve focuses on the traits of those who are devoted to the regulative ideal. They are friendly and compassionate and do not understand moral questions from a selfish perspective (Gītā 12.13). Importantly, they renounce teleological markers of action: good (śubha) and evil (aśubha) (Gītā 12.17). Yet they are devoted to the welfare of all beings (Gītā 12.4). The Bhakti theory suggests that these are not inconsistent: If the welfare of all beings is the duty of the regulative ideal, Kṛṣṇa (Gītā 3.24), then ethical practice is about conformity to this duty. And this is not arbitrary: If the procedural ideal (the Lord) of unconservativism and self-governance accounts for the conditions under which a being thrives, then the welfare of all beings is the duty of the ideal. The outcome is not what justifies the practice; the good outcome is the perfection of the practice.
ii. Moral Psychology
Moral psychology in the Western tradition is often identified with the thought processes of humans in relationship to morality. In the South Asian context, mind is often given an external analysis: The very content of mind is the content of the public world. Hence, psychology becomes coextensive with a more generally naturalistic analysis of reality. To this extent, moral psychology is continuous with a general overview of nature.
Chapter thirteen emphasizes the distinction of the individual from their body—this follows from the procedural analysis of the individual as morally responsible yet outside of the content of their experiences. Chapter fourteen articulates the tri-guṇa theory that is a mainstay of Sāṅkhya and Yoga analyses. Accordingly, aside from persons (puruṣa), nature (prakṛti) comprises three characteristics: sattva (the cognitive), rajas (activity), and tamas (inertia). Nature so understood is relativized to moral considerations and plays an explanatory role that ought to be downstream from regulative choices. Chapter fifteen is an articulation of the success of those who adhere to Bhakti: "Without delusion of perverse notions, victorious over the evil of attachment, ever devoted to the self, turned away from desires and liberated from dualities of pleasure and pain, the undeluded go to that imperishable status" (Gītā 15.5).
Chapter sixteen is an inventory of personalities relative to the moral ideal. Chapter seventeen returns to the issue of the three qualities of nature, but this time as a means of elucidating moral character. Most importantly, it articulates the bhakti theory in terms of śraddhā (commitment), often also identified with faith: "The commitment of everyone, O Arjuna, is in accordance with his antaḥ karaṇa (inside helper, inner voice). Everyone consists in commitment. Whatever the commitment, that the person instantiates" (Gītā 17.3). Here, we see the theory of bhakti universalized in a manner that abstracts from the ideal. Indeed, we are always making ourselves out in terms of our conscience—what we identify as our moral ideal—and hence we must choose the ideal we seek to emulate with care. The three personality types, following the three characteristics of nature, choose differing ideals. Only the illuminated choose deities as their ideals. Those who pursue activity as an ideal worship functionaries in the universe (yakṣas and rākṣasas), while those who idealize recalcitrance worship the departed and inanimate things (Gītā 17.4).
b. Transcending Deontology and Teleology
In the final chapter, Kṛṣṇa summarizes the idea of renunciation. Throughout the Gītā, this has been a metaphor for criticizing teleology. The practical reality is that action is obligatory as a part of life, and yet, those who can reject being motivated by outcomes as the priority in ethical theory are the true abandoners (Gītā 18.11). Unlike those who merely choose the life of the recluse (being a hermit, or perhaps joining a monastery), the true renunciate has rid themselves of teleology. A new paradox ensues. Those who operate under the regulative ideal are increasingly challenged to account for their action in terms of the ideal. This means that it becomes increasingly difficult to understand oneself as deliberating and thereby choosing. As an example, the musically virtuous person, who has cultivated this virtue by devotion to the ideal of music, which abstracts from particular musicians and performances, no longer needs to explain his or her performance by the entry-level rules and theory taught to beginners. Often one sees this narrative used to motivate Virtue Ethics (Annas 2004), but in the case of Bhakti, the teacher is not a virtuous person, but rather our devotion to the regulative ideal: This yields knowledge, and the ideal is procedural, not actual dispositions or strengths. This explains how it is that virtuosos can push their craft and continually reset the bar for what counts as good action—for in these cases there is no virtuous teacher to defer to; rather, the leaders in their fields set the standards for others. Their performance constitutes the principles that others emulate. Bhakti is the generalization of this process: Devotion to the procedural ideal leads to performances that reset one's own personal standards and, in the long run, everyone's standards. Thus, Bhakti is not obviously a version of moral particularism, which often denies the moral importance of principles (Dancy 2017). Surely principles are important, but they are generated by devotion to the ideal of the Lord: unconservativism and self-governance. One of the implications of this immersion in procedural practical rationality is the utter disavowal of teleological standards to assess progress. In this light, claims like "He who is free from the notion 'I am the doer,' and whose understanding is not tainted—slays not, though he slays all these men, nor is he bound" (Gītā 18.17) are explicable by the logic of devotion to a procedural ideal. In the path of devotion, individuals themselves cannot take credit, lest they confuse the morality of their action with the goodness of their performance. Hence, we find that "that agent is said to be illuminated (sattvika) who is free from attachment, who does not make much of himself, who is imbued with steadiness and zeal and is untouched by successes and failure" (Gītā 18.26).
At this juncture, Kṛṣṇa introduces an explicitly deontological account of duty, which cashes duty out in terms of the goodness of something to be done relative to one's place in the moral order. Duty, caste duty specifically, is duty suited to one's nature (Gītā 18.41). He also recalls the procedural claim: better one's own duty poorly done than another's well done (Gītā 18.47). Kṛṣṇa claims that the right thing to do is specified by context-transcendent rules that take into account life capacities and situate us within a reciprocal arrangement of obligations and support (Gītā 12.5-13, 33-35). These are moral principles. He further argues that good things happen when people stick to their duty (Gītā 2.32). Deontologically, this is to be expected if duty is good action. Yet, Kṛṣṇa has also defended a more radical procedural ethic, that of Bhakti, and this is the direction of his dialectic, which allows the individual to be subsumed by the moral ideal (Gītā 18.55). However, this subsumption leads not only to renouncing outcomes to the ideal; in the final analysis, it should also lead to giving up on moral principles—good rules—as a sacrifice to the ideal (Gītā 18.57). Indeed, especially if good actions and rules are themselves a mere outcome of devotion, moral progress would demand that we abandon them as standards to assess our own actions, as we pursue devotion.
In the Western tradition, going beyond duty in service of an ideal of morality is often called supererogation (for a classic article on the topic, see Urmson 1958). Here, Kṛṣṇa appears to be recommending the supererogatory as a means of embracing Bhakti. This leads to excellence in action that surmounts all challenges (Gītā 18.58). This move, however, treats Deontology and its substance—moral rules, principles, and good actions—as matters to be sacrificed for the ideal. Hence, in light of the tension between bhakti and karma yoga, and given that bhakti yoga is the radically procedural option, which does away with teleological considerations altogether—the very considerations that lead to Arjuna's despondency in the face of evil (the moral parasitism of the Kauravas)—Kṛṣṇa recommends bhakti: Arjuna should abandon all ethical claims (dharmas) and come to him, and he (the procedural ideal, the Lord) will relieve Arjuna of all evil (Gītā 18.66). In the abstract, this seems like an argument for moral nihilism, but it is consistent with the logic of the moral theory of bhakti, which Kṛṣṇa has defended all along.
This conclusion is exactly where the Moral Transition Argument (MTA) should end if it is a dialectic that takes us from teleological considerations to a radical proceduralism. The MTA takes us to a procedural ethics on the premise of the failures of teleological approaches. In the absence of this overall dialectic, the concluding remarks of Kṛṣṇa seem both counterintuitive and contradictory to the main thrust of his argument. He has, after all, spent nearly sixteen chapters of the Gītā motivating Arjuna to stick to his duty as a warrior, and here he recommends abandoning that for the regulative ideal: himself. If right action is ultimately defined by the regulative ideal, however, then devotion to this ideal would involve sacrificing conventional moral expectations.
When assessing the moral backdrop of the Bhagavad Gītā in the Mahābhārata, it is quite apparent that it is conventional morality, organized around the good, that creates the context in which the Pāṇḍavas are terrorized. Hence Kṛṣṇa's recommendation that Arjuna should stop worrying about all moral considerations and simply approximate him as the regulative ideal is salutary, insofar as Kṛṣṇa represents the regulative ideal of yoga: unconservativism united with self-governance. It is also part of a logic that pushes us to embrace a radical procedural moral theory at the very breakdown of conventional morality: war.
After the Bhagavad Gītā ends, and the Pāṇḍavas wage war on the Kauravas, Kṛṣṇa in the epic the Mahābhārata counsels the Pāṇḍavas to engage in acts that violate conventional moral expectations. Viewed through the lens of conventional morality, this seems bad, and wrong. However, it was conventional morality that the Kauravas used as a weapon against the Pāṇḍavas, so Kṛṣṇa's argument was not only expedient, it also permitted an alternative moral standard to guide the Pāṇḍavas at the breakdown of conventional morality. Kṛṣṇa himself points out that winning against the Kauravas would not have been possible had the Pāṇḍavas continued to play by conventional morality (compare Harzer 2017). The argument for the alternative, though, made no appeal to success or the good as a moral primitive. It appealed to the radical proceduralism of Bhakti.
11. Scholarship
Influential scholarship on the Bhagavad Gītā begins with famous Vedānta philosophers, who at once acknowledged the Gītā as a smṛti text—a remembered or historical text—but treated it on par with the Upaniṣads of the Vedas: a text with intuited content (śruti). In the context of the procedural ethics and Deontology of the later Vedic tradition, as we find in the Pūrva Mīmāṃsā and Vedānta traditions, the Vedas are treated as a procedural justification for the various goods of practical rationality. For Vedānta authors, concerned with the latter part of the Vedas, the Gītā is an exceptional source, as it not only summarizes the teleological considerations of the earlier part of the Vedas but also pursues the Moral Transition Argument (from teleology to proceduralism) to a conclusion that we find expressed in the latter part of the Vedas, while ostensibly also endorsing the caste and Brahmanical frame that made their scholarship and activity possible. Yet, the commentaries on the Gītā differ significantly.
Competing commentaries on the Gītā are interesting independent of their fidelity to the text (compare Ram-Prasad 2013). Yet, as readings, they differ in accuracy and relevance. If interpretation—explanation by way of what one believes—is the default method of reading philosophical texts, then we should expect the various commentaries on the Gītā from such philosophers to differ in accordance with the beliefs of the interpreter. Interpreted, there are as many accounts of the Gītā as there are belief systems of interpreters. The standard practice in philosophy (explication), however, employs logic to tease out the reasons for controversial conclusions, so that contributions can be placed within a debate. This allows philosophers who disagree to converge on philosophical contributions as contributions to a disagreement. Put another way, in allowing us to understand a text such as the Gītā in terms of its contribution to philosophical debate, an explicatory approach allows us to formulate divergent opinions about the substantive claims of such a text. (For more on the divergence between interpretation and explication, see Ranganathan 2021.) In the South Asian tradition, the term often used to capture the challenge of philosophical understanding of texts is mīmāṃsā: investigation, reflection. This idea parts ways with interpretation, as investigation is not an explanation of what one already believes, and it shares much more with the explicatory activity of philosophy. Not all traditional scholars were equally skilled in differentiating their beliefs from the challenge of accounting for the Gītā, however.
Śaṅkara, in his famous preamble to his commentary on the Brahma Sūtra, argues in general that ordinary reality as we understand it results from a superimposition of subjectivity and the objects of awareness, yielding an ersatz reality that operates according to definite natural regularities (compare the preamble in Śaṅkara [Ādi] 1994). Śaṅkara's view is hence an interpretive theory of ersatz reality: It is by this confusion of the perspective of the individual and what they experience that we get the world as we know it. According to Śaṅkara, hence, laudable teachings on dharma, from Kṛṣṇa too, help desiring individuals regulate their behavior within this ersatz reality—a position defended as "desireism" (Marks 2013). However, in his commentary on the Gītā, Śaṅkara claims that those who are interested in freedom (mokṣa) should view dharma as an evil (commentary at 4.21, Śaṅkara [Ādi] 1991), for dharma brings bondage. On the whole, Śaṅkara argues that the point of the Gītā is not a defense of engaged action (that is what Kṛṣṇa defends) but rather non-action—renunciation. The moment we have knowledge, the individual will no longer be able to engage in action (Gītā Bhāṣya 5)—a position reminiscent of the Sāṅkhya Kārikā (67). The Gītā's theme, in contrast, is that jñāna yoga provides insight that is ancillary to karma yoga and especially bhakti yoga. When Kṛṣṇa argues at the end (Gītā 18.66) that we should abandon all dharmas and approach him, this seems like a vindication of Śaṅkara's gloss. If Śaṅkara had adopted the policy of explication, Gītā 18.66 and other controversial claims in the text would have to be explained by general theories entailed by what is said everywhere else in the Gītā. Interpreters, in contrast, seize on claims in isolation—the ones that reflect their doxographic commitments—and use these as a way to make sense of a text, and that indeed seems to be Śaṅkara's procedure: Were he to elucidate a controversial claim such as we find at Gītā 18.66 by way of the rest of the text, he would have to take at face value the various positive declarations and endorsements of action by Kṛṣṇa. In short, it is unclear whether one could provide Śaṅkara's reading of the Gītā if one were not already committed to the position. Explicated (explained as an entailment of other theses expressed or entailed by the text), Gītā 18.66 as a controversial claim would not have the status of a first principle for reading the Gītā, but would rather be something to be logically explained by theories defended elsewhere in the text.
Rāmānuja, perhaps the most influential philosopher in India in the early twenty-first century, though not popularly thought to be so (Potter 1963: 252-253), endorses Kṛṣṇa's arguments for karma yoga and bhakti yoga, but is challenged at the end when Kṛṣṇa recommends that we abandon all dharmas and seek out Kṛṣṇa, for this contradicts the idea of Kṛṣṇa as the moral ideal, whom we please by way of dutiful action—an idea that Kṛṣṇa has some part in encouraging in the Gītā. So Rāmānuja argues in his Gītā Bhāṣya (commentary) on 18.66 (Rāmānuja 1991) that there are two possible ways of reading this ironic ending given everything that has preceded it in the Gītā: (a) Kṛṣṇa is recommending the abandonment of Deontology and the fixation on good rules (as we find set out in the secondary, Vedic literature) in favor of an ethics of devotion; or (b) Kṛṣṇa is providing encouragement to his friend, who is overwhelmed by an ethic of devotion, an encouragement that is itself consistent with the ethic of devotion. Rāmānuja's two readings are outcomes of attempting to apply the theories of karma yoga and bhakti yoga to this last claim. If one assumes karma yoga as filling out duties, then indeed, Kṛṣṇa seems to be rejecting this in some measure. If one assumes bhakti, then 18.66 seems to be a straight entailment of the theory. However, Rāmānuja does insist, in keeping with what is said elsewhere, that this last controversial claim cannot be read as an invitation to abandon all ethical action as such, as this does not follow from the preceding considerations. Doing one's duty is itself a means of devotion—an argument Kṛṣṇa delivers at the start of the Gītā—and so to this extent, duty cannot be abandoned without abandoning a means of devotion, not to mention that one's duty is something suitable to oneself that one can perfect. One can and should, however, abandon conventional moral expectations, also called dharmas. This criticism of conventional dharma is at the root of the motivation for the Gītā.
Modern commentators on the Gītā largely continue the tradition in the literature of interpreting South Asian thought: To interpret South Asian thought is to use one's beliefs in explaining the content of a text, and these beliefs are often derived from the interpreter's exposure to Western theories—unsurprising if the theory of interpretation is generated by the historical account of thought in the Western tradition (Ranganathan 2018a, 2018b, 2021). Scholars who interpret South Asian philosophy and the Gītā, given beliefs developed within the historical context of Western philosophy, will be inclined to read the Gītā in terms of the familiar options of the Western tradition. Here, Bhakti/Yoga is absent as a moral theory, and the main options are instead Virtue Ethics, Consequentialism, and Deontology. If we had to interpret the Gītā with these options, Kṛṣṇa's encouragement that those who stick to their duty and are devoted to him will meet a good outcome sounds rather like Rule Consequentialism—the idea that we should adopt a procedure that promises to bring a good result in the long term (Sreekumar 2012; Theodor 2010). Deontological interpretations, by authors such as Amartya Sen, have been floated and criticized (Anderson 2012). Explicated, we look instead to the perspective of the text itself for a theory that entails its various controversial claims, and we thereby bypass trying to assess the Gītā by way of familiar options in the Western tradition. One of the outcomes, of course, is the acknowledgment of a fourth moral theory: Bhakti/Yoga.
A further problem with this trend of seeing the Gītā via interpretation is the difficulty it causes for making sense of various claims Kṛṣṇa makes. For instance, explicated, the Gītā is a push for a procedural approach to morality that undermines the importance of the good, and of outcomes in general, for thinking about practice. One of the results of this dialectic is that Kṛṣṇa, as the regulative ideal, takes responsibility for outcomes as something that is out of the control of the individual. In the absence of the context of practical rationality, this seems like an argument for hard determinism (Brodbeck 2004). Yet, when we bring Bhakti/Yoga back into the picture, this is consistent with the devaluation of outcomes that comes in tow with thinking about morality as a matter of conforming to the procedural ideal: The ideal accounts for the outcomes, not our effort, so the closer we are to a procedural approach and bhakti, the better the outcomes, but the less we will be able to call upon personal effort as the explanation for those outcomes.
All things considered, reading the Gītā via interpretation renders it controversial, not merely in scope and topic (for all philosophy is controversial in this way) but also in terms of content—it is unclear what the text has to say, for the reading is determined in large measure by the beliefs of the interpreter. Yet, ironically, interpretation deprives us of the capacity to understand disagreement, as we can thereby only understand in terms of what we believe (and disagreement involves what we do not believe), so the controversy of the conflicting interpretations of the Gītā remains opaque. Explication, an explanation by way of the logic that links reasons to controversial conclusions, renders the content of controversy clear, and this also allows us to converge on a reading though we may substantively disagree with its content. The Gītā itself displays such explicatory talent, as it constitutes an able exploration of moral theoretical disagreement. Students of the text benefit from adopting its receptivity to dissent, both in being able to understand its contribution to philosophy and in terms of the inculcation of philosophical thinking.
12. References and Further Reading
The Aitareya Brahmanam of the Rigveda. 1922. Translated by Martin Haug. The Sacred Books of the Hindus, vol. 2. Allahabad: Sudhindra Nath Vas, M.B.
Alexander, Larry, and Michael Moore. 2012. "Deontological Ethics." In The Stanford Encyclopedia of Philosophy (Winter 2012 Edition), edited by Edward N. Zalta. http://plato.stanford.edu/archives/win2012/entries/ethics-deontological/.
Allen, Nick. 2006. "Just War in the Mahābhārata." In The Ethics of War: Shared Problems in Different Traditions, edited by Richard Sorabji and David Rodin, 138-49. Hants, England: Ashgate Publishing Limited.
Anderson, Joshua. 2012. "Sen and the Bhagavad Gita: Lessons for a Theory of Justice." Asian Philosophy 22, no. 1: 63-74.
Annas, Julia. 2004. "Being Virtuous and Doing the Right Thing." Proceedings and Addresses of the American Philosophical Association 78, no. 2: 61-75.
Brink, David Owen. 1989. Moral Realism and the Foundations of Ethics. Cambridge Studies in Philosophy. Cambridge; New York: Cambridge University Press.
Brink, David Owen. 1995. "Moral Realism." In The Cambridge Dictionary of Philosophy, edited by Robert Audi, 511-512. Cambridge; New York: Cambridge University Press.
Brodbeck, Simon. 2004. "Calling Kṛṣṇa's Bluff: Non-Attached Action in the Bhagavadgītā." Journal of Indian Philosophy 32, no. 1: 81-103.
Cabezón, José Ignacio. 2006. "The Discipline and its Other: The Dialectic of Alterity in the Study of Religion." Journal of the American Academy of Religion 74, no. 1: 21-38.
Clooney, Francis X. 2017. "Toward a Complete and Integral Mīmāṃsā Ethics: Learning with Mādhava's Garland of Jaimini's Reasons." In The Bloomsbury Research Handbook of Indian Ethics, edited by Shyam Ranganathan, 299-318. London: Bloomsbury Academic.
Dancy, Jonathan. 2017. “Moral Particularism.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta. https://plato.stanford.edu/archives/win2017/entries/moral-particularism/.
Davis, Richard H. 2015. The Bhagavad Gita: a Biography. Princeton: Princeton University Press.
Ellwood, R.S., and G.D. Alles. 2008. “Bhagavad-Gita.” In The Encyclopedia of World Religions: Facts on File.
Foot, Philipa. 2007. “The Problem of Abortion and The Doctrine of the Double Effect.” In Ethical Theory: An Anthology, edited by Russ Shafer-Landau, 582-589 Of Blackwell Philosophy Anthologies, edited by. Malden, MA: Blackwell Pub.
Fritzman, J. M. “The Bhagavadgītā, Sen, and Anderson.” Asian Philosophy 25, no. 4 (2015): 319-338.
Gandhi, M. K. 1969. The Collected Works of Mahatma Gandhi. New Delhi: Publication Division.
Goodman, Charles. 2009. Consequences of Compassion an Interpretation and Defense of Buddhist Ethics. Oxford: Oxford University Press.
Gottschalk, Peter. 2012. Religion, Science, and Empire: Classifying Hinduism and Islam in British India. Oxford: Oxford University Press.
Harzer, Edeltraud. 2017. “A Study in the Narrative Ethics of the Mahābhārata.” In The Bloomsbury Research Handbook of Indian Ethics edited by Shyam Ranganathan, 321-340. London: Bloomsbury Academic.
Hursthouse, Rosalind. 1996. “Normative Virtue Ethics.” In How Should One Live?, edited by Roger Crisp, 19–33. Oxford: Oxford University Press.
Hursthouse, Rosalind. 2013. Virtue Ethics. Edited by Edward N. Zalta. In The Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/archives/fall2013/entries/ethics-virtue/.
Jezic, Mislav. 2009. “The Relationship Between the Bhagavadgītā and the Vedic Upaniṣads: Parallels and Relative Chronology.” In Epic Undertakings, edited by R.P. Goldman and M. Tokunaga, 215-282: Motilal Banarsidass Publishers.
Kripke, Saul A. 1980. Naming and Necessity. Cambridge, Mass.: Harvard University Press.
Marks, J. 2013. Ethics Without Morals: In Defence of Amorality. Routledge.
McMahan, Jeff. 2009. Killing in War. Oxford: Oxford University Press.
Potter, Karl H. 1963. Presuppositions of India‘s Philosophies. In Prentice-Hall Philosophy Series. Englewood Cliffs, N.J.: Prentice-Hall.
Ram-Prasad, C. 2013. Divine Self, Human Self: The Philosophy of Being in Two Gita Commentaries. London: Bloomsbury.
Rāmānuja. 1991. Śrī Rāmānuja Gītā Bhāṣya (edition and translation). Translated by Svami Adidevanada. Madras: Sri Ramakrishna Math.
Ranganathan, Shyam. 2016a. “Hindu Philosophy.” In Oxford Bibliographies Online, edited by Alf Hiltebeitel. http://www.oxfordbibliographies.com/.
Ranganathan, Shyam. 2016b. “Pūrva Mīmāṃsā: Non-Natural, Moral Realism (4.14).” In Ethics 1, edited by S. Ranganathan. OfPhilosophy, edited by A. Raghuramaraju.
Ranganathan, Shyam (Ed.). 2017a. The Bloomsbury Research Handbook of Indian Ethics Of Bloomsbury Research Handbooks in Asian Philosophy. London: Bloomsbury Academic.
Ranganathan, Shyam. 2017b. “Patañjali’s Yoga: Universal Ethics as the Formal Cause of Autonomy.” In The Bloomsbury Research Handbook of Indian Ethics edited by Shyam Ranganathan, 177-202. London: Bloomsbury Academic.
Ranganathan, Shyam. 2017c. “Three Vedāntas: Three Accounts of Character, Freedom and Responsibility.” In The Bloomsbury Research Handbook of Indian Ethics, edited by Shyam Ranganathan, 249-274. London: Bloomsbury Academic.
Ranganathan, Shyam. 2018a. “Context and Pragmatics.” In The Routledge Handbook of Translation and Philosophy edited by Philip Wilson and J Piers Rawling, 195-208 Of Routledge Handbooks in Translation and Interpreting Studies edited by. New York: Routledge.
Ranganathan, Shyam. 2018b. Hinduism: A Contemporary Philosophical Investigation. In Investigating Philosophy of Religion. New York: Routledge.
Ranganathan, Shyam. 2018c. “Vedas and Upaniṣads.” In The History of Evil in Antiquity 2000 B.C.E. – 450 C.E., edited by Tom Angier, 239-255 Of History of Evil, edited by C. Taliaferro and C. Meister. London: Routledge.
Ranganathan, Shyam. 2019. “Just War and the Indian Tradition: Arguments from the Battlefield.” In Comparative Just War Theory: An Introduction to International Perspectives edited by Luis Cordeiro-Rodrigues and Danny Singh, 173-190. Lanham, MD: Rowman & Littlefield.
Ranganathan, Shyam. 2021. “Modes of Interpretation.” In Encyclopedia of Religious Ethics, edited by William Schweiker, David A. Clairmont and Elizabeth Bucar. Hoboken NJ: Wiley Blackwell.
Śaṅkara (Ādi). 1991. Bhagavadgita with the commentary of Sankaracarya. Translated by Swami Gambhirananda. Calcutta: Advaita Ashrama.
Śaṅkara (Ādi). 1994. The Vedānta Sūtras with the Commentary by Śaṅkara (Brahma Sūtra Bhāṣya). Translated by George Thibaut. In Sacred books of the East 34 and 38.2 vols.: http://www.sacred-texts.com/hin/sbe34/sbe34007.htm.
Sayre-McCord, Geoff. Spring 2015 Edition. “Moral Realism.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta: http://plato.stanford.edu/archives/spr2015/entries/moral-realism/.
Shafer-Landau, Russ. 2003. Moral Realism: a Defence. Oxford: Clarendon.
Soni, Jayandra. 2017. “Jaina Virtue Ethics: Action and Non-Action” In The Bloomsbury Research Handbook of Indian Ethics edited by Shyam Ranganathan, 155-176. London: Bloomsbury Academic.
Sreekumar, Sandeep. “An Analysis of Consequentialism and Deontology in the Normative Ethics of the Bhagavadgītā.” Journal of Indian Philosophy 40, no. 3 (2012): 277-315.
Theodor, Ithamar. 2010. Exploring the Bhagavad Gitā: Philosophy, Structure, and Meaning. Farnham: Ashgate.
Urmson, J.O. 1958. “Saints and Heroes.” In Essays in Moral Philosophy, edited by A. I. Melden, 198-216. Seattle,: University of Washington Press.
Yāmunācārya. 1991. “Gītārtha-saṅgraha.” In Śrī Rāmānuja Gītā Bhāṣya (edition and translation), translated by Svami Adidevanada, 1-8. Madras: Sri Ramakrishna Math.
Author Information
Shyam Ranganathan
Email: shyamr@yorku.ca
York University
Canada
Humility Regarding Intrinsic Properties
The Humility Thesis is a persistent theme in contemporary metaphysics. It is known by a variety of names, including, but not limited to, Humility, Intrinsic Humility, Kantian Humility, Kantian Physicalism, Intrinsic Ignorance, Categorical Ignorance, Irremediable Ignorance, and Noumenalism. According to the thesis, we human beings, and any knowers that share our general ways of knowing, are irremediably ignorant of a certain class of properties that are intrinsic to material entities. It is thus important to note that the term ‘humility’ is unrelated to humility as a moral virtue; it refers instead to the Humility theorist’s concession that our epistemic capabilities have certain limits and that certain fundamental features of the world therefore lie beyond them. According to many Humility theorists, our knowledge of the world does not extend beyond the causal, dispositional, and structural properties of things. However, things have an underlying nature that lies beyond these knowable properties: properties that are intrinsic to the nature of things, and which ground their existence and the causal-cum-structural features that we can know about. If any such properties exist, they do not fall within any class of knowable properties, and so it follows that human beings are unable to acquire any knowledge of them.
There are at least six questions regarding the Humility Thesis: (a) What exactly is the relevant class of properties? (b) Do such properties really exist? (c) Why are we incapable of knowing of them? (d) Is it true that we are incapable of knowing of them? (e) Even if we are incapable of knowing them, is this claim true only about our general ways of knowing things, with certain idiosyncratic exceptions? (f) How can this thesis be applied to other areas of philosophy and the history of ideas? This article explores some responses to these questions.
To begin with, the question of immediate concern regarding the Humility Thesis is the nature of the relevant class(es) of properties. Any further discussion is impossible without a proper characterisation of the subject matter. Furthermore, in order to understand the nature of these properties, a rough idea of why some philosophers believe in their existence is also required, for this helps us to understand the role these properties play in the ontological frameworks of those who posit them.
a. Terminological Variety
There is great terminological variety in the literature on Humility. Different authors discuss different properties: intrinsic properties (Langton 1998), categorical properties (Blackburn 1990; Smith & Stoljar 1998), fundamental properties (Lewis 2009; Jackson 1998), and so on. Very roughly, for our current purposes, the three mentioned kinds of properties can be understood as follows:
Intrinsic properties: Properties that objects have of themselves, independently of their relations with other things (for example, my having a brain).
Categorical properties: Properties that are qualitative rather than causal or dispositional, that is, not properties merely concerning how things causally behave or are disposed to behave (for example, shape and size).
Fundamental properties: Properties that are perfectly basic in the sense of not being constituted by anything else. (Contemporary physics tells us that mass, charge, spin, and the like are so far the most fundamental properties we know of, but it is an open question as to whether current physics has reached the most fundamental level of reality and whether it could ever reach it.)
Some authors also use the term ‘quiddities’ (Schaffer 2005; Chalmers 2012), which is taken from scholastic philosophy. The term historically stood for properties that made the object ‘what it is’, and was often used interchangeably with ‘essence’. In the contemporary literature on the Humility Thesis, it roughly means:
Quiddities: Some properties—typically intrinsic properties, categorical properties, or fundamental properties—that individuate the objects that hold them, and which ground the causal, dispositional, and structural properties of those objects.
At first glance, given the above characterisations, the claim that the Humility Thesis concerns these properties may seem confusing to some non-specialists. For the list above gave examples of intrinsic properties and of categorical properties, and clearly we have knowledge of those examples. Furthermore, it may seem possible that properties like mass, charge, and spin are indeed fundamental as current physics understands them, and it at least seems conceivable that physics may uncover more fundamental levels of reality in the future and thus eventually reach the most fundamental level. A Humility theorist will answer that we are not irremediably ignorant of all conceivable intrinsic, categorical, or fundamental properties, but only of a particular class of them. For example, Langton distinguishes comparatively intrinsic properties from absolutely intrinsic properties: comparatively intrinsic properties are constituted by causal and dispositional properties, or by structural properties, whereas absolutely intrinsic properties are not so constituted. Her thesis concerns absolutely intrinsic properties, not comparatively intrinsic ones (Langton 1998, pp. 60-62). When Lewis discusses our ignorance of fundamental properties, he explicitly states that in his own view fundamental properties are intrinsic and not structural or dispositional (Lewis 2009, p. 204, pp. 220-221n13), though he also thinks that Humility spreads to structural properties (see Section 1c for discussion).
With this in mind, despite the terminological variety, one possible way to understand the literature is that the main contemporary authors are in fact concerned with roughly the same set of properties (Chan 2017, pp. 81-86). That is, what these authors describe is often the same set of properties under different descriptions. More specifically, when authors discuss our ignorance of properties which they may describe as intrinsic, categorical, or fundamental, the relevant properties are most often properties that belong to all three families, not only some of them—even though our ignorance of these properties may spread to non-members of the families when they constitute them (see Section 1c for further discussion). Henceforth, for the sake of simplicity, this article will only use the term ‘intrinsic properties’, unless the discussion is about an author who is discussing some other kind of property. But the reader should be aware that the intrinsic properties concerned are of a specific narrower class.
b. The Existence and Characteristics of the Properties Concerned: An Elementary Introduction
A further question is whether and why we should believe in the existence of the relevant intrinsic properties. Answering this question also allows us to understand the role that these properties play in the ontological frameworks of those who believe in their existence. This question concerns the debate between categoricalism (or categorialism) and dispositionalism in the metaphysics of properties. The categorialist believes that all fundamental properties are categorical properties, which are equivalent to the kind of intrinsic properties discussed in this article (see Section 1a). By contrast, the dispositionalist believes that all fundamental properties are dispositional properties, without there being any categorical properties that are more fundamental. This section surveys some common, elementary motivations for categoricalism.
Importantly, the reader should be aware that the full scope of this debate cannot be encompassed within this article, so the survey below is elementary and includes only three common and simple arguments. There are further, often more technically sophisticated, arguments for and against the existence of categorical properties, some of which are influential in the literature (see the article on ‘properties’).
The three surveyed arguments are interrelated. Many philosophers believe that the most fundamental physical properties discovered by science, such as mass, charge, and spin, are dispositional properties: the measure of an object’s mass is a measure of how the object is disposed to behave in certain ways (such as those observed in experiments). The three arguments are, then, all attempts to show that dispositional properties lack self-sufficiency and cannot exist in their own right, and thereby require some further ontological grounds – which the categorialist takes to be categorical properties. Note, though, that there are also some categorialists who do not posit categorical properties as something further to and distinct from causal, dispositional, and structural properties. Rather, they take the latter properties to be property roles which have to be filled in by realiser properties, in this case categorical properties (Lewis 2009).
i. The Relational Argument
Causal and dispositional properties appear to be relational. Specifically, when we say that an object possesses certain causal and dispositional properties, we are either describing (1) the way that the object responds to and interacts with other objects, or (2) the way that the object transforms into its future counterparts. Both (1) and (2) are relational because they concern the relation between the object and other objects or its future counterparts. The problem is whether an object can merely possess such relational properties. Some philosophers do not think so. For them, such objects would be a mere collection of relations, with nothing standing in the relevant relations. This means that there are brute relations without relata; and this seems implausible to them (Armstrong 1968, p. 282; Jackson 1998, p. 24; compare Lewis 1986, p. x). Hence, they argue, objects involved in relations must have some nature of their own that is independent of their relations to other objects, in order for them, or the relevant nature, to be the adequate relata. The candidate nature that many philosophers have in mind for what could exist independently is categorical properties. It is important to note, though, that some dispositionalists believe that dispositions could be intrinsic and non-relational, and thus reject this argument (Borghini & Williams 2008; Ellis 2014). There are also philosophers who accept the existence of brute relations (Ladyman & Ross 2007).
ii. The Argument from Abstractness
The causal and dispositional properties we find in science are often considered geometrical and mathematical, and hence overly abstract. On the one hand, Enlightenment physics is arguably all about the measure of extension and motion of physical objects: extension is ultimately about the geometry of an object’s space-occupation, and motion is ultimately the change in an object’s location in space. On the other hand, contemporary physics is (almost) exhausted by mathematical variables and equations which reflect the magnitudes of measurements. These geometrical and mathematical properties have seemed too abstract to many philosophers. For these philosophers, the physical universe should be something more concrete: there should be something more qualitative and robust that can, to use Blackburn’s phrase, ‘fill in’ the space and the equations (Blackburn 1990, p. 62). If this were not the case, they argue, there would be nothing to distinguish the relevant geometrical and quantitative properties from empty space or empty variables that lack actual content (for examples of the empty space argument, see Armstrong 1961, pp. 185-187; Blackburn 1990, pp. 62-63; Langton 1998, pp. 165-166; for examples of the empty variable argument, see Eddington 1929, pp. 250-259; Montero 2015, p. 217; Chalmers 1996, pp. 302-304). In this case too, the candidate that these philosophers have in mind to ‘fill in’ the space and the equations is categorical properties. It is important to note, though, that some structuralist philosophers and scientists believe that the world is fundamentally a mathematical structure, and would presumably find this argument unappealing (Heisenberg 1958/2000; Tegmark 2007; cf. Ladyman & Ross 2007).
iii. The Modal Argument
Causal and dispositional properties appear to be grounded in counterfactual affairs. Specifically, it appears that objects can robustly possess their causal and dispositional properties even when those properties do not manifest themselves in the relevant behaviours. Consider the mass of a physical object. We may regard it as a dispositional property whose manifestations are inertia and gravitational attraction. Intuitively speaking, it seems that even when a physical object exhibits no behaviours related to inertia and gravitational attraction, it can nonetheless possess its mass. The question that arises is the following: what is the nature of such a non-manifest mass? One natural response is that its existence is grounded in the following counterfactual: in some near possible worlds where the manifestation conditions of the dispositional property are met, the dispositional property manifests itself in the relevant behaviours. But some philosophers find it very awkward and unsatisfactory that something actual is grounded in non-actual, otherworldly affairs (see, for example, Blackburn 1990, pp. 64-65; Armstrong 1997, p. 79). A more satisfactory response, for some such philosophers, is that dispositional properties are grounded in some further properties which are robustly located in the actual world. Again, the candidate that many philosophers have in mind is categorical properties (but see Holton 1999; Handfield 2005; Borghini & Williams 2008).
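To see the structure of the worry, it may help to set the ‘natural response’ above beside the simple conditional analysis of dispositions, a standard formulation from the wider literature rather than a quotation from any author cited here. Writing $S$ for the stimulus condition, $M$ for the manifestation, and $\Box\!\!\rightarrow$ for the counterfactual conditional:
\[
D(x) \;\leftrightarrow\; \big( S(x) \mathrel{\Box\!\!\rightarrow} M(x) \big)
\]
When $S(x)$ is false in the actual world, the right-hand side is made true, if at all, by goings-on in other possible worlds; the categorialist’s complaint is that the actual possession of the disposition then lacks an actual ground unless some categorical property supplies one.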
c. Extending the Humility Thesis
Before continuing, it is worth noting that our irremediable ignorance of the above narrow class of intrinsic properties may entail our irremediable ignorance of some further properties. Lewis, for example, holds a view that he calls ‘Spreading Humility’ (2009, p. 214). He argues that almost all structural properties supervene on fundamental properties, and since we cannot know the supervenience bases of these structural properties, we cannot have perfect knowledge of them either. That is, most properties that we are ignorant of are not fundamental properties. Lewis concludes that we are irremediably ignorant of all qualitative properties, regardless of whether they are fundamental or structural, at least under ‘a more demanding sense’ of knowledge (Lewis 2009, p. 214) (for further discussion, see Section 2). Of course, the Spreading Humility view requires a point of departure before the ‘spreading’ takes place. In other words, the basic Humility Thesis, which concerns a narrower class of properties, must first be established before one can argue that any ‘spreading’ is possible.
2. The Scope of the Missing Knowledge
Throughout the history of philosophy, it has never been easy to posit irremediable ignorance of something. For the relevant theorists seem to know that such things exist and how they relate to the known world, as in the case of unknowable intrinsic properties (see Section 1). Specifically, to say that there is a part of the world of which we are ignorant, we at least have to say that the relevant things exist. Furthermore, we only say that the relevant things exist because they bear certain relations to the known parts of the world, and thus help to explain the nature of the latter. But this knowledge appears inconsistent with the alleged irremediable ignorance of such things. This problem traces back to Friedrich Heinrich Jacobi’s famous objection to Kant’s idea of unknowable things-in-themselves (see Section 3c) (Jacobi 1787/2000, pp. 173-175; see also Strawson 1966, pp. 41-42). What adds to this complexity is that some Humility theorists go on to debate the metaphysical nature of the unknowable intrinsic properties, such as whether they are physical or fundamentally mental (see Sections 7a and 8). In order to avoid the above inconsistency, the Humility Thesis should be carefully framed. That is, the scope of our ignorance of intrinsic properties should be made precise.
There is at least one influential idea among contemporary Humility theorists: that the Humility Thesis concerns knowledge-which, to use Tom McClelland’s (2012) term (Pettit 1998; Lewis 2009; Locke 2009; McClelland 2012). More precisely, under Humility we are unable to identify a particular intrinsic property: when facing, say, a basic dispositional property D, we would not be able to tell which precise intrinsic property grounds it. For we are unable to distinguish the multiple possibilities in which different intrinsic properties do the grounding, and to tell which possibility is actual. For example, if there are two possible intrinsic properties I1 and I2 that could do the job, we would not be able to tell any difference and thereby identify the one that actually grounds D. This idea is based upon the multiple realisability argument for Humility, which is discussed in detail in Section 4c. By contrast, the sort of intrinsic knowledge discussed in the previous paragraph concerns only the characterisation, not the identification, of intrinsic properties, and is thus not the target of the Humility Thesis under this understanding.
Nonetheless, while the knowledge-which understanding of Humility may offer the required precision, it is by no means conclusive. Firstly, it invites objections to Humility which seek to show that the relevant knowledge-which, and so the knowledge-which understanding, are trivial (see Sections 5a and 5b). Secondly, many Humility theorists believe that intrinsic properties possess some unknowable qualitative natures apart from their exact identities (Russell 1927a/1992; Heil 2004). It remains unclear whether the knowledge-which understanding can fully capture the kind of Humility they have in mind (for further discussion, see Section 5b). Note that such unknowable qualitative natures are especially important to those Humility theorists who want to argue that certain intrinsic properties constitute human consciousness (see Sections 3d and 7) or other mysterious things (see Section 3a). Thirdly, the Humility theorists Rae Langton and Christopher Robichaud (2010, pp. 175-176) hold an even more ambitious version of the Humility Thesis. They argue that we cannot even know the metaphysical nature of intrinsic properties, such as whether or not they are fundamentally mental. Thus, they dismiss the knowledge-which understanding as too restricted and conservative (for further discussion, see Section 8).
In sum, the scope of Humility remains controversial, even among its advocates, and has attracted certain criticisms. In the following discussion, a number of problems surrounding the scope of Humility are explored.
3. A Brief History of the Humility Thesis
Like many philosophical ideas, the Humility Thesis has been independently developed by many thinkers from different traditions over the course of history. This section briefly explores some representative snapshots of its history.
a. Religious and Philosophical Mysticism
Ever since ancient times, the Humility Thesis and similar theories have played an important role in religious and mystical thought. However, most of the relevant thinkers did not fully embrace the kind of ignorance described by the Humility Thesis: they believed that such an epistemic limit is only found in our ordinary perceptual and scientific knowledge, but that it can be overcome by certain meditative or mystical knowledge.
A certain form of Hindu mysticism is paradigmatic of this line of thought. According to this view, there is an ultimate reality of the universe which is called the Brahman. The Brahman has been understood in a variety of ways within Hinduism, but a common line of understanding, found for example in the Upanishads, takes it as the single immutable ground and the highest principle of all beings. The Brahman is out of reach of our ordinary sensory knowledge. However, since we, like all other beings in the universe, are ultimately grounded in and identical to the Brahman, certain extraordinary meditative experiences—specifically the kind in which we introspect the inner, fundamental nature of our own self—allow us to grasp it (Flood 1996, pp. 84-85; Mahony 1998, pp. 114-121).
Arguably, the Brahman may be somewhat analogous to the unknowable intrinsic properties described by the Humility Thesis, for both are supposed to be the fundamental and non-dynamic nature of things, out of reach of our ordinary knowledge. Moreover, as we shall see, the idea that we can know our own intrinsic properties via introspection of our own consciousness has been independently developed by many philosophers, including a number working in the analytic tradition (see Sections 3d and 7). Of course, despite the possible similarities between the Brahman and intrinsic properties, their important differences should not be ignored: the former is unique and singular, and is also described by Hindu mystics as the cause of everything, rather than being non-causal. Furthermore, as mentioned above, Hindu understandings of the Brahman are diverse, and the aforementioned understanding is only one of them (see Deutsch 1969, pp. 27-45; see also the article on ‘Advaita Vedanta’).
There are certain Western theologies and philosophical mysticisms that resemble the above line of Hindu thought, such as those concerning the divine inner nature of the universe (for example, Schleiermacher 1799/1988) and the Schopenhauerian Will (Schopenhauer 1818/1966). According to these views, the ultimate nature of the universe, whatever it is, is likewise out of reach of our ordinary knowledge, but it can be known via some sort of introspection. Of course, the ultimate nature concerned may or may not be intrinsic, non-relational, non-causal, non-dynamic, and so on; this often depends on one’s interpretation. Nonetheless, there seem to remain some similarities with the Humility Thesis.
b. Hume
18th century Scottish philosopher David Hume is a notable advocate of the Humility Thesis in the Enlightenment period. Even though Hume is not a Humility theorist per se because he is sceptical of the existence of external objects—namely, objects that are mind-independent—let alone their intrinsic properties (T 1.4.2), he does take the Humility Thesis to be a necessary consequence of the early modern materialistic theory of matter, which he therefore rejects due to the emptiness of the resultant ontological framework (T 1.4.4).
Hume’s argument is roughly as follows. Early modern materialism takes properties like sounds, colours, heat, and cold to be subjective qualities which should be attributed to the human subject’s sensations, rather than to the external material objects themselves. This leaves material objects with only two kinds of basic properties: extension and solidity. Other measurable properties like motion, gravity, and cohesion, for Hume, are only about changes in the two kinds of basic properties. However, extension and solidity cannot be ‘possest of a real, continu’d, and independent existence’ (T 1.4.4.6). This is because extension requires simple and indivisible space-occupiers, but the theory of early modern materialism offers no such things (T 1.4.4.8). Solidity ultimately concerns relations between multiple objects rather than the intrinsic nature of a single object: it is about how an object is impenetrable by another object (T 1.4.4.9). Hume concludes that under early modern materialism we are in fact unable to form a robust idea of material objects.
c. Kant
Like Hume, 18th century German philosopher Immanuel Kant is another notable advocate of the Humility Thesis in the Enlightenment period. He makes the famous distinction between phenomena and things-in-themselves in his transcendental idealism. The idea of transcendental idealism is, very roughly, that all laws of nature, including metaphysical laws, physical laws, and special science laws, are in fact cognitive laws that rational human agents are necessarily subject to. Since things-in-themselves, which are the mind-independent side of things, must not be attributed any such subjective cognitive features, their nature must be unknowable to us (CPR A246/B303). We can only know of things as they appear to us subjectively as phenomena, under our cognitive laws such as space, time, and causality (CPR A42/B59).
It is important to note that Kant intends transcendental idealism to be a response to some philosophical problems put forward by his contemporaries, and that these philosophical problems are often not the concerns of contemporary Humility theorists. Examples include the subject-object problem and the mind-independent external reality problem put forward by Hume and Berkeley (CPR B274). Furthermore, it is also worth noting that Kant’s views have a variety of interpretations, for interpreting his views is never an easy task—his transcendental idealism is no exception (see the article on ‘Kant’). However, if the nature of things-in-themselves, being free from extrinsic relations to us the perceivers and other extrinsic relations we attribute to them (for example, spatiotemporal relations with other things), can be considered as the intrinsic properties of things, then transcendental idealism entails the Humility Thesis. In addition, no matter what the correct interpretation of Kant really is, Kant as he is commonly understood plays a significant and representative role in the history of the Humility Thesis from his time until now. Unlike Hume, who takes the Humility Thesis to be a reason for doubting the metaphysical theories that imply it, Kant takes the Humility Thesis to be true of the world—even though his German idealist successors like Fichte and Hegel tend to reject this latter part of his philosophy.
Finally, it is noteworthy that one of the most important texts in the contemporary literature on the Humility Thesis, Langton’s book Kantian Humility: Our Ignorance of Things in Themselves (1998), is an interpretation of Kant. In the book, Langton develops and defends the view that Kant’s Humility Thesis could be understood in terms of a more conventional metaphysics of properties, independently of Kant’s transcendental idealism. Specifically, she argues that Kantian ignorance of things-in-themselves should be understood as ignorance of intrinsic properties. The book and the arguments within are discussed in Sections 3f and 4a.
d. Russell
The pioneer of the Humility Thesis in analytic philosophy is one of the founding fathers of the tradition, Bertrand Russell. Historical studies of Russell’s philosophy show that Russell kept revising his views, and hence, like many of his ideas, his Humility Thesis reflects his views only during a certain period of his very long life (Tully 2003; Wishon 2015). Russell’s version of the Humility Thesis is found in, and was popularised by, his book The Analysis of Matter (1927). Like the Hindu mystics mentioned above, Russell is best described as a partial Humility theorist, for he also believes that some of those intrinsic properties which are unknowable by scientific means constitute our phenomenal experiences, and can thereby be known through introspecting such experiences.
Russell proposes a theory of the philosophy of mind which he calls psycho-cerebral parallelism. According to the theory, (1) physical properties are ‘causally dominant’, and (2) mental experiences are a part of the physical world and are ‘determined by the physical character of their stimuli’ (Russell 1927a/1992, p. 391). Despite this, our physical science has its limits. Its aim is only ‘to discover what we may call the causal skeleton of the world’ (p. 391, emphasis added); it cannot tell us the intrinsic character of matter. Nevertheless, some such intrinsic character can be known in our mental experiences because those experiences are one such character. As Russell remarks in a work published in the same year as The Analysis of Matter, ‘we now realise that we know nothing of the intrinsic quality of physical phenomena except when they happen to be sensations’ (1927b, p. 154, emphasis added).
Russell’s view that scientifically unknowable intrinsic properties constitute what we now describe as qualia is an influential solution to the hard problem of consciousness in the philosophy of mind, known today as ‘Russellian monism’. Before the mid-1990s, this view had already attracted some followers (see, for example, Maxwell 1978; Lockwood 1989, 1992) and sympathisers (see, for example, Feigl 1967), but it was often overshadowed by the dominant physicalist theories of mind (like the identity theory and functionalism). This situation ended with the publication of Chalmers’s book The Conscious Mind (1996), which has effectively promoted Russellian monism to a more general audience. Further discussion of contemporary Russellian monism is in Section 7.
e. Armstrong
Among the next generation of analytic philosophers after Russell, members of the Australian materialist school developed an interest in the problem of Humility as they inquired into the nature of material entities (Armstrong 1961, 1968; Smart 1963; Mackie 1973); and among them, David Armstrong is a representative advocate of the Humility Thesis (Armstrong 1961, 1968). Armstrong begins with the assumption that physical objects are different from empty space, and then investigates what sort of intrinsic properties of physical objects make the difference between them and empty space (1961, p. 185). He then makes use of an argument which, by his own acknowledgement, largely resembles Hume’s (Armstrong 1968, p. 282; see the argument in Section 3b) to conclude that no posited properties in the physicist’s theory can make the difference between physical objects and empty space. Unlike Hume, who is sceptical of the existence of physical objects, however, Armstrong is not a sceptic and thus believes that what makes the difference must be some properties additional to the physicist’s list of properties. What follows is that these properties must not be within the scope of current physics, and thus we have no knowledge of them.
It is important to note, though, that Armstrong accepts the Humility Thesis rather reluctantly. Accepting the Humility Thesis follows from his theory, and he sees this as a difficulty facing his theory of the nature of physical objects. He says he has no solution to this difficulty (1961, p. 190, 1968, p. 283). Hence, despite his belief that intrinsic properties are currently unknown, Armstrong does not go as far as to accept the now popular full-blown version of the Humility Thesis according to which intrinsic properties must be in principle unknowable (1961, pp. 189-190).
f. Contemporary Metaphysics
Here is a sketch of how the debate has panned out in the more recent literature. For a few decades, the Humility Thesis often figured as an epistemic complaint made by dispositionalists against categoricalism, such as the version of the view offered by Lewis (1986). For these philosophers, who take it that all fundamental properties are dispositional, the idea of there being more fundamental intrinsic properties implies that we are irremediably ignorant of the relevant properties. They argue that we should not posit the existence of things we cannot ever know about, and therefore that we should not posit the existence of intrinsic properties (see, for example, Shoemaker 1980, pp. 116-117; Swoyer 1982, pp. 204-205; Ellis & Lierse 1994).
Since the 1990s, there has been a trend among categorialists to respond positively to the problem of Humility: it has become their mainstream view that while the existence of intrinsic properties is necessary for the existence of matter, we cannot ever know about them. Blackburn’s short article (1990) pioneered this trend; it inspired Langton’s book Kantian Humility: Our Ignorance of Things in Themselves (1998), from which the term ‘Humility’ originated (Langton acknowledges this in her 2015, p. 106). While the book is meant to be an interpretation of Kant, Langton defends the view that Kant’s Humility Thesis can be understood independently of—and perhaps even as incompatible with—his transcendental idealism (Langton 1998, p. 143n7, 2004, p. 129). In addition, Langton argues that the thesis is very relevant to contemporary analytic metaphysics. While her interpretation of Kant is controversial and is often called ‘Langton’s Kant’, it is often considered as an independent thesis, and has attracted many sympathisers and engendered many discussions in the metaphysics of properties. Examples include discussions of Jackson’s (1998) ‘Kantian physicalism’, Lewis’s (2009) ‘Ramseyan Humility’, and Philip Pettit’s (1998) ‘noumenalism’. As Lewis remarks, ‘my interest is not in whether the thesis of Humility, as she conceives it, is Kantian, but rather in whether it is true’ (Lewis 2009, p. 203)—and he thinks that it is true.
4. Arguments for Humility
Some historically significant arguments for Humility were surveyed above; this section introduces the most influential arguments for Humility in the contemporary literature. While the arguments are discussed in turn, it is important to note that they are often taken to be interrelated. Furthermore, some influential authors, as discussed below, deploy combinations of them without claiming that the combined arguments would work if disassembled into separate arguments.
a. The Receptivity Argument
The receptivity argument is perhaps the most famous argument for Humility (see, for example, Russell 1912/1978; Langton 1998; Jackson 1998; Pettit 1998). Langton (1998) offers a particularly detailed formulation of it. The argument begins with the assumption that we know about things only through receptivity, in which the relevant things causally affect us (or our experimental instruments) and thus allow us to form adequate representations of them. For instance, Langton remarks that ‘human knowledge depends on sensibility, and sensibility is receptive: we can have knowledge of an object only in so far as it affects us’ (Langton 1998, p. 125). An upshot of this assumption is that we could have knowledge of whatever directly or indirectly affects us (p. 126). In light of this, since things affect us in virtue of their causal and dispositional properties, we can know of these properties.
However, the proponents of the receptivity argument continue, such a condition of knowledge would also impose an epistemic limitation on us: we will be unable to know of things that cannot possibly affect us. While things causally affect us in virtue of their causal and dispositional properties, as long as their intrinsic properties are another class of properties, there is a question as to whether we can know them. To answer this question, we must determine the nature of the relationship between things’ causal and dispositional properties and their intrinsic properties, and whether such a relationship allows for knowledge of intrinsic properties in virtue of the relevant causal and dispositional properties. If this is not the case, we need to determine whether this leads to an insurmountable limit on our knowledge. Jackson, for example, believes that the receptivity argument in the above form is incomplete. He argues that we may have knowledge of intrinsic properties—or, in his work, fundamental properties—via the causal and dispositional properties they bear, and that the receptivity argument in the above form can be completed by supplementing it with the multiple realisability argument (Jackson 1998, p. 23). This is discussed in detail in Section 4c.
For Langton, knowledge of intrinsic properties is impossible because causal and dispositional properties are irreducible to intrinsic properties in the sense that any of the former does not supervene on any of the latter (Langton 1998, p. 109). Nonetheless, the irreducibility thesis does not spell an end to this discussion. On the one hand, Langton elsewhere points out that the receptivity argument still works if there are instead necessary connections between the relevant properties—as long as they remain different properties (Langton & Robichaud 2010, p. 173). On the other hand, James Van Cleve worries that Langton’s argument from irreducibility is nevertheless incomplete, for a non-reductive relationship alone does not imply the impossibility of intrinsic knowledge (Van Cleve 2002, pp. 225-226). In sum, regardless of whether Langton’s irreducibility thesis is correct, there are some further questions as to whether or not we are receptive to intrinsic properties.
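The skeleton of the receptivity argument surveyed in this section can be regimented as follows; the notation is introduced here for illustration and is not Langton’s own formalism. Let $K(p)$ say that property $p$ is knowable by us, $A(p)$ that things can affect us in virtue of $p$, $D(p)$ that $p$ is a causal or dispositional property, and $I(p)$ that $p$ is an intrinsic property of the relevant class:
\[
\begin{array}{lll}
\text{(P1)} & \forall p\,\big(K(p) \rightarrow A(p)\big) & \text{receptivity: we know only what affects us} \\
\text{(P2)} & \forall p\,\big(A(p) \rightarrow D(p)\big) & \text{things affect us via causal/dispositional properties} \\
\text{(P3)} & \forall p\,\big(I(p) \rightarrow \neg D(p)\big) & \text{the relevant intrinsic properties are not causal/dispositional} \\
\text{(C)} & \forall p\,\big(I(p) \rightarrow \neg K(p)\big) & \text{from (P1)-(P3)}
\end{array}
\]
So regimented, the premises rule out only direct receptive knowledge of intrinsic properties; as the exchange between Langton, Jackson, and Van Cleve above indicates, whether some indirect route remains open is what the irreducibility thesis and the multiple realisability argument are meant to settle.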
b. The Argument from Our Semantic Structure
The second argument for Humility appeals to the ways in which the terms and concepts in our language are structured (see, for example, Blackburn 1990; Pettit 1998; Lewis 2009). Depending on particular formulations of the argument, the language concerned could be the language of our scientific theories or all human languages. Nonetheless, all versions of this argument share the common argumentative strategy according to which all terms and/or concepts found in the relevant language(s) capture only causal properties, dispositional properties, and structural properties but not intrinsic properties. The idea is that if our knowledge of the world is formulated by the language(s) concerned, then there will be no knowledge of intrinsic properties.
i. Global Response-Dependence
One version of this argument is developed by Pettit (1998). Note that his commitment to a Humility Thesis under the name of ‘noumenalism’ is also a reply to Michael Smith and Daniel Stoljar’s (1998) argument that his view implies noumenalism. In response to the argument, Pettit accepts noumenalism as an implication of his view (Pettit 1998, p. 130).
Pettit advocates a thesis called global response-dependence (GRD), which he considers to be an a priori truth about the nature of all terms and concepts in our language. According to GRD, all terms and concepts in our language are either (1) defined ostensively by the ways that their referents are disposed to causally impact on normal or ideal subjects in normal or ideal circumstances, or (2) defined by other terms and concepts which eventually trace back to those of the former kind (Pettit 1998, pp. 113-114). If this is so, then it follows that all terms and concepts are in effect defined dispositionally, with reference to their referents’ patterns of causal behaviour. If there are any non-dispositional properties that ground the dispositions, then, as Pettit remarks, ‘we do not know them in their essence; we do not know which properties they are’ (pp. 121-122).
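Schematically, and as an illustration rather than Pettit’s own formulation, GRD generalises the familiar response-dependent biconditional:
\[
\forall x\,\big(Cx \;\leftrightarrow\; \mathrm{Disp}(x, R, N)\big)
\]
where $C$ is any primitive concept and $\mathrm{Disp}(x, R, N)$ abbreviates ‘$x$ is disposed to produce response $R$ in normal or ideal subjects under normal or ideal circumstances $N$’. Since every primitive concept is fixed by some such dispositional clause, and every remaining concept is defined from the primitives, no concept ever reaches past the dispositions to their non-dispositional ground.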
Of course, there is a question as to whether GRD is an attractive thesis. It is controversial, and its validity is an independent open question that goes beyond the scope of this article. In Pettit’s case, he commits himself to an epistemology (Pettit 1998, p. 113) that is very similar to Langton’s receptivity thesis that is discussed in Section 4a.
ii. Ramseyan Humility
The most famous version of the argument from our semantic structure is developed by Lewis (2009), even though Blackburn offers an earlier rough sketch of the argument which appeals to the Lewisian semantic theory (Blackburn 1990, p. 63), and Pettit anticipates that such a theory would imply the Humility Thesis just as his GRD does (Pettit 1998, p. 128). The argument is based on the Ramsey-Lewis method of defining theoretical terms in scientific theories, which Lewis develops in his early article ‘How to define theoretical terms’ (1970), and which is in turn inspired by Frank Ramsey—this is why Lewis calls his version of the Humility Thesis ‘Ramseyan Humility’.
Lewis is a scientific realist. He asks us to suppose that there is a final scientific theory T about the natural world. In his view, theory T, like all other scientific theories, consists of O-terms and T-terms. O-terms are terms used in our older and ordinary language, which lies outside theory T; T-terms are theoretical terms that are specifically defined in theory T. Each T-term has to be defined holistically, in relation to other T-terms, by O-terms. The relevant relations include nomological and locational roles in theory T (Lewis 2009, p. 207). Some such nomological and locational roles named by T-terms would be played by fundamental properties, while Lewis assumes that none of these properties will be named by O-terms. He writes, ‘The fundamental properties mentioned in T will be named by T-terms. I assume that no fundamental properties are named in O-language, except as occupants of roles’ (p. 206). Although Lewis does not make it clear in his 2009 article why he assumes this, in other work (1972) he argues that the use of O-terms is to name and define nomological and locational relations.
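In outline, and simplifying the method Lewis develops in ‘How to Define Theoretical Terms’: write the theory as a single sentence $T[t_1, \dots, t_n]$, where $t_1, \dots, t_n$ are its T-terms and all remaining vocabulary is O-language. Replacing each T-term with a variable and prefixing existential quantifiers yields the Ramsey sentence of the theory:
\[
\exists x_1 \cdots \exists x_n\, T[x_1, \dots, x_n].
\]
The Ramsey sentence has the same O-language consequences as T itself: it says that the $n$ roles are jointly occupied, but it is silent on which properties occupy them. Whether more than one assignment of properties could satisfy it is the question on which Lewis’s argument turns.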
With the assumption that the roles played by intrinsic properties are identified solely by relational means, Lewis makes the following argument. While theory T is uniquely realised by a particular set of fundamental properties in the actual world, theory T is incapable of identifying such properties, namely individuating the exact fundamental properties that realise it. This is because, for theory T, fundamental properties are mere occupants of the T-term roles defined by O-terms (Lewis 2009, p. 215), which are, in turn, all about their nomological and locational roles. But then theory T is unable to tell exactly which fundamental property occupies a particular role (p. 215)—as Lewis remarks, “To be the ground of a disposition is to occupy a role, but it is one thing to know that a role is occupied, another thing to know what occupies it” (p. 204). Lewis has much more to say about his argument in relation to the multiple realisability argument, which he takes to be another indispensable core part of his argument, and which will be discussed in detail in section 4c.
Before we go on to the multiple realisability argument, there is again the further question as to why we should accept the underlying semantic theory of the argument—in this case the Ramsey-Lewis model of scientific theories. Indeed, some critics of Lewis’s Ramseyan Humility target the conceptual or scientific plausibility of the semantic theory (Ladyman & Ross 2007; Leuenberger 2010). Rather than a defence of an independent thesis, Lewis’s 2009 article seems to be an attempt to develop the Ramseyan Humility Thesis as a consequence of his systematic philosophy, which he has been developing for decades. In any case, taking into account the influence of the Lewisian systematic philosophy in contemporary analytic philosophy, its entailment of the Humility Thesis is of considerable philosophical significance.
c. The Multiple Realisability Argument
The multiple realisability argument is a particularly popular argument for Humility, and is endorsed by a number of Humility theorists regardless of whether they also offer independent arguments for Humility (see, for example, Lewis 2009; Jackson 1998; Yates 2018; see also Russell 1927a/1992, p. 390; Maxwell 1978, p. 399; Pettit 1998, p. 117). The basic idea is that the causal, dispositional, and structural properties of things with which we are familiar are roles. We can at best know that such roles have some intrinsic properties as their realisers, but we have no idea which intrinsic properties actually do the realising job. For these roles could also be realised by alternative possible sets of intrinsic properties, and we cannot distinguish the relevant possibilities from the actual ones.
As mentioned above, some authors such as Jackson and Lewis believe that their receptivity arguments or arguments from our semantic structure are themselves incomplete and have to be supplemented with the multiple realisability argument. For example, Jackson believes that our receptive knowledge is multiply realisable by different sets of fundamental properties (Jackson 1998; see also Section 4a); and Lewis believes that our final scientific theory is multiply realisable by different sets of fundamental properties (Lewis 2009; see also Section 4b). Multiple realisability is for them the reason why we cannot possibly know of intrinsic properties via our receptive knowledge or via the final scientific theory. Here we see that the multiple realisability argument is often considered as an indispensable component of more complex arguments.
Whereas certain formulations of the multiple realisability argument appeal to metaphysical possibilities (Lewis 2009), Jonathan Schaffer—a critic of the argument—argues that epistemic possibilities alone suffice to make the argument work, since its aim is to determine the nature of our knowledge (Schaffer 2005, p. 19). Hence, the argument cannot be blocked by positing a metaphysically necessary link between intrinsic properties and their roles that eliminates the metaphysical possibilities suggested by the proponents of the argument.
Lewis and Jackson offer detailed discussion of how some forms of multiple realisation are possible. Three corresponding versions of the multiple realisability argument are briefly surveyed in turn below.
The permutation argument is offered by Lewis (2009). It begins with the assumption that the laws of nature are contingent (p. 209). Lewis argues that a scenario in which the realisers of two actual dispositional roles are swapped will not change anything else, including the nomological roles they play and the locations they occupy. Hence, a permutation of realisers is another possible realisation of our scientific theory. Since our science cannot distinguish between the actual realisation of nomological roles and its permutations, we do not know which properties it consists of.
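Schematically, in the notation introduced above (a sketch, not Lewis’s own exposition): suppose the actual realisation assigns fundamental properties $P_1, \dots, P_n$ to the theory’s $n$ roles, so that $T[P_1, \dots, P_n]$ holds. Given the contingency of the laws, for any non-trivial permutation $\sigma$ of $\{1, \dots, n\}$ there is a possible world in which
\[
T[P_{\sigma(1)}, \dots, P_{\sigma(n)}]
\]
holds instead: the same roles are occupied in the same nomological and locational pattern, but by swapped occupants. Both scenarios verify the same Ramsey sentence and hence the same O-language evidence, so the evidence cannot tell us which assignment is actual.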
The replacement argument is also offered by Lewis (2009). Unlike the permutation argument, this argument does not appeal to an exchange of roles. Instead, it begins with the assumption that the realisers of dispositions are replaceable by what Lewis calls idlers and aliens. Idlers are among the fundamental properties within the actual world, but they play no nomological role; and aliens are fundamental properties that only exist in nonactual possible worlds (p. 205). Multiple realisability then follows. Again, Lewis argues that replacing the realisers of the actual nomological roles with idlers and aliens will not change anything else; what we have is simply other possible realisations of our scientific theory. And again, since our science cannot distinguish between the actual realisation of nomological roles and its replacements, we do not know which properties realise these roles in the actual world.
The succession argument is offered by Jackson (1998). The argument appeals to the possibility of two distinct fundamental properties realising the same nomological role in our science in succession (Jackson 1998, pp. 23-24). For Jackson, it is impossible for our science to tell whether or not this possibility is actualised; specifically, it is impossible for our science to tell whether the nomological role is actually realised by one property or by two in succession. This reveals that we do not know which property actually plays the nomological role.
5. Arguments against Humility
We have seen above some influential arguments for Humility in the literature. In what follows, the main arguments against the thesis will be surveyed.
a. The Objection from Reference-Fixing
An immediate objection considered by Pettit and Lewis in their defence of Humility is the objection from reference-fixing (Pettit 1998, p. 122; Lewis 2009, p. 216; but see Whittle 2006, pp. 470-472). The idea is that we can refer to an intrinsic property as the bearer of a dispositional property, and thereby identify it and know of it. For example, when asked what the bearer of dispositional property D is and whether we have knowledge of it, we may respond in the following way: ‘The bearer of D is whatever bears D; and we know that it bears D.’
Unsurprisingly, Pettit and Lewis are not convinced. Pettit responds, ‘under this picture it is no surprise that we are represented as knowing those very properties, not in their essence, but only their effects’ (Pettit 1998, p. 122). Lewis, on the other hand, dismisses the objection as ‘cheating’ (Lewis 2009, p. 216). Consider the answer concerning the bearer of dispositional property D above. On Lewis’s view, while that answer is undoubtedly true, we simply have no idea which particular proposition is expressed by the answer. Some of the relevant issues are discussed in Section 2.
Ann Whittle, an advocate of the objection from reference-fixing, argues that Humility theorists like Lewis set an unreasonably high bar for identification (Whittle 2006, pp. 470-472; but see Locke 2009, p. 228). For it seems that, in the case of our ordinary knowledge, we typically identify things in virtue of their effects on us and their connections to us. For example, when we have knowledge of the historical figure Napoleon, we identify him via the great things he did and the spatiotemporal connections he has with us. By contrast, it is difficult for us to single out a particular person across possible worlds, as Lewis’s condition requires, for someone else might have done the same things as Napoleon did. And if we allow for knowledge about Napoleon according to our ordinary conditions of identification, there seems to be no reason not to allow for knowledge of intrinsic properties under the same conditions.
b. The Objection from Vacantness
The objection from vacantness is developed by Whittle (2006, pp. 473-477; but see Locke 2009, p. 228). This objection specifically targets Humility theorists like Lewis and Armstrong. According to Whittle, Lewis and Armstrong hold the background belief that fundamental intrinsic properties are so simple and basic as to be featureless in themselves, with the sole exception of their bare identities. With this in mind, the only interesting nature of these properties is their being bearers of causal, dispositional, or structural properties, and nothing else. If this is so, we are not actually going to miss out on anything even if we grant that the Humility Thesis is true. Lewis’s and Armstrong’s Humility theses, then, at best imply that we are ignorant of the bare identities of intrinsic properties. Hence, ‘there is no reason to regard it as anything more than a rather esoteric, minimal epistemic limitation’ (p. 477).
While Whittle’s charge of esotericism is debatable, it is noteworthy that her interpretation of Lewis’s and Armstrong’s metaphysical frameworks is shared by some other philosophers (Chalmers 2012, p. 349; Stoljar 2014, p. 26)—Chalmers, for example, calls such a framework a ‘thin quiddity picture’ (Chalmers 2012, p. 349). Nonetheless, it is also important to note that, as these philosophers point out, there are some alternative versions of the Humility Thesis which count as ‘thick quiddity pictures’, according to which intrinsic properties have substantial qualities (for example, Russell 1927a/1992; Heil 2004).
c. The Objection from Overkill
The Humility Thesis is an attempt to draw a very specific limit to our knowledge: its aim is to show that knowledge of intrinsic properties is impossible while other knowledge remains possible. Specifically, if we can know of intrinsic properties, then the thesis fails; but if the purported ignorance goes too far and applies equally to our ordinary knowledge, then the resultant scepticism would render the thesis trivial and implausible. For one thing, if we are ignorant of everything, then ignorance of intrinsic properties is no longer a distinctive epistemic limitation. For another, scepticism seems to be an unacceptable conclusion which should be avoided.
The objection from overkill, then, is that this specific boundary cannot be drawn: the claim is that there are no good arguments that favour the Humility Thesis but exclude scepticism of some other kind; a possible further claim is that there is no way to avoid this wider scepticism without rendering the Humility Thesis weak or erroneous (Van Cleve 2002; Schaffer 2005; Whittle 2006; Cowling 2010; cf. Langton 2004; but see Locke 2009). For example, Van Cleve argues that Langton’s argument from receptivity and irreducibility is too strong and must have something wrong with it. For if Hume is correct that causal laws are not necessary, then nothing necessitates things’ effects on us; that is, these effects are irreducible to the intrinsic properties of the relevant things. But if we follow Langton’s argument, then such irreducibility means that we know nothing (Van Cleve 2002, pp. 229-234). Schaffer argues that Lewis’s appeal to the distinction between appearance and reality, and the multiple realisability of appearance, is shared by external world sceptics (Schaffer 2005, p. 20). In addition, Schaffer argues that the standard responses to external world scepticism such as abductionism, contextualism, deductionism, and direct realism apply equally to the Humility Thesis (pp. 21-23).
In response, a possible counter-strategy would be to argue that standard responses to scepticism do not apply to the Humility Thesis (Locke 2009). For example, Dustin Locke argues that when we reason abductively, we identify the distinguishing features of competing hypotheses, and thereby pick out the best hypothesis among them. But the different intrinsic realisations of our knowledge considered by the multiple realisation argument exhibit no such distinguishing features (Locke 2009, pp. 232-233).
6. Alternative Metaphysical Frameworks
Rather than offering straightforward arguments against Humility, some critics instead develop alternative metaphysical frameworks to the Humility Thesis and the kind of categoricalism that underlies it (see Section 1b). These alternative frameworks, if true, undercut the possibility of there being unknowable intrinsic properties. In what follows, some such metaphysical frameworks are surveyed.
a. Ontological Minimalism: Appealing to Phenomenalism, Dynamism, or Dispositionalism
Philosophers have a very long tradition of avoiding ontological commitments to unobservables and unknowables, such as substance, substratum, Kantian things in themselves, the intrinsic nature of things, and the divine ground and mover of everything. Among these philosophers, phenomenalists and idealists have taken perhaps the most extreme measure: with a few exceptions, anything beyond immediate mental phenomena is eliminated (Berkeley 1710/1988; Hume 1739/1978; Mill 1865/1996; Clifford 1875/2011; Mach 1897/1984). Among such approaches to ontology, a phenomenalism that rejects matter is indeed Hume’s response to the Humility problem: he is altogether sceptical about the existence of matter, together with its unknowable intrinsic nature (Hume 1739/1978; see Section 3b). In other words, while Hume agrees that if matter exists then we are ignorant of its intrinsic nature, he does not believe there is such a thing in the world for us to be ignorant of.
Although many other philosophers regard phenomenalism and idealism as far too radical, the ontological minimalist attitude is nonetheless available to philosophers with a more realist and naturalist stance. The idea is that if the dynamics of things—their motions, forces, dynamic processes, relational features, and so forth—are their only scientifically accessible features, then we should attribute to them only such features and no further mysterious features. Moreover, we should identify these former features as their sole natures. This minimalist, dynamist line of thought is common among modern scientific naturalists—philosophers and scientists alike (see, for example, Diderot 1770/1979; d’Holbach 1770/1820, Pt. I, Ch. 2; Faraday 1844; Nietzsche 1887/2006, Ch. 1.3; Schlick 1925b/1985, Pt. III.A; see also a discussion of Michael Faraday’s dynamism and its contemporary significance in Langton & Robichaud 2010, pp. 171-173).
The most prominent incarnation of dynamism in contemporary metaphysics is dispositionalism—the idea that all fundamental properties are dispositional properties (see Section 1b). Contemporary dispositionalists have independently discovered the ontological minimalist attitude in their debates with their rivals, the categorialists, who believe that all fundamental properties are intrinsic, categorical properties. The interesting fact here is that the mainstream dispositionalists and categorialists in contemporary metaphysics actually agree about Humility: many from both sides agree that if intrinsic properties of the kind described by categoricalism exist, then we are irremediably ignorant of them (Shoemaker 1980, pp. 116-117; Swoyer 1982, pp. 204-205; Ellis & Lierse 1994, p. 32; Hawthorne 2001, pp. 368-369; Black 2000, pp. 92-95; Bird 2005, p. 453; Ney 2007, pp. 53-56; see also Whittle 2006, pp. 485-490 for a related argument). However, whereas the categorialists concede such ignorance, the dispositionalists argue that we should not believe in the existence of something we simply cannot know about. Put simply, the dispositionalists agree with the categorialists that categoricalism implies the Humility Thesis, but they take this as good reason for rejecting categoricalism.
There are at least two issues here related to a dispositionalism that grounds such an ontological minimalist attitude. The first issue can be considered in light of Lewis’s question: ‘Why should I want to block [the Humility argument]? Why is Humility “ominous”? Whoever promised me that I was capable in principle of knowing everything?’ (Lewis 2009, p. 211). The minimalist dispositionalists need some epistemic principle to justify their minimalist attitude. Some appeal to broadly a priori epistemic principles according to which we should not posit anything that cannot contribute to our knowledge (Shoemaker 1980, pp. 116-117; Swoyer 1982, pp. 204-205; Black 2000, pp. 92-95; Bird 2005, p. 453). Others hold a more scientific attitude according to which our ontological posits should not go beyond science, together with the assumption that all properties mentioned in science are dispositional (Hawthorne 2001, pp. 368-369; cf. Ellis & Lierse 1994, p. 32; Ney 2007, pp. 53-56).
The second issue is that the status of the ontological minimalist argument is one of the many questions in the debate between categoricalism and dispositionalism. Hence, it seems that the argument must be considered alongside other arguments—such as the ones mentioned in Section 1b—when choosing between the two views.
b. Physics and Scientific Eliminativism
The renowned physicist Werner Heisenberg took Humility to be a consequence of atomistic theories of the kind defended by the Ancient Greek philosopher Democritus—such theories cannot possibly offer a more fundamental description of the atom than one in terms of the atoms’ motions and arrangements (Heisenberg 1958/2000, pp. 34-35). However, like many other early- to mid-20th century scientists and philosophers (for example, Whitehead 1925/1967; Schlick 1925a/1979), Heisenberg argued that such a conception of matter is old-fashioned and incompatible with contemporary physics. On his view, quantum mechanics has provided us with a novel metaphysical worldview: the ‘thing in itself’ of a particle is a mathematical structure (Heisenberg 1958/2000, p. 51; but see Eddington 1929).
In contemporary metaphysics, the idea that quantum mechanics leads to a scientific eliminativism about intrinsic properties is defended by James Ladyman and Don Ross (2007). Specifically, Ladyman and Ross argue that their ontic structural realism (OSR) is a theoretical framework that is better than categoricalism and the Humility Thesis—including Langton’s, Lewis’s, and Jackson’s versions (Ladyman & Ross 2007, p. 127n53)—and should thus simply replace them. OSR is the view that the relational structure of the world is ontologically fundamental, and that the world does not consist of individuals bearing intrinsic properties. The identity and individuality of objects, on their view, depend only on the relational structure.
OSR is developed from an analysis of empirical science, especially quantum physics, where quantum particles are found not to have exact space-time locations. On Ladyman and Ross’s view, quantum particles and fields should be given a non-individualistic interpretation in which concepts of individual objects are eliminated (p. 140). We come to have our ordinary concepts of individual objects only because of the distinguishability or discernibility of things, not due to their objective individuality (p. 134). With this in mind, Ladyman and Ross argue that the standard assumptions in metaphysics are all challenged by OSR. They list these assumptions as follows:
(i) There are individuals in space-time whose existence is independent of each other. Facts about the identity and diversity of these individuals are determined independently of their relations to each other.
(ii) Each has some properties that are intrinsic to it.
(iii) The relations between individuals other than their spatio-temporal relations supervene on the intrinsic properties of the relata (Humean supervenience).
(iv) The Principle of the Identity of Indiscernibles is true, so there are some properties (perhaps including spatio-temporal properties) that distinguish each thing from every other thing, and the identity and individuality of physical objects can be accounted for in purely qualitative terms. (Ladyman & Ross 2007, p. 151)
Unsurprisingly, for Ladyman and Ross, Lewis and Jackson are simply traditional metaphysicians who assume the existence of individuals with intrinsic natures, but ‘our best physics puts severe pressure on such a view’ (Ladyman & Ross 2007, p. 154).
In sum, scientific eliminativists, much like the ontological minimalists discussed in Section 6a, refuse to posit the existence of unknowable intrinsic properties. However, they do not do so only because of ontological parsimony; rather, they believe that categoricalism and the Humility Thesis are attached to some old-fashioned, prescientific worldview, and that our best science has turned out to offer a different, more advanced worldview which simply makes no such commitments. The key is replacement rather than curtailment. It is important to note that Ladyman and Ross’s OSR and their scientific eliminativism are both philosophical interpretations of physics rather than part of the physical theories themselves, and it remains open to debate whether these interpretations are the best ones (compare Eddington 1929; Chalmers 1996).
c. Rationalist Categoricalism
Different from the previous two metaphysical frameworks, the third alternative metaphysical framework to the Humility Thesis is a variant rather than a denial of categoricalism. This view might be called a ‘rationalist categoricalism’, following J. L. Mackie’s use of the term ‘rationalist view’ to describe a particular response to Humility, though he rejects the view (Mackie 1973, p. 149). According to this view, intrinsic properties not only exist but could, against the Humility Thesis, also be properly described by our best physical theories or their successors (Smart 1963; Ney 2015; Hiddleston 2019).
Let us suppose that our current physical theories are final and that some of these theories have reached the most fundamental level possible. The rationalist categorialist argues that what the physicist calls fundamental properties, such as mass and charge, are attributed to objects as their intrinsic properties, not as dispositional properties. There is of course no doubt that, as pointed out by the proponents of the receptivity argument for Humility, we always discover the properties of an object in experiments and observations, which means that we measure these properties via their causal effects. Nonetheless, while the properties are measured and defined causally in terms of the relevant dispositions, they could in themselves be intrinsic and categorical properties. For the properties should not be identified with the means of measurement, but rather should be understood as something revealed by them.
Mackie, though not a friend of rationalist categoricalism, nicely illustrates the rationalist categorialist’s interpretation of the relation between mass and its relevant dispositions—which are, presumably, the active gravitational force, the passive gravitational force, and inertia—in the following passage:
Someone who takes what I have called a rationalist view will treat mass as a property which an object has in itself, which is inevitably a distinct existence from most of the force-acceleration combinations which would reveal it, and yet whose presence entails all the conditionals connecting resultant force with acceleration. (Mackie 1973, p. 149)
And in response to the receptivity argument for Humility, J. J. C. Smart, a sympathiser of rationalist categoricalism, argues that the Humility theorist commits herself to verificationism of some kind:
We could explore the possibility of giving a theory of length, mass, and so on, as absolute and not relational.… We do indeed test propositions about length relationally, but to go on to say that length is purely relational is to be unduly verificationist about meaning. (Smart 1963, p. 74)
Unlike mainstream categoricalism and dispositionalism, rationalist categoricalism remains a minority view, but it has nonetheless attracted some serious sympathisers (Smart 1963; Ney 2015; Hiddleston 2019).
7. The Humility Thesis and Russellian Monism
As was mentioned in the discussion of Russell’s view on Humility in Section 3d, Russell developed a peculiar mind/body theory which is now called Russellian monism (for another pioneer of Russellian monism, see Eddington 1929), and this view has recently gained considerable traction and many followers in the philosophy of mind. The current version of the view is typically framed as a solution to the hard problem of consciousness: our consciousness has a particular kind of feature, namely qualia, which seems persistently to resist physical explanation (Chalmers 1995; see also the article on ‘The hard problem of consciousness’). Qualia are the ‘subjective feels’, ‘phenomenal qualities’, or ‘what it is like’ for a conscious subject to have certain experiences. Russellian monism, then, is the view that those unknowable intrinsic properties described by the Humility Thesis play a role in the constitution of our qualia.
Apart from its own significance in the philosophy of mind, Russellian monism also has a complex relationship with the Humility Thesis. On the one hand, it is developed from the Humility Thesis, since it makes use of the unknowable intrinsic properties described by the thesis to account for qualia. On the other hand, it is sometimes considered and developed as a response to the Humility Thesis, since it opens up the possibility that we may know certain intrinsic properties by introspecting our own qualia.
In what follows, some ontological issues surrounding the constitution of qualia by intrinsic properties are surveyed. Following that, the epistemic issues surrounding introspective knowledge of intrinsic properties are surveyed.
a. Constitution: from Intrinsic Properties to Qualia
To begin with, there is a question as to why someone would be attracted to the view that unknowable intrinsic properties play a role in the constitution of our qualia. The attraction traces back to the reason why many philosophers think that the hard problem of consciousness is particularly hard to solve. For these philosophers, qualia seem intrinsic and non-causal—it is conceivable that two people might have different qualia, but still exhibit exactly the same neurophysiological and behavioural responses—and thus the standard physical properties, which seem causal, cannot possibly account for qualia (Levine 1983, 2001; Chalmers 1995, 1996, 2003; Kim 2005; Goff 2017; Leibniz 1714/1989; Russell 1927b). But if intrinsic properties of the kind described by the Humility Thesis are likewise intrinsic and non-causal, then it seems that they can be a part of a good explanation of qualia (Russell 1927b; Goff 2017). Furthermore, the use of intrinsic properties in explaining qualia—unlike most other alternatives to classical physicalism, such as substance dualism—avoids positing idiosyncratic entities which appear to be in conflict with a unified, elegant, and scientifically respectable ontological framework (Chalmers 1996, pp. 151-153; Heil 2004, pp. 239-240; Seager 2009, p. 208; Stoljar 2014, p. 19; Goff 2017).
Russellian monists disagree on what intrinsic properties have to be like in order for these properties to be the constituents of qualia. This disagreement yields a variety of versions of Russellian monism, of which there are at least four major ones: (1) Russellian neutral monism, (2) Russellian panpsychism, (3) Russellian panprotopsychism, and (4) Russellian physicalism. (1) Russellian neutral monism is endorsed by Russell. According to this view, intrinsic properties are neither physical nor mental, but are neutral between the two (Russell 1921/1922; Heil 2004). (2) For the Russellian panpsychist, intrinsic properties that constitute our qualia must themselves also be qualia, albeit smaller in scale. Since such intrinsic properties are presumably found in fundamental physical entities such as electrons, up quarks, down quarks, gluons, and strings, the Russellian panpsychist also accepts that such entities possess qualia. This leads to a commitment to panpsychism, the view that mental properties are ubiquitous (Seager 2009). (3) Russellian panprotopsychism is a view similar to Russellian panpsychism, but it denies that the intrinsic properties that constitute qualia must also be some kind of qualia. Rather, it takes these microscale properties to be ‘proto-qualia’, which are similar in nature to qualia (compare Chalmers 1996, 2015). (4) Finally, for the Russellian physicalist, intrinsic properties should be counted as physical due to their being possessed by physical entities like electrons. Russellian physicalists also disagree with Russellian panpsychists and Russellian panprotopsychists that the raw materials of qualia must themselves be qualia or be similar to qualia, and so distance themselves from panpsychism and panprotopsychism (Stoljar 2001; Montero 2015; see also Section 8). Due to the recent popularity of Russellian monism, the above views are all ongoing research programs. Some readers may notice the striking similarity between Russellian panpsychism and some ancient and pre-modern panpsychist views mentioned in Section 3a. Perhaps surprisingly, then, the Humility Thesis provides room for panpsychist views to persist.
Nonetheless, the use of the Humility Thesis and the relevant intrinsic properties in accounting for qualia leads to a number of related discussions. Firstly, philosophers disagree on whether it really provides a good explanation of qualia. For one thing, it is questionable whether an explanation that appeals to an unknowable explanans could do any real explanatory work (Majeed 2013, pp. 267-268). Some Russellian monists, in response, argue that our theory of mind should not only aim at explanatory success according to scientific standards, but should also aim at truth (Goff 2017). For another, intrinsic properties may seem to be an adequate and attractive explanans only under the intuitive assumption that qualia are intrinsic and non-causal, but not everyone agrees that consciousness studies should hold onto such intuitive assumptions. And if such assumptions are revisable, then it might be less obvious that intrinsic properties are an adequate explanans of qualia (Chan & Latham 2019; compare Churchland 1996). Of course, there is an old debate in the philosophy of mind as to whether or not our intuitive assumptions concerning qualia are accurate—and whether or not they are accurate enough to support non-naturalistic theories of mind (Levine 1983, pp. 360-361; Chalmers 1997, 2018; contra Churchland 1988, 1996; Dennett 1991, pp. 68-70).
Secondly, if it is the case that the intrinsic properties of physical entities constitute qualia, then the relevant intrinsic properties are supposedly those of fundamental physical entities such as electrons, up quarks, down quarks, gluons, and strings, or those that play the roles of basic physical properties such as mass and charge. But this leads to the question—which is often called ‘the combination problem’—as to how such microphysical intrinsic properties can ever combine into our qualia, which appear to be macro-scale entities (Hohwy 2005; Goff 2006; Majeed 2013; Chalmers 2017; Chan 2020b). In response, Goff (2017) makes use of Humility, and thereby argues that the bonding of intrinsic properties is likewise beyond our grasp. Other sympathisers of Russellian monism argue that all theoretical frameworks in philosophy of mind need further development: Russellian monism is no exception, and thus should not be expected to be capable of accounting for every detail of how our mind works (Stoljar 2001, p. 275; Alter & Nagasawa 2012, pp. 90-92; Montero 2015, pp. 221-222).
Thirdly, there seems to be a gap between intrinsic properties and causal and dispositional properties in the Humility Thesis: the intrinsic properties make no substantive contribution to the causal makeup of the world apart from grounding it. For many, the use of the Humility Thesis in explaining qualia means that this gap is inherited by the mind/body relation in Russellian monism—namely, the qualia constituted by the intrinsic properties will not be the causes of our cognitive activities and bodily behaviours. This, in turn, means that Russellian monism ultimately collapses into epiphenomenalism (Braddon-Mitchell & Jackson 2007, p. 141; Howell 2015; Robinson 2018; compare Chan 2020a). For most contemporary philosophers of mind, the epiphenomenalist idea that our phenomenal consciousness possesses no causal profile and cannot cause our cognitive activities and bodily behaviours is very implausible. If these philosophers are correct, and if Russellian monism makes the same commitment, then it is equally implausible. In response, some sympathisers of Russellian monism argue that there is a more intimate relationship between intrinsic properties and causal and dispositional properties, and that this relationship makes intrinsic properties causally relevant or efficacious (Chalmers 1996, pp. 153-154; Seager 2009, pp. 217-218; Alter & Coleman 2020).
b. Introspection: from Qualia to Intrinsic Properties
Russellian monism also allows for a possible response to Humility which traces back to ancient religious and philosophical mysticism: the idea that if intrinsic properties constitute our qualia, then we may know of the former via introspection of the latter. The idea is taken seriously by a number of prominent Humility theorists (Blackburn 1990, p. 65; Lewis 2009, pp. 217-218; Langton & Robichaud 2010, pp. 174-175), and is also discussed by some Russellian monists (Russell 1927b; Maxwell 1978, p. 395; Heil 2004, p. 227; Rosenberg 2004; Strawson 2006).
There are currently two major proposals regarding how the introspection of intrinsic properties may work. The first might be called a Schopenhauerian-Russellian identity thesis. The thesis is developed by Russell, and an earlier form of it can be found in Arthur Schopenhauer’s work:
We now realise that we know nothing of the intrinsic quality of physical phenomena except when they happen to be sensations. (Russell 1927b, p. 154, emphasis added)
We ourselves are the thing-in-itself. Consequently, a way from within stands open to us as to that real inner nature of things to which we cannot penetrate from without. (Schopenhauer 1818/1966, p. 195, original emphasis)
What Russell and Schopenhauer seem to be saying is that certain mental experiences and certain intrinsic properties (or, in Schopenhauer’s case, Kantian things in themselves) are the same thing, and that the former are parts of us that we can obviously know. Hence, since we are capable of knowing the former, we are automatically capable of knowing the latter.
Another proposal, the identification thesis, is formulated by Lewis. Lewis ultimately rejects it because he finds it incompatible with materialism, though he nonetheless takes it, when combined with Russellian panpsychism, as a possible reply to the Humility Thesis (Lewis 1995, p. 142; 2009, p. 217; for discussion, see Majeed 2017). The thesis concerns the nature of our experience of qualia: as we experience a quale, we will be able to identify it, to the extent that its essence—something it has and nothing else does—will be revealed to us (1995, p. 142). While Lewis believes that the thesis is ‘uncommonly demanding’ (1995, p. 141), he also believes that it is an obvious part of our folk psychology and is thus deserving of serious assessment (but see Stoljar 2009):
Why do I think it must be part of the folk theory of qualia? Because so many philosophers find it so very obvious. I think it seems obvious because it is built into folk psychology. Others will think it gets built into folk psychology because it is so obvious; but either way, the obviousness and the folk-psychological status go together. (Lewis 1995, p. 142)
Humility theorists typically dismiss introspective knowledge of intrinsic properties by doubting Russellian monism (Blackburn 1990, p. 65; Langton & Robichaud 2010, p. 175) or by emphasising their sympathy with standard physicalism in the philosophy of mind (Lewis 2009, p. 217). Nonetheless, some further surrounding issues have been raised. The first might be called a reversed combination problem (see the discussion of the combination problem in Section 7a). The problem is that even if the Schopenhauerian-Russellian identity thesis or the identification thesis is correct, this only means that we can thereby know some aggregates of fundamental, intrinsic properties—for a quale is supposedly constituted by a large sum of fundamental, intrinsic properties, not a single fundamental, intrinsic property (Majeed 2017, p. 84). Just as we cannot know of fundamental physical particles just by knowing of a cup they constitute, it is likewise not obvious that we can know of fundamental, intrinsic properties via knowing the qualia they constitute. Hence, it is not obvious that the two epistemic theses offer any real solution to Humility, unless we consider a quale as an intrinsic property which is itself a target of the Humility Thesis.
The second issue is related to the alleged similarity between Russellian monism and epiphenomenalism. For many, epiphenomenalism is committed to what Chalmers calls the paradox of phenomenal judgement: if epiphenomenalism is true—if qualia are causally inefficacious—then our judgements concerning qualia cannot be caused by qualia, and thus cannot be considered as tracking them (Chalmers 1996, p. 177). Since, as discussed in Section 7a, Russellian monism appears to share some of the crucial theoretical features of epiphenomenalism, certain critics of Russellian monism argue that it faces the same paradox as epiphenomenalism does (Hawthorne 2001, pp. 371-372; Smart 2004, p. 48; Braddon-Mitchell & Jackson 2007, p. 141; Chan 2020a). If this is correct, then Russellian monism cannot even allow for knowledge of qualia—or, indeed, of the truth of Russellian monism itself—let alone knowledge of intrinsic properties. It is, however, noteworthy that some sympathisers of epiphenomenalism argue that epiphenomenalism can actually account for knowledge of qualia (Chalmers 1996, pp. 196-209).
8. The Humility Thesis and Physicalism
Physicalism is the view that everything in the actual world is physical. Despite the fact that a number of prominent Humility theorists are also famous physicalists (Armstrong 1968; Jackson 1998; Lewis 2009)—Jackson even calls his version of the Humility Thesis ‘Kantian physicalism’ (Jackson 1998, p. 23)—questions have been raised as to whether the Humility Thesis and physicalism are really compatible. Specifically, the questions are of two kinds. The first concerns whether or not we are in a position to know that an unknowable property is physical; the second concerns whether or not there could be an unknowable intrinsic property that is physical.
The first question is raised by Sam Cowling (2010, p. 662), a critic of the Humility Thesis, as a part of his formulation of the objection from overkill (see Section 5c). On his view, if the Humility Thesis is true, then systematic metaphysics is impossible. For we cannot judge whether our world is a physical one or a world of Berkeleian idealism in which all things are ultimately ideas in God’s mind. In fact, Langton and Robichaud (2010, pp. 175-176) explicitly endorse such a radical version of the Humility Thesis.
In response to Cowling, Tom McClelland (2012) argues that the kind of knowledge he discusses is not really what the Humility Thesis concerns. Specifically, on McClelland’s view, the Humility Thesis concerns only our knowledge-which of intrinsic properties, that is, knowledge of the distinctive features that distinguish a property from every other (pp. 68-69). In light of this, the knowledge that intrinsic properties are physical does not concern the distinctive features of these intrinsic properties, and it is thus compatible with the Humility Thesis. Of course, as discussed in Section 2, even if McClelland is correct, there remains a question as to whether all important versions of the Humility Thesis concern only knowledge-which, and whether those other versions would nonetheless lead to the problem raised by Cowling—we have at least seen that Langton and Robichaud dismiss the knowledge-which version of the Humility Thesis defended by McClelland.
More philosophers have raised the second question concerning the compatibility between the Humility Thesis and physicalism, namely whether or not there could be an unknowable intrinsic property that is physical (Foster 1993; Langton 1998, pp. 207-208; Braddon-Mitchell & Jackson 2007, p. 141; Ney 2007). These philosophers define the physical as whatever is posited by physics; but if the Humility Thesis is true, intrinsic properties are necessarily out of reach of physics, and therefore by definition cannot be counted as physical.
In response, Stoljar (2001) and Barbara Montero (2015) argue that the physicalist should accept some alternative conceptions of physicalism (and the physical) which could accommodate the Humility Thesis. They thus both advocate top-down conceptions of physicalism (compare Maxwell 1978; Chalmers 2015; for a survey, see Chan 2020b). These top-down conceptions first recognise some things as physical—which are, in Stoljar’s case, paradigmatic physical objects like tables and chairs (Stoljar 2015; for an earlier influential formulation of this conception of physicalism, see also Jackson 1998, pp. 6-8), and in Montero’s case, the referents of physical theories (Montero 2015, p. 217)—and then recognise whatever plays a part in their constitution as physical. In light of this, since intrinsic properties play a part in the constitution of physical objects, they could thereby be counted as physical. Nonetheless, there is a famous problem facing these top-down conceptions of physicalism which is recognised by both proponents (Jackson 1998, p. 7; Stoljar 2001, p. 257n10) and critics (Langton & Robichaud 2010, p. 175; Braddon-Mitchell & Jackson 2007, p. 33). The problem is that if panpsychism, pantheism, idealism, and the like are correct, then things such as the electron’s consciousness and God play a part in the constitution of physical objects, and they should thereby be counted as physical. But it appears that any conception of physicalism (or the physical) that counts such things as physical should not really be considered physicalism. In response, Stoljar argues that one might add further constraints to his conception of physicalism to overcome this weakness (Stoljar 2001, p. 257n10).
9. Conclusion
In a frequently cited and discussed article on Humility, Ann Whittle remarks, ‘Perhaps surprisingly, a number of philosophers from disparate backgrounds have felt compelled to deny that we have any [intrinsic] knowledge’ (Whittle 2006, p. 461). This is certainly true. A number of questions surrounding the Humility Thesis were listed in the introductory section, but whatever one’s answers to these questions, and whether or not one is convinced by the Humility Thesis, the thesis has, as we have seen, always played a salient role, explicitly or tacitly, in the history of ideas, in analytic metaphysics, in the philosophy of science, and even in the philosophy of mind. In particular, the Humility Thesis is important in at least the following respects: the thesis and some similar theories are plausibly utilised in the formulations of a number of religious and philosophical mysticisms in history; the thesis has inspired many historically important thinkers such as Hume, Russell, and perhaps Kant and Schleiermacher; the thesis is a key concern in the contemporary philosophy of properties; the thesis implies an understanding of what scientific knowledge is about; and the thesis is the basis of Russellian monism and some ancient and contemporary versions of panpsychism. Understanding the Humility Thesis thus provides us with a better insight into how a number of important philosophical frameworks and discussions were developed and framed. This will be useful to their inquirers, proponents, and critics alike.
10. References and Further Reading
Alter, T & Coleman, S 2020, ‘Russellian monism and mental causation’, Noûs, online first: https://doi.org/10.1111/nous.12318.
Alter, T & Nagasawa, Y 2012, ‘What is Russellian monism?’, Journal of Consciousness Studies, vol. 19, no. 9-10, pp. 67-95.
Armstrong, D 1961, Perception and the physical world, Routledge, London.
Armstrong D 1968, A materialist theory of the mind, Routledge & Kegan Paul, London.
Armstrong D 1997, A world of states of affairs, Cambridge University Press, Cambridge.
Berkeley, G 1710/1988, Principles of Human Knowledge and Three Dialogues, Penguin Books, London.
Bird, A 2005, ‘Laws and essences’, Ratio, vol. 18, no. 4, pp. 437-461.
Black, R 2000, ‘Against quidditism’, Australasian Journal of Philosophy, vol. 78, no. 1, pp. 87-104.
Blackburn, S 1990, ‘Filling in space’, Analysis, vol. 50, no. 2, pp. 62-65.
Borghini, A & Williams, N 2008, ‘A dispositional theory of possibility’, Dialectica, vol. 62, no. 1, pp. 21-41.
Braddon-Mitchell, D & Jackson, F 2007, The philosophy of mind and cognition, 2nd edn, Blackwell, Malden.
Chalmers, D 1995, ‘Facing up to the problem of consciousness’, Journal of Consciousness Studies, vol. 2, no. 3, pp. 200-219.
Chalmers, D 1996, The conscious mind: in search of a fundamental theory, Oxford University Press, New York.
Chalmers, D 1997, ‘Moving forward on the problem of consciousness’, Journal of Consciousness Studies, vol. 4, no. 1, pp. 3-46.
Chalmers, D 2003, ‘Consciousness and its place in nature’, in DS Stich & F Warfield (eds.), The Blackwell guide to philosophy of mind, Blackwell Publishing, Malden, pp. 102-142.
Chalmers, D 2012, Constructing the world, Oxford University Press, Oxford.
Chalmers, D 2015, ‘Panpsychism and panprotopsychism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian Monism, Oxford University Press, New York, pp. 246-276.
Chalmers, D 2017, ‘The combination problem for panpsychism’, in G Brüntrup & L Jaskolla (eds.), Panpsychism: contemporary perspectives, Oxford University Press, New York, pp. 179-214.
Chalmers, D 2018, ‘The meta-problem of consciousness’, Journal of Consciousness Studies, vol. 25, no. 9-10, pp. 6-61.
Chan, LC 2017, Metaphysical naturalism and the ignorance of categorical properties, PhD thesis, University of Sydney, retrieved 28 November 2019, <https://ses.library.usyd.edu.au/handle/2123/16555>
Chan, LC 2020a, ‘Can the Russellian monist escape the epiphenomenalist’s paradox?’, Topoi, vol. 39, pp. 1093–1102.
Chan, LC 2020b, ‘Russellian physicalism and its dilemma’, Philosophical Studies, online first.
Chan, LC & Latham AJ 2019, ‘Four meta-methods for the study of qualia’, Erkenntnis, vol. 84, no. 1, pp. 145-167.
Churchland, PS 1988, ‘Reduction and the neurobiological basis of consciousness’, in A Marcel & E Bisiach (eds.), Consciousness in contemporary science, Oxford University Press, New York, pp. 273-304.
Churchland, PS 1996, ‘The hornswoggle problem’, Journal of Consciousness Studies, vol. 3, no. 5-6, pp. 402-408.
Clifford, WK 1875/2011, ‘The unseen universe’, in L Stephen & F Pollock (eds.), Lectures and essays: volume II, Cambridge University Press, Cambridge.
Cowling, S 2010, ‘Kantian humility and ontological categories’, Analysis, vol. 70, no. 4, pp. 659-665.
d’Holbach, PH 1770/1820, The system of nature, trans. by De Mirabaud, M, retrieved 26 September 2018, <http://www.ftarchives.net/holbach/system/0syscontents.htm>
Dennett, D 1991, Consciousness explained, Penguin Books, London.
Deutsch, E 1969, Advaita Vedanta: a philosophical reconstruction, East-West Center Press, Honolulu.
Diderot, D 1770/1979, ‘Philosophic principles of matter and motion’, in Diderot: interpreter of nature, trans. by Stewart, J & Kemp, J, Hyperion Press, Westport, pp. 127-133.
Eddington, A 1929, The nature of the physical world, Cambridge University Press, Cambridge.
Ellis, B 2014, The philosophy of nature: a guide to the new essentialism, Routledge, London.
Ellis, B & Lierse, C 1994, ‘Dispositional essentialism’, Australasian Journal of Philosophy, vol. 72, no. 1, pp. 27-45.
Faraday, M 1844, ‘A speculation touching electric condition and the nature of matter’, in Experimental researches in electricity, vol. ii, Richard & John Edward Taylor, London.
Feigl, H 1967, The ‘mental’ and the ‘physical’: the essay and a postscript, University of Minnesota Press, Minneapolis.
Flood, G 1996, An introduction to Hinduism, Cambridge University Press, Cambridge.
Foster, J 1993, ‘The succinct case for idealism’, in H Robinson (ed.), Objections to physicalism, Clarendon Press, Oxford, pp. 293-313.
Goff, P 2006, ‘Experiences don’t sum’, Journal of Consciousness Studies, vol. 13, pp. 53-61.
Goff, P 2017, ‘The phenomenal bonding solution to the combination problem’, in G Bruntrup & L Jaskolla (eds.), Panpsychism: contemporary perspectives, Oxford University Press, New York, pp. 283-303.
Handfield, T 2005, ‘Armstrong and the modal inversion of dispositions’, Philosophical Quarterly, vol. 55, no. 220, pp. 452–461.
Heil, J 2004, Philosophy of mind, 2nd edn, Routledge, New York.
Heisenberg, W 1958/2000, Physics and philosophy: the revolution in modern science, Penguin Books, London.
Hiddleston, E 2019, ‘Dispositional and categorical properties, and Russellian monism’, Philosophical Studies, vol. 176, no. 1, pp. 65-92.
Hohwy, J 2005, ‘Explanation and two conceptions of the physical’, Erkenntnis, vol. 62, no. 1, pp. 71-89.
Holton, R 1999, ‘Dispositions all the way round’, Analysis, vol. 59, no. 1, pp. 9-14
Howell, R 2015, ‘The Russellian monist’s problems with mental causation’, The Philosophical Quarterly, vol. 65, no. 258, pp. 22-39.
Hume, D 1739/1978, A treatise of human nature, Oxford University Press, Oxford.
Jackson, F 1998, From metaphysics to ethics: a defence of conceptual analysis, Oxford University Press, Oxford.
Jacobi, FH 1787/2000, ‘On transcendental idealism’, in B Sassen (ed.), Kant’s early critics: the empiricist critique of the theoretical philosophy, Cambridge University Press, Cambridge, pp.169-175.
Kant, I 1781/1998, Critique of pure reason, trans. by P Guyer & A Wood, Cambridge University Press, Cambridge.
Kim, J 2005, Physicalism, or something near enough, Princeton University Press, Princeton.
Ladyman, J & Ross, D (with Spurrett, D & Collier, J) 2007, Every thing must go: metaphysics naturalized, Oxford University Press, Oxford.
Langton, R 1998, Kantian humility: our ignorance of things in themselves, Oxford University Press, Oxford.
Langton, R 2004, ‘Elusive knowledge of things in themselves’, Australasian Journal of Philosophy, vol. 82, no. 1, pp. 129-136.
Langton, R 2015, ‘The impossible necessity of “filling in space’’’, in R Johnson & M Smith (eds.), Passions and projections: themes from the philosophy of Simon Blackburn, Oxford University Press, Oxford, pp. 106-114.
Langton, R & Robichaud, C 2010, ‘Ghosts in the world machine? Humility and its alternatives’, in A Hazlett (ed.), New waves in metaphysics, Palgrave Macmillan, New York, pp. 156-178.
Leibniz, G 1714/1989, ‘The monadology’, in R Ariew & D Garber (trans. and eds.), Philosophical essays, Hackett, Indianapolis.
Leuenberger, S 2010, ‘Humility and constraints on O-language’, Philosophical Studies, vol. 149, no. 3, pp. 327-354.
Levine, J 1983, ‘Materialism and qualia: the explanatory gap’, Pacific Philosophical Quarterly, vol. 64, pp. 354-361.
Levine, J 2001, Purple Haze: the puzzle of consciousness, Oxford University Press, Oxford.
Lewis, D 1970, ‘How to define theoretical terms’, Journal of Philosophy, vol. 67, no. 13, pp. 427-446.
Lewis, D 1972, ‘Psychophysical and theoretical identifications’, Australasian Journal of Philosophy, vol. 50, no. 3, pp. 249-258.
Lewis, D 1986, Philosophical papers, vol. 2, Oxford University Press, New York.
Lewis, D 1995, ‘Should a materialist believe in qualia?’, Australasian Journal of Philosophy, vol. 73, no. 1, pp. 140-44.
Lewis, D 2009, ‘Ramseyan humility’, in D Braddon-Mitchell & R Nola (eds.), Conceptual analysis and philosophical naturalism, MIT Press, Cambridge, pp. 203-222.
Locke, D 2009, ‘A partial defense of Ramseyan humility’, in D Braddon-Mitchell & R Nola (eds.), Conceptual analysis and philosophical naturalism, MIT Press, Cambridge, MA, pp. 223-242.
Lockwood, M 1989, Mind, brain, and quantum, Blackwell, Oxford.
Lockwood, M 1992, ‘The grain problem’, in H Robinson (ed.), Objections to physicalism, Oxford University Press, Oxford, pp. 271-292.
Mach, E 1897/1984, The analysis of sensations and the relation of the physical to the psychical, trans. by CM Williams, Open Court, La Salle.
Mackie, JL 1973, Truth, probability and paradox, Oxford University Press, Oxford.
Mahony, W 1998, The artful universe: an introduction to the Vedic religious imagination, SUNY Press, Albany.
Majeed, R 2013, ‘Pleading ignorance in response to experiential primitivism’, Philosophical Studies, vol. 163, no. 1, pp. 251-269.
Majeed, R 2017, ‘Ramseyan Humility: the response from revelation and panpsychism’, Canadian Journal of Philosophy, vol. 47, no. 1, pp. 75-96.
Maxwell, G 1978, ‘Rigid designators and mind-brain identity’, in W Savage (ed.), Perception and cognition: issues in the foundations of psychology, University of Minnesota Press, Minneapolis, pp. 365-403.
McClelland, T 2012, ‘In defence of Kantian humility’, Thought, vol. 1, no. 1, pp. 62-70.
Mill, JS 1865/1996, An examination of Sir William Hamilton’s philosophy, Routledge, London.
Montero, B 2015, ‘Russellian physicalism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian monism, Oxford University Press, New York, pp. 209-223.
Ney, A 2007, ‘Physicalism and our knowledge of intrinsic properties’, Australasian Journal of Philosophy, vol. 85, no. 1, pp. 41-60.
Ney, A 2015, ‘A physicalist critique of Russellian Monism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian Monism, Oxford University Press, New York, pp. 324-345.
Nietzsche, F 1887/2006, On the genealogy of morality, trans. by Diethe, C, Cambridge University Press, Cambridge.
Pettit, P 1998, ‘Noumenalism and response-dependence’, Monist, vol. 81, no. 1, pp. 112-132.
Robinson, W 2018, ‘Russellian monism and epiphenomenalism’, Pacific Philosophical Quarterly, vol. 99, no. 1, pp. 100–117.
Rosenberg, G 2004, A place for consciousness: probing the deep structure of the natural world, Oxford University Press, Oxford.
Russell, B 1912/1978, The problems of philosophy, Oxford University Press, Oxford.
Russell, B 1921/1922, The analysis of mind, George Allen & Unwin, London.
Russell, B 1927a/1992, The analysis of matter, Routledge, London.
Russell, B 1927b, An outline of philosophy, George Allen & Unwin, London.
Schleiermacher, F 1799/1988, On religion: speeches to its cultured despisers, Cambridge University Press, Cambridge.
Schlick, M 1925a/1979, ‘Outlines of the philosophy of nature’, in Philosophical Papers: Volume II (1925-1936), pp. 1-90.
Schlick, M 1925b/1985, General theory of knowledge, trans. by Blumberg, A, Open Court, Chicago.
Schopenhauer, A 1818/1966, The world as will and representation, vol. 2, trans. by EFJ Payne, Dover, New York.
Seager, W 2009, ‘Panpsychism’, in A Beckermann & BP McLaughlin (eds.), The Oxford handbook of philosophy of mind, Oxford University Press, New York, pp. 206-219.
Shoemaker, S 1980, ‘Causality and properties’, in P van Inwagen (ed.), Time and cause, Reidel, Dordrecht, pp. 109-135.
Smart, JJC 1963, Philosophy and scientific realism, Routledge, London.
Smart, JJC 2004, ‘Consciousness and awareness’, Journal of Consciousness Studies, vol. 11, no. 2, pp. 41-50.
Smith, M & Stoljar, D 1998, ‘Global response-dependence and noumenal realism’, Monist, vol. 81, no. 1, pp. 85-111.
Stoljar, D 2001, ‘Two conceptions of the physical’, Philosophy and Phenomenological Research, vol. 62, no. 2, pp. 253-281.
Stoljar, D 2009, ‘The argument from revelation’, in D Braddon-Mitchell & R Nola (eds.), Conceptual analysis and philosophical naturalism, MIT Press, Cambridge, pp. 113-138.
Stoljar, D 2014, ‘Four kinds of Russellian Monism’, in U Kriegel (ed.), Current controversies in philosophy of mind, Routledge, New York, pp. 17-39.
Stoljar, D 2015, ‘Physicalism’, in E Zalta (ed.), Stanford encyclopedia of philosophy, retrieved 8 March 2020, <http://plato.stanford.edu/entries/physicalism/>
Strawson, G 2006, ‘Realistic monism: why physicalism entails panpsychism’, Journal of Consciousness Studies, vol. 13, no. 10-11, pp. 3-31.
Strawson, PF 1966, The bounds of sense: an essay on Kant’s Critique of Pure Reason, Methuen, London.
Swoyer, C 1982, ‘The nature of laws of nature’, Australasian Journal of Philosophy, vol. 60, no. 3, pp. 203-223.
Tegmark, M 2007, ‘The mathematical universe’, Foundations of Physics, vol. 38, pp. 101-150.
Tully, RE 2003, ‘Russell’s neutral monism’, in N Griffin (ed.), The Cambridge companion to Bertrand Russell, Cambridge University Press, Cambridge, pp. 332-370.
Van Cleve, J 2002, ‘Receptivity and our knowledge of intrinsic properties’, Philosophy and Phenomenological Research, vol. 65, no. 1, pp. 218-237.
Whitehead, AN 1925/1967, Science and the modern world, The Free Press, New York.
Whittle, A 2006, ‘On an argument for Humility’, Philosophical Studies, vol. 130, no. 3, pp. 461-497.
Wishon, D 2015, ‘Russell on Russellian Monism’, in T Alter & Y Nagasawa (eds.), Consciousness in the physical world: perspectives on Russellian Monism, Oxford University Press, New York, pp. 91-120.
Yates, D 2018, ‘Three arguments for humility’, Philosophical Studies, vol. 175, no. 2, pp. 461-481.
Author Information
Lok-Chi Chan
Email: lokchan@ntu.edu.tw
National Taiwan University
Taiwan
Critical Thinking
Critical Thinking is the process of using and assessing reasons to evaluate statements, assumptions, and arguments in ordinary situations. The goal of this process is to help us have good beliefs, where “good” means that our beliefs meet certain goals of thought, such as truth, usefulness, or rationality. Critical thinking is widely regarded as a species of informal logic, although critical thinking makes use of some formal methods. In contrast with formal reasoning processes that are largely restricted to deductive methods—decision theory, logic, statistics—the process of critical thinking allows a wide range of reasoning methods, including formal and informal logic, linguistic analysis, experimental methods of the sciences, historical and textual methods, and philosophical methods, such as Socratic questioning and reasoning by counterexample.
The goals of critical thinking are also more diverse than those of formal reasoning systems. While formal methods focus on deductive validity and truth, critical thinkers may evaluate a statement’s truth, its usefulness, its religious value, its aesthetic value, or its rhetorical value. Because critical thinking arose primarily from the Anglo-American philosophical tradition (also known as “analytic philosophy”), contemporary critical thinking is largely concerned with a statement’s truth. But some thinkers, such as Aristotle (in Rhetoric), give substantial attention to rhetorical value.
The primary subject matter of critical thinking is the proper use and goals of a range of reasoning methods, how they are applied in a variety of social contexts, and errors in reasoning. This article also discusses the scope and virtues of critical thinking.
Critical thinking should not be confused with Critical Theory. Critical Theory refers to a way of doing philosophy that involves a moral critique of culture. A “critical” theory, in this sense, is a theory that attempts to disprove or discredit a widely held or influential idea or way of thinking in society. Thus, critical race theorists and critical gender theorists offer critiques of traditional views and latent assumptions about race and gender. Critical theorists may use critical thinking methodology, but their subject matter is distinct, and they also may offer critical analyses of critical thinking itself.
The process of evaluating a statement traditionally begins with making sure we understand it; that is, a statement must express a clear meaning. A statement is generally regarded as clear if it expresses a proposition, which is the meaning the author of that statement intends to express, including definitions, referents of terms, and indexicals, such as subject, context, and time. There is significant controversy over what sort of “entity” propositions are, whether abstract objects or linguistic constructions or something else entirely. Whatever their metaphysical status, the term “proposition” is used here simply to refer to whatever meaning a speaker intends to convey in a statement.
The difficulty with identifying intended propositions is that we typically speak and think in natural languages (English, Swedish, French), and natural languages can be misleading. For instance, two different sentences in the same natural language may express the same proposition, as in these two English sentences:
Jamie is taller than his father.
Jamie’s father is shorter than he.
Further, the same sentence in a natural language can express more than one proposition depending on who utters it and when:
I am shorter than my father right now.
The pronoun “I” is an indexical; it picks out, or “indexes,” whoever utters the sentence and, therefore, expresses a different proposition for each new speaker who utters it. Similarly, “right now” is a temporal indexical; it indexes the time the sentence is uttered. The proposition it is used to express changes each new time the sentence is uttered and, therefore, may have a different truth value at different times (as, say, the speaker grows taller: “I am now five feet tall” may be true today, but false a year from now). Other indexical terms that can affect the meaning of the sentence include other pronouns (he, she, it), demonstratives (this, that), and the definite article (the).
Further still, different sentences in different natural languages may express the same proposition. For example, all of the following express the proposition “Snow is white”:
Snow is white. (English)
Der Schnee ist weiss. (German)
La neige est blanche. (French)
La neve è bianca. (Italian)
Finally, statements in natural languages are often vague or ambiguous, either of which can obscure the propositions actually intended by their authors. And even in cases where they are not vague or ambiguous, statements’ truth values sometimes vary from context to context. Consider the following example.
The English statement, “It is heavy,” includes the pronoun “it,” which (when used without contextual clues) is ambiguous because it can index any impersonal subject. If, in this case, “it” refers to the computer on which you are reading this right now, its author intends to express the proposition, “The computer on which you are reading this right now is heavy.” Further, the term “heavy” reflects an unspecified standard of heaviness (again, if contextual clues are absent). Assuming we are talking about the computer, it may be heavy relative to other computer models but not to automobiles. Further still, even if we identify or invoke a standard of heaviness by which to evaluate the appropriateness of its use in this context, there may be no weight at which an object is rightly regarded as heavy according to that standard. (For instance, is an object heavy because it weighs 5.3 pounds but not if it weighs 5.2 pounds? Or is it heavy when it is heavier than a mouse but lighter than an anvil?) This means “heavy” is a vague term. In order to construct a precise statement, vague terms (heavy, cold, tall) must often be replaced with terms expressing an objective standard (pounds, temperature, feet).
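The absence of a sharp cutoff can be put schematically (a sketch using a hypothetical predicate, not notation from the text above): let H(w) say that an object of weight w counts as heavy. The vagueness of “heavy” then amounts to the claim that no tiny decrease in weight ever marks the boundary between the heavy and the not-heavy:

\[
\neg\, \exists w\, \big( H(w) \wedge \neg H(w - \epsilon) \big) \quad \text{for any small } \epsilon > 0
\]

Replacing “heavy” with an exact standard such as “weighs at least n pounds” removes the vagueness precisely by stipulating such a cutoff.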
Part of the challenge of critical thinking is to clearly identify the propositions (meanings) intended by those making statements so we can effectively reason about them. The rules of language help us identify when a term or statement is ambiguous or vague, but they cannot, by themselves, help us resolve ambiguity or vagueness. In many cases, this requires assessing the context in which the statement is made or asking the author what she intends by the terms. If we cannot discern the meaning from the context and we cannot ask the author, we may stipulate a meaning, but this requires charity, to stipulate a plausible meaning, and humility, to admit when we discover that our stipulation is likely mistaken.
2. Argument and Evaluation
Once we are satisfied that a statement is clear, we can begin evaluating it. A statement can be evaluated according to a variety of standards. Commonly, statements are evaluated for truth, usefulness, or rationality. The most common of these goals is truth, so that is the focus of this article.
The truth of a statement is most commonly evaluated in terms of its relation to other statements and direct experiences. If a statement follows from or can be inferred from other statements that we already have good reasons to believe, then we have a reason to believe that statement. For instance, the statement “The ball is blue” can be derived from “The ball is blue and round.” Similarly, if a statement seems true in light of, or is implied by, an experience, then we have a reason to believe that statement. For instance, the experience of seeing a red car is a reason to believe, “The car is red.” (Whether these reasons are good enough for us to believe is a further question about justification, which is beyond the scope of this article, but see “Epistemic Justification.”) Any statement we derive in these ways is called a conclusion. Though we regularly form conclusions from other statements and experiences—often without thinking about it—there is still a question of whether these conclusions are true: Did we draw those conclusions well? A common way to evaluate the truth of a statement is to identify those statements and experiences that support our conclusions and organize them into structures called arguments. (See also, “Argument.”)
An argument is one or more statements (called premises) intended to support the truth of another statement (the conclusion). Premises comprise the evidence offered in favor of the truth of a conclusion. It is important to entertain any premises that are intended to support a conclusion, even if the attempt is unsuccessful. Unsuccessful attempts at supporting a proposition constitute bad arguments, but they are still arguments. The support intended for the conclusion may be formal or informal. In a formal, or deductive, argument, an arguer intends to construct an argument such that, if the premises are true, the conclusion must be true. This strong relationship between premises and conclusion is called validity. This relationship between the premises and conclusion is called “formal” because it is determined by the form (that is, the structure) of the argument (see §3). In an informal, or inductive, argument, the conclusion may be false even if the premises are true. In other words, whether an inductive argument is good depends on something more than the form of the argument. Therefore, all inductive arguments are invalid, but this does not mean they are bad arguments. Even if an argument is invalid, its premises can increase the probability that its conclusion is true. So, the form of inductive arguments is evaluated in terms of the strength the premises confer on the conclusion, and stronger inductive arguments are preferred to weaker ones (see §4). (See also, “Deductive and Inductive Arguments.”)
Psychological states, such as sensations, memories, introspections, and intuitions often constitute evidence for statements. Although these states are not themselves statements, they can be expressed as statements. And when they are, they can be used in and evaluated by arguments. For instance, my seeing a red wall is evidence for me that, “There is a red wall,” but the physiological process of seeing is not a statement. Nevertheless, the experience of seeing a red wall can be expressed as the proposition, “I see a red wall” and can be included in an argument such as the following:
I see a red wall in front of me.
Therefore, there is a red wall in front of me.
This is an inductive argument, though not a strong one. We do not yet know whether seeing something (under these circumstances) is reliable evidence for the existence of what I am seeing. Perhaps I am “seeing” in a dream, in which case my seeing is not good evidence that there is a wall. For similar reasons, there is also reason to doubt whether I am actually seeing. To be cautious, we might say we seem to see a red wall.
To be good, an argument must meet two conditions: the conclusion must follow from the premises—either validly or with a high degree of likelihood—and the premises must be true. If the premises are true and the conclusion follows validly, the argument is sound. If the premises are true and the premises make the conclusion probable (either objectively or relative to alternative conclusions), the argument is cogent.
Here are two examples:
Example 1:
Earth is larger than its moon.
Our sun is larger than Earth.
Therefore, our sun is larger than Earth’s moon.
In example 1, the premises are true. And since “larger than” is a transitive relation, the structure of the argument guarantees that, if the premises are true, the conclusion must be true. This means the argument is also valid. Since it is both valid and has true premises, this deductive argument is sound.
Example 2:
It is sunny in Montana about 205 days per year.
I will be in Montana in February.
Hence, it will probably be sunny when I am in Montana.
In example 2, premise 1 is true, and let us assume premise 2 is true. Premise 1 indicates that most days in Montana are sunny (205 of 365, or about 56%), so that, for any day you choose, it is more likely than not to be a sunny day. Premise 2 says I am choosing days in February to visit. Together, these premises support (though they do not guarantee) the conclusion that it will be sunny when I am there, and so this inductive argument is cogent.
In some cases, arguments will be missing some important piece, whether a premise or a conclusion. For instance, imagine someone says, “Well, she asked you to go, so you have to go.” The idea that you have to go does not follow logically from the fact that she asked you to go without more information. What is it about her asking you to go that implies you have to go? Arguments missing important information are called enthymemes. A crucial part of critical thinking is identifying missing or assumed information in order to effectively evaluate an argument. In this example, the missing premise might be that, “She is your boss, and you have to do what she asks you to do.” Or it might be that, “She is the woman you are interested in dating, and if you want a real chance at dating her, you must do what she asks.” Before we can evaluate whether her asking implies that you have to go, we need to know this missing bit of information. And without that missing bit of information, we can simply reply, “That conclusion doesn’t follow from that premise.”
The two categories of reasoning associated with soundness and cogency—formal and informal, respectively—are considered, by some, to be the only two types of argument. Others add a third category, called abductive reasoning, according to which one reasons according to the rules of explanation rather than the rules of inference. Those who do not regard abductive reasoning as a third, distinct category typically regard it as a species of informal reasoning. Although abductive reasoning has unique features, here it is treated, for reasons explained in §4d, as a species of informal reasoning, but little hangs on this characterization for the purposes of this article.
3. Formal Reasoning
Although critical thinking is widely regarded as a type of informal reasoning, it nevertheless makes substantial use of formal reasoning strategies. Formal reasoning is deductive, which means an arguer intends to infer or derive a proposition from one or more propositions on the basis of the form or structure exhibited by the premises. Valid argument forms guarantee that particular propositions can be derived from them. Some forms look like they make such guarantees but fail to do so (we identify these as formal fallacies in §5a). If an arguer intends or supposes that a premise or set of premises guarantee a particular conclusion, we may evaluate that argument form as deductive even if the form fails to guarantee the conclusion, and is thus discovered to be invalid.
Before continuing in this section, it is important to note that, while formal reasoning provides a set of strict rules for drawing valid inferences, it cannot help us determine the truth of many of our original premises or our starting assumptions. And in fact, very little critical thinking that occurs in our daily lives (unless you are a philosopher, engineer, computer programmer, or statistician) involves formal reasoning. When we make decisions about whether to board an airplane, whether to move in with our significant others, whether to vote for a particular candidate, whether it is worth it to drive ten miles per hour faster than the speed limit even if I am fairly sure I will not get a ticket, whether it is worth it to cheat on a diet, or whether we should take a job overseas, we are reasoning informally. We are reasoning with imperfect information (I do not know much about my flight crew or the airplane’s history), with incomplete information (no one knows what the future is like), and with a number of built-in biases, some conscious (I really like my significant other right now), others unconscious (I have never gotten a ticket before, so I probably will not get one this time). Readers who are more interested in these informal contexts may want to skip to §4.
An argument form is a template that includes variables that can be replaced with sentences. Consider the following form (found within the formal system known as sentential logic):
If p, then q.
p.
Therefore, q.
This form was named modus ponens (Latin, “method of putting”) by medieval philosophers. p and q are variables that can be replaced with any proposition, however simple or complex. And as long as the variables are replaced consistently (that is, each instance of p is replaced with the same sentence and the same for q), the conclusion (line 3), q, follows from these premises. To be more precise, the inference from the premises to the conclusion is valid. “Validity” describes a particular relationship between the premises and the conclusion, namely: in all cases, the conclusion follows necessarily from the premises, or, to use more technical language, the premises logically guarantee an instance of the conclusion.
Notice we have said nothing yet about truth. As critical thinkers, we are interested, primarily, in evaluating the truth of sentences that express propositions, but all we have discussed so far is a type of relationship between premises and conclusion (validity). This formal relationship is analogous to grammar in natural languages and is known in both fields as syntax. A sentence is grammatically correct if its syntax is appropriate for that language (in English, for example, a grammatically correct simple sentence has a subject and a predicate—“He runs.” “Laura is Chairperson.”—and it is grammatically correct regardless of what subject or predicate is used—“Jupiter sings.”—and regardless of whether the terms are meaningful—“Geflorble rowdies.”). Whether a sentence is meaningful, and therefore, whether it can be true or false, depends on its semantics, which refers to the meaning of individual terms (subjects and predicates) and the meaning that emerges from particular orderings of terms. Some terms are meaningless—geflorble; rowdies—and some orderings are meaningless even though their terms are meaningful—“Quadruplicity drinks procrastination,” and “Colorless green ideas sleep furiously.”
Despite the ways that syntax and semantics come apart, if sentences are meaningful, then syntactic relationships between premises and conclusions allow reasoners to infer truth values for conclusions. Because of this, a more common definition of validity is this: it is not possible for all the premises to be true and the conclusion false. Formal logical systems in which syntax allows us to infer semantic values are called truth-functional or truth-preserving—proper syntax preserves truth throughout inferences.
The point of this is to note that formal reasoning only tells us what is true if we already know our premises are true. It cannot tell us whether our experiences are reliable or whether scientific experiments tell us what they seem to tell us. Logic can be used to help us determine whether a statement is true, but only if we already know some true things. This is why a broad conception of critical thinking is so important: we need many different tools to evaluate whether our beliefs are any good.
Consider, again, the form modus ponens, and replace p with “It is a cat” and q with “It is a mammal”:
If it is a cat, then it is a mammal.
It is a cat.
Therefore, it is a mammal.
In this case, we seem to “see” (in a metaphorical sense of see) that the premises guarantee the truth of the conclusion. On reflection, it is also clear that the premises might not be true; for instance, if “it” picks out a rock instead of a cat, premise 1 is still true, but premise 2 is false. It is also possible for the conclusion to be true when the premises are false. For instance, if the “it” picks out a dog instead of a cat, the conclusion “It is a mammal” is true. But in that case, the premises do not guarantee that conclusion; they do not constitute a reason to believe the conclusion is true.
Summing up, an argument is valid if its premises logically guarantee an instance of its conclusion (syntactically), or if it is not possible for its premises to be true and its conclusion false (semantically). Logic is truth-preserving but not truth-detecting; we still need evidence that our premises are true to use logic effectively.
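To make the syntactic test of validity concrete, here is a minimal sketch in Python (an illustration added for this discussion, not part of any formal system described above). It checks modus ponens by brute force: an argument form is valid just in case no assignment of truth values makes every premise true and the conclusion false.

```python
from itertools import product

def implies(p, q):
    # The material conditional is false only when p is true and q is false.
    return (not p) or q

# Collect every truth-value assignment in which the premises
# ("If p, then q" and "p") are true but the conclusion ("q") is false.
counterexamples = [(p, q) for p, q in product([True, False], repeat=2)
                   if implies(p, q) and p and not q]

print("valid" if not counterexamples else f"invalid: {counterexamples}")
# Prints "valid": no row has true premises and a false conclusion.
```

Swapping the premises for those of the formal fallacy of affirming the consequent (premises “If p, then q” and “q,” concluding p) makes the same script report a counterexample: the row where p is false and q is true.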
A Brief Technical Point
Some readers might find it worth noting that the semantic definition of validity has two counterintuitive consequences. First, it implies that any argument with a necessarily true conclusion is valid. Notice that the condition is phrased hypothetically: if the premises are true, then the conclusion cannot be false. This condition is met if the conclusion cannot be false:
If it is a cat, then it is a mammal.
It is a cat.
Therefore, two added to two equals four.
This is because the hypothetical (or “conditional”) statement would still be true even if the premises were false:
If it is blue, then it flies.
It is an airplane.
Therefore, two added to two equals four.
It is true of this argument that if the premises were true, the conclusion would be true, since the conclusion is true no matter what.
Second, the semantic formulation also implies that any argument with necessarily false premises is valid. The semantic condition for validity is met if the premises cannot be true:
Some bachelors are married.
Therefore, Earth’s moon is heavier than Jupiter.
In this case, if the premise were true, the conclusion could not be false (this is because anything follows syntactically from a contradiction), and therefore, the argument is valid. There is nothing particularly problematic about these two consequences. But they highlight unexpected implications of our standard formulations of validity, and they show why there is more to good arguments than validity.
Despite these counterintuitive implications, valid reasoning is essential to thinking critically because it is a truth-preserving strategy: if deductive reasoning is applied to true premises, true conclusions will result.
There are a number of types of formal reasoning, but here we review only some of the most common: categorical logic, propositional logic, modal logic, and predicate logic.
a. Categorical Logic
Categorical logic is formal reasoning about categories or collections of subjects, where “subjects” refers to anything that can be regarded as a member of a class, whether objects, properties, or events, or even a single object, property, or event. Categorical logic employs the quantifiers “all,” “some,” and “none” to refer to the members of categories, and categorical propositions are formulated in four ways:
A claims: All As are Bs (where the capitals “A” and “B” represent categories of subjects).
E claims: No As are Bs.
I claims: Some As are Bs.
O claims: Some As are not Bs.
Categorical syllogisms are syllogisms (two-premised formal arguments) that employ categorical propositions. Here are two examples:
Example 1:

1. All cats are mammals. (A claim)
2. Some cats are furry. (I claim)
3. Therefore, some mammals are furry. (I claim)

Example 2:

1. No bachelors are married. (E claim)
2. All the people in this building are bachelors. (A claim)
3. Thus, no people in this building are married. (E claim)
There are interesting limitations on what categorical logic can do. For instance, if one premise says that, “Some As are not Bs,” may we infer that some As are Bs, in what is known as an “existential assumption”? Aristotle seemed to think so (De Interpretatione), but this cannot be decided within the rules of the system. Further, and counterintuitively, it would mean that a true-seeming proposition such as, “Some bachelors are not married,” must be false, since it would imply the falsehood that some bachelors are married.
Another limitation on categorical logic is that arguments with more than three categories cannot be easily evaluated for validity. The standard method for evaluating the validity of categorical syllogisms is the Venn diagram (named after John Venn, who introduced it in 1881), which expresses categorical propositions in terms of two overlapping circles and categorical arguments in terms of three overlapping circles, each circle representing a category of subjects.
[Figure: a Venn diagram for a categorical claim (two overlapping circles) and a Venn diagram for a categorical argument (three overlapping circles)]
A, B, and C represent categories of objects, properties, or events. The symbol “∩” comes from mathematical set theory to indicate “intersects with.” “A∩B” means all those As that are also Bs and vice versa.
Though there are ways of constructing Venn diagrams with more than three categories, determining the validity of these arguments using Venn diagrams is very difficult (and often requires computers). These limitations led to the development of more powerful systems of formal reasoning.
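For readers curious how such validity checks can be mechanized without diagrams, here is a hypothetical Python sketch that tests the cats-and-mammals syllogism above by searching for a countermodel: an assignment of sets to the three categories that makes the premises true and the conclusion false. (The three-element domain is an assumption made purely for illustration; this is not a general decision procedure.)

```python
from itertools import product

# Every subset of a small illustrative domain.
domain = [0, 1, 2]
subsets = [{x for x, keep in zip(domain, bits) if keep}
           for bits in product([0, 1], repeat=len(domain))]

def countermodel_exists():
    for cats, mammals, furry in product(subsets, repeat=3):
        all_claim = cats <= mammals          # "All cats are mammals" (A claim)
        some_claim = bool(cats & furry)      # "Some cats are furry" (I claim)
        conclusion = bool(mammals & furry)   # "Some mammals are furry" (I claim)
        if all_claim and some_claim and not conclusion:
            return True                      # premises true, conclusion false
    return False

print(countermodel_exists())  # False: no countermodel among these interpretations
```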
b. Propositional Logic
Propositional, or sentential, logic has advantages and disadvantages relative to categorical logic. It is more powerful than categorical logic in that it is not restricted in the number of terms it can evaluate, and therefore, it is not restricted to the syllogistic form. But it is weaker than categorical logic in that it has no operators for quantifying over subjects, such as “all” or “some.” For those, we must appeal to predicate logic (see §3d below).
Basic propositional logic involves formal reasoning about propositions (as opposed to categories), and its most basic unit of evaluation is the atomic proposition. “Atom” means the smallest indivisible unit of something, and simple English statements (subject + predicate) are atomic wholes because if either part is missing, the word or words cease to be a statement, and therefore cease to be capable of expressing a proposition. Atomic propositions are simple subject-predicate combinations, for instance, “It is a cat” and “I am a mammal.” Variable letters such as p and q in argument forms are replaced with semantically rich constants, indicated by capital letters, such as A and B. Consider modus ponens again:
Argument Form | English Argument | Semantic Replacement
1. If p, then q. | 1. If it is a cat, then it is a mammal. | 1. If C, then M
2. p. | 2. It is a cat. | 2. C
3. Therefore, q. | 3. Therefore, it is a mammal. | 3. M
As you can see from premise 1 of the Semantic Replacement, atomic propositions can be combined into more complex propositions using symbols that represent their logical relationships (such as “If…, then…”). These symbols are called “operators” or “connectives.” The five standard operators in basic propositional logic are:
Operator/Connective | Symbol | Example | Translation
“not” | ~ or ¬ | It is not the case that p. | ~p
“and” | & or • | Both p and q. | p & q
“or” | v | Either p or q. | p v q
“If…, then…” | → or ⊃ | If p, then q. | p ⊃ q
“if and only if” | ≡ or ⬌ or iff | p if and only if q. | p ≡ q
These operations allow us to identify valid relations among propositions: that is, they allow us to formulate a set of rules by which we can validly infer propositions from and validly replace them with others. These rules of inference (such as modus ponens; modus tollens; disjunctive syllogism) and rules of replacement (such as double negation; contraposition; DeMorgan’s Law) comprise the syntax of propositional logic, guaranteeing the validity of the arguments employing them.
Two Rules of Inference:
Conjunction

English Argument | Argument Form | Propositional Translation
1. It is raining. | 1. p | 1. R
2. It is windy. | 2. q | 2. W
3. Therefore, it is raining and it is windy. | 3. /.: (p & q) | 3. /.: (R & W)

Disjunctive Syllogism

English Argument | Argument Form | Propositional Translation
1. Either it is raining or my car is dirty. | 1. (p v q) | 1. (R v C)
2. My car is not dirty. | 2. ~q | 2. ~C
3. Therefore, it is raining. | 3. /.: p | 3. /.: R
Two Rules of Replacement:
Material Implication

English Statement | Replacement Form | Propositional Translation
If it is raining, then the sidewalk is wet if and only if either it is not raining or the sidewalk is wet. | (p ⊃ q) ≡ (~p v q) | (R ⊃ W) ≡ (~R v W)

DeMorgan’s Laws

English Statement | Replacement Form | Propositional Translation
It is not the case that both the job is a good fit for you and you hate it if and only if either it is not a good fit for you or you do not hate it. | ~(p & q) ≡ (~p v ~q) | ~(F & H) ≡ (~F v ~H)
It is not the case that he is either a lawyer or a nice guy if and only if he is neither a lawyer nor a nice guy. | ~(p v q) ≡ (~p & ~q) | ~(L v N) ≡ (~L & ~N)
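As an illustrative check (added here, not part of the formal system itself), a few lines of Python confirm that the two sides of DeMorgan’s first law agree in every row of their truth table, which is exactly what licenses replacing one with the other:

```python
from itertools import product

# Verify that ~(p & q) and (~p v ~q) have the same truth value in every row.
for p, q in product([True, False], repeat=2):
    lhs = not (p and q)        # ~(p & q)
    rhs = (not p) or (not q)   # (~p v ~q)
    assert lhs == rhs          # the two formulas never disagree

print("~(p & q) and (~p v ~q) agree in all four rows, so each may replace the other.")
```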
Standard propositional logic does not capture every type of proposition we wish to express (recall that it does not allow us to evaluate categorical quantifiers such as “all” or “some”). It also does not allow us to evaluate propositions expressed as possibly true or necessarily true, modifications that are called modal operators or modal quantifiers.
c. Modal Logic

Modal logic refers to a family of formal propositional systems, the most prominent of which includes operators for necessity (□) and possibility (◊) (see §3e below for examples of other modal systems). If a proposition, p, is possibly true, ◊p, it may or may not be true. If p is necessarily true, □p, it must be true; it cannot be false. If p is necessarily false, either ~◊p or □~p, it must be false; it cannot be true.
There is a variety of modal systems, the weakest of which is called K (after Saul Kripke, who exerted important influence on the development of modal logic), and it involves only two additional rules:
Necessitation Rule: If A is a theorem of K, then so is □A.
Distribution Axiom: □(A⊃B) ⊃ (□A⊃□B). [If it is necessarily the case that if A, then B, then if it is necessarily the case that A, it is necessarily the case that B.]
Other systems maintain these rules and add others for increasing strength. For instance, the (S4) modal system includes axiom (4):
(4) □A ⊃ □□A [If it is necessarily the case that A, then it is necessarily necessary that A.]
An influential and intuitive way of thinking about modal concepts is the idea of “possible worlds” (see Plantinga, 1974; Lewis 1986). A world is just the set of all true propositions. The actual world is the set of all actually true propositions—everything that was true, is true, and (depending on what you believe about the future) will be true. A possible world is a way the actual world might have been. Imagine you wore green underwear today. The actual world might have been different in that way: you might have worn blue underwear. In this interpretation of modal quantifiers, there is a possible world in which you wore blue underwear instead of green underwear. And for every possibility like this, and every combination of those possibilities, there is a distinct possible world.
If a proposition is not possible, then there is no possible world in which that proposition is true. The statement, “That object is red all over and blue all over at the same time” is not true in any possible world. Therefore, it is not possible (~◊P), or, in other words, necessarily false (□~P). If a proposition is true in all possible worlds, it is necessarily true. For instance, the proposition, “Two plus two equals four,” is true in all possible worlds, so it is necessarily true (□P) or not possibly false (~◊~P).
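The possible-worlds interpretation lends itself to a simple programmatic sketch. The following Python fragment is illustrative only: the worlds and propositions are invented, and accessibility relations are ignored (as in the simplest modal settings), so necessity is just truth in all worlds and possibility is truth in some world.

```python
# A toy possible-worlds model: each world records which atomic
# propositions are true at it. All names here are invented.
worlds = {
    "actual":  {"wore_green_underwear": True,  "two_plus_two_is_four": True},
    "world_2": {"wore_green_underwear": False, "two_plus_two_is_four": True},
    "world_3": {"wore_green_underwear": False, "two_plus_two_is_four": True},
}

def necessarily(p):
    return all(facts[p] for facts in worlds.values())   # □p: true in every world

def possibly(p):
    return any(facts[p] for facts in worlds.values())   # ◊p: true in some world

print(possibly("wore_green_underwear"))       # True: true in at least one world
print(necessarily("wore_green_underwear"))    # False: false in some worlds
print(necessarily("two_plus_two_is_four"))    # True: true in every world
```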
All modal systems have a number of controversial implications, and there is not space to review them here. Here we need only note that modal logic is a type of formal reasoning that increases the power of propositional logic to capture more of what we attempt to express in natural languages. (For more, see “Modal Logic: A Contemporary View.”)
d. Predicate Logic
Predicate logic, in particular, first-order predicate logic, is even more powerful than propositional logic. Whereas propositional logic treats propositions as atomic wholes, predicate logic allows reasoners to identify and refer to subjects of propositions, independently of their predicates. For instance, whereas the proposition, “Susan is witty,” would be replaced with a single upper-case letter, say “S,” in propositional logic, predicate logic would assign the subject “Susan” a lower-case letter, s, and the predicate “is witty” an upper-case letter, W, and the translation (or formula) would be: Ws.
In addition to distinguishing subjects and predicates, first-order predicate logic allows reasoners to quantify over subjects. The quantifiers in predicate logic are “All…,” which is comparable to the “All” quantifier in categorical logic and is sometimes symbolized with an upside-down A: ∀ (though it may not be symbolized at all), and “There is at least one…,” which is comparable to the “Some” quantifier in categorical logic and is symbolized with a backward E: ∃. E and O claims are formed by employing the negation operator from propositional logic. In this formal system, the proposition, “Someone is witty,” for example, has the form: There is an x, such that x has the property of being witty, which is symbolized: (∃x)(Wx). Similarly, the proposition, “Everyone is witty,” has the form: For all x, x has the property of being witty, which is symbolized (∀x)(Wx) or, without the ∀: (x)(Wx).
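A short sketch shows how these quantifiers behave over a small finite domain (the names and the extension of the predicate are invented for illustration): Python’s built-ins any and all mirror ∃ and ∀ restricted to that domain.

```python
# An invented domain and an invented extension for the predicate W ("is witty").
people = ["susan", "tom", "ada"]
witty = {"susan", "ada"}

def W(x):
    return x in witty                   # Wx: "x is witty"

print(any(W(x) for x in people))        # (∃x)(Wx): "Someone is witty"  -> True
print(all(W(x) for x in people))        # (∀x)(Wx): "Everyone is witty" -> False
```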
Predicate derivations are conducted according to the same rules of inference and replacement as propositional logic, with the addition of four rules for adding and eliminating quantifiers.
Second-order predicate logic extends first-order predicate logic to allow critical thinkers to quantify over and draw inferences about subjects and predicates, including relations among subjects and predicates. In both first- and second-order logic, predicates typically take the form of properties (one-place predicates) or relations (two-place predicates), though there is no upper limit on place numbers. Second-order logic allows us to treat both as falling under quantifiers, as in “everything that is (specifically, that has the property of being) a tea cup” and “everything that is a bachelor is unmarried.”
e. Other Formal Systems
It is worth noting here that the formal reasoning systems we have seen thus far (categorical, propositional, and predicate) all presuppose that truth is bivalent, that is, two-valued. The two values critical thinkers are most often concerned with are true and false, but any bivalent system is subject to the rules of inference and replacement of propositional logic. The most common alternative pair of values is the binary code of 1s and 0s used in computer programming. All logics that presuppose bivalence are called classical logics. In the next section, we see that not all formal systems are bivalent; there are non-classical logics. The existence of non-classical systems raises interesting philosophical questions about the nature of truth and the legitimacy of our basic rules of reasoning, but these questions are too far afield for this context. Many philosophers regard bivalent systems as legitimate for all but the most abstract and purely formal contexts. Included below is a brief description of three of the most common non-classical logics.
Tense logic, or temporal logic, is a formal modal system developed by Arthur Prior (1957, 1967, 1968) to accommodate propositional language about time. For example, in addition to standard propositional operators, tense logic includes four operators for indexing times: P “It has at some time been the case that…”; F “It will at some time be the case that…”; H “It has always been the case that…”; and G “It will always be the case that….”
Many-valued logic, or n-valued logic, is a family of formal logical systems that attempts to accommodate intuitions that suggest some propositions have values in addition to true and false. These are often motivated by intuitions that some propositions have neither of the classic truth values; their truth value is indeterminate (not just undeterminable, but neither true nor false), for example, propositions about the future such as, “There will be a sea battle tomorrow.” If the future does not yet exist, there is no fact about the future, and therefore, nothing for a proposition to express.
Fuzzy logic is a type of many-valued logic developed out of Lotfi Zadeh’s (1965) work on mathematical sets. Fuzzy logic attempts to accommodate intuitions that suggest some propositions have truth value in degrees, that is, some degree of truth between true and false. It is motivated by concerns about vagueness in reality, for example whether a certain color is red or some degree of red, or whether some temperature is hot or some degree of hotness.
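A minimal sketch of the fuzzy idea, using Zadeh’s standard min/max operators; the degrees of truth assigned here are invented for illustration:

```python
# Each proposition receives a degree of truth between 0 and 1 (invented values).
hot = 0.75     # "It is hot" is mostly true
humid = 0.5    # "It is humid" is exactly half true

print(1 - hot)           # negation ~p: 0.25
print(min(hot, humid))   # conjunction p & q (Zadeh's min): 0.5
print(max(hot, humid))   # disjunction p v q (Zadeh's max): 0.75
```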
Formal reasoning plays an important, though infrequent, role in critical thinking. There are significant limits to how we might use formal tools in our daily lives. If that is true, how do critical thinkers reason well when formal reasoning cannot help? That brings us to informal reasoning.
4. Informal Reasoning
Informal reasoning is inductive, which means that a proposition is inferred (but not derived) from one or more propositions on the basis of the strength provided by the premises (where “strength” means some degree of likelihood less than certainty or some degree of probability less than 1 but greater than 0; a proposition with 0% probability is necessarily false).
Particular premises grant strength to conclusions to the degree that they reflect certain relationships or structures in the world. For instance, if a particular type of event, p, is known to cause or indicate another type of event, q, then upon encountering an event of type p, we may infer that an event of type q is likely to occur. We may express this relationship among events propositionally as follows:
Events of type p typically cause or indicate events of type q.
An event of type p occurred.
Therefore, an event of type q probably occurred.
If the structure of the world (for instance, natural laws) makes premise 1 true, then, if premise 2 is true, we can reasonably (though not certainly) infer the conclusion.
Unlike formal reasoning, the adequacy of informal reasoning depends on how well the premises reflect relationships or structures in the world. And since we have not experienced every relationship among objects or events or every structure, we cannot infer with certainty that a particular conclusion follows from a true set of premises about these relationships or structures. We can only infer them to some degree of likelihood by determining to the best of our ability either their objective probability or their probability relative to alternative conclusions.
The objective probability of a conclusion refers to how likely, given the way the world is regardless of whether we know it, that conclusion is to be true. The epistemic probability of a conclusion refers to how likely that conclusion is to be true given what we know about the world, or more precisely, given our evidence for its objective likelihood.
Objective probabilities are determined by facts about the world and they are not truths of logic, so we often need evidence for objective probabilities. For instance, imagine you are about to draw a card from a standard playing deck of 52 cards. Given particular assumptions about the world (that this deck contains 52 cards and that one of them is the Ace of Spades), the objective likelihood that you will draw an Ace of Spades is 1/52. These assumptions allow us to calculate the objective probability of drawing an Ace of Spades regardless of whether we have ever drawn a card before. But these are assumptions about the world that are not guaranteed by logic: we have to actually count the cards, be sure we count accurately and are not dreaming or hallucinating, and trust that our memory (once we have finished counting) reliably maintains our conclusions. None of these processes logically guarantees true beliefs. So, if our assumptions are correct, we know the objective probability of actually drawing an Ace of Spades in the real world. But since there is no logical guarantee that our assumptions are right, we are left only with the epistemic probability (the probability based on our evidence) of drawing that card. If our assumptions are right, then the objective probability is the same as our epistemic probability: 1/52. But even if we are right, objective and epistemic probabilities can come apart under some circumstances.
Imagine you draw a card without looking at it and lay it face down. What is the objective probability that that card is an Ace of Spades? The structure of the world has now settled the question, though you do not know the outcome. If it is an Ace of Spades, the objective probability is 1 (100%); it is the Ace of Spades. If it is not the Ace of Spades, the objective probability is 0 (0%); it is not the Ace of Spades. But what is the epistemic probability? Since you do not know any more about the world than you did before you drew the card, the epistemic probability is the same as before you drew it: 1/52.
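The arithmetic of the card case can be illustrated with a short simulation (a sketch only; the deck’s composition and the number of trials are the only assumptions): someone who knows just the setup should assign 1/52 whether or not the card has already been drawn, and that assignment matches the long-run frequency of the draw.

```python
import random

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for suit in ["spades", "hearts", "diamonds", "clubs"]
        for rank in ranks]

trials = 100_000
hits = sum(random.choice(deck) == ("A", "spades") for _ in range(trials))
print(hits / trials)   # roughly 0.019, i.e., about 1/52
```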
Since much of the way the world is is hidden from us (like the card laid face down), and since it is not obvious that we perceive reality as it actually is (we do not know whether the actual coins we flip are evenly weighted or whether the actual dice we roll are unbiased), our conclusions about probabilities in the actual world are inevitably epistemic probabilities. We can certainly calculate objective probabilities about abstract objects (for instance, hypothetically fair coins and dice—and these calculations can be evaluated formally using probability theory and statistics), but as soon as we apply these calculations to the real world, we must accommodate the fact that our evidence is incomplete.
There are four well-established categories of informal reasoning: generalization, analogy, causal reasoning, and abduction.
a. Generalization
Generalization is a way of reasoning informally from instances of a type to a conclusion about the type. This commonly takes two forms: reasoning from a sample of a population to the whole population, and reasoning from past instances of an object or event to future instances of that object or event. The latter is sometimes called “enumerative induction” because it involves enumerating past instances of a type in order to draw an inference about a future instance. But this distinction is weak; both forms of generalization use past or current data to infer statements about future instances and whole current populations.
A popular instance of inductive generalization is the opinion poll: a sample of a population of people is polled with respect to some statement or belief. For instance, if we poll 57 sophomores enrolled at a particular college about their experiences of living in dorms, these 57 comprise our sample of the population of sophomores at that particular college. We want to be careful how we define our population given who is part of our sample. Not all college students are like sophomores, so it is not prudent to draw inferences about all college students from these sophomores. Similarly, sophomores at other colleges are not necessarily like sophomores at this college (it could be the difference between a liberal arts college and a research university), so it is prudent not to draw inferences about all sophomores from this sample at a particular college.
Let us say that 90% of the 57 sophomores we polled hate the showers in their dorms. From this information, we might generalize in the following way:
We polled 57 sophomores at Plato’s Academy. (the sample)
90% of our sample hates the showers in their dorms. (the polling data)
Therefore, probably 90% of all sophomores at Plato’s Academy hate the showers in their dorms. (a generalization from our sample to the whole population of sophomores at Plato’s Academy)
Is this good evidence that 90% of all sophomores at that college hate the showers in their dorms?
A generalization is typically regarded as a good argument if its sample is representative of its population. A sample is representative if it is similar in the relevant respects to its population. A perfectly representative sample would include the whole population: the sample would be identical with the population, and thus, perfectly representative. In that case, no generalization is necessary. But we rarely have the time or resources to evaluate whole populations. And so, a sample is generally regarded as representative if it is large relative to its population and unbiased.
In our example, whether our inference is good depends, in part, on how many sophomores there are. Are there 100, 2,000? If there are only 100, then our sample size seems adequate—we have polled over half the population. Is our sample unbiased? That depends on the composition of the sample. Is it composed only of women or only of men? If this college is not co-ed, that is not a problem. But if the college is co-ed and we have sampled only women, our sample is biased against men. We have information only about female sophomores’ dorm experiences, and therefore, we cannot generalize about male sophomores’ dorm experiences.
How large is large enough? This is a difficult question to answer. A poll of 1% of your high school does not seem large enough to be representative. You should probably gather more data. Yet a poll of 1% of your whole country is practically impossible (you are not likely to ever have enough grant money to conduct that poll). But could a poll of less than 1% be acceptable? This question is not easily answered, even by experts in the field. The simple answer is: the more, the better. The more complicated answer is: it depends on how many other factors you can control for, such as bias and hidden variables (see §4c for more on experimental controls).
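One way to see why “the more, the better” holds is to simulate polls of different sizes against a population whose true rate we stipulate. The following sketch uses invented numbers (1,000 sophomores, 90% of whom “truly” hate the showers) and measures how far, on average, a poll’s estimate strays from that true rate.

```python
import random

# An invented population: 90% of 1,000 sophomores hate the showers.
population = [True] * 900 + [False] * 100

def average_error(n, trials=2_000):
    # Average distance between the sample's estimate and the true rate (0.9).
    total = 0.0
    for _ in range(trials):
        sample = random.sample(population, n)
        total += abs(sum(sample) / n - 0.9)
    return total / trials

for n in (10, 57, 250):
    print(n, round(average_error(n), 3))   # the average error shrinks as n grows
```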
Similarly, we might ask what counts as an unbiased sample. An overly simple answer is: the sample is taken randomly, that is, by using a procedure that prevents consciously or unconsciously favoring one segment of the population over another (flipping a coin, drawing lottery balls). But reality is not simple. In political polls, it is important not to use a selection procedure that results in a sample with a larger number of members of one political party than another relative to their distribution in the population, even if the resulting sample is random. For example, the two most prominent parties in the U.S. are the Democratic Party and the Republican Party. If 47% of the U.S. is Republican and 53% is Democrat, an unbiased sample would have approximately 47% Republicans and 53% Democrats. But notice that simply choosing at random may not guarantee that result; it could easily occur, just by choosing randomly, that our sample has 70% Democrats and 30% Republicans (suppose our computer chose, albeit randomly, from a highly Democratic neighborhood). Therefore, we want to control for representativeness in some criteria, such as gender, age, and education. And we explicitly want to avoid controlling for the results we are interested in; if we controlled for particular answers to the questions on our poll, we would not learn anything—we would get all and only the answers we controlled for.
Difficulties determining representativeness suggest that reliable generalizations are not easy to construct. If we generalize on the basis of samples that are too small or if we cannot control for bias, we commit the informal fallacy of hasty generalization (see §5b). In order to generalize well, it seems we need a bit of machinery to guarantee representativeness. In fact, it seems we need an experiment, one of the primary tools in causal reasoning (see §4c below).
b. Analogy
Argument from Analogy, also called analogical reasoning, is a way of reasoning informally about events or objects based on their similarities. A classic instance of reasoning by analogy occurs in archaeology, when researchers attempt to determine whether a stone object is an artifact (a human-made item) or simply a rock. By comparing the features of an unknown stone with well-known artifacts, archaeologists can infer whether a particular stone is an artifact. Other examples include identifying animals’ tracks by their similarities with pictures in a guidebook and consumer reports on the reliability of products.
To see how arguments from analogy work in detail, imagine two people who, independently of one another, want to buy a new pickup truck. Each chooses a make and model he or she likes, and let us say they decide on the same truck. They then visit a number of consumer reporting websites to read reports on trucks matching the features of the make and model they chose, for instance, the year it was built, the size of the engine (6 cyl. or 8 cyl.), the type of transmission (2WD or 4WD), the fuel mileage, and the cab size (standard, extended, crew). Now, let us say one of our prospective buyers is interested in safety—he or she wants a tough, safe vehicle that will protect against injuries in case of a crash. The other potential buyer is interested in mechanical reliability—he or she does not want to spend a lot of time and money fixing mechanical problems.
With this in mind, here is how our two buyers might reason analogically about whether to purchase the truck (with some fake report data included):
Buyer 1
The truck I have in mind was built in 2012, has a 6-cylinder engine, a 2WD transmission, and a king cab.
62 people who bought trucks like this one posted consumer reports and have driven it for more than a year.
88% of those 62 people report that the truck feels very safe.
Therefore, the truck I am looking at will likely be very safe.
Buyer 2
The truck I have in mind was built in 2012, has a 6-cylinder engine, a 2WD transmission, and a king cab.
62 people who bought trucks like this one posted consumer reports and have driven it for more than a year.
88% of those 62 people report that the truck has had no mechanical problems.
Therefore, the truck I am looking at will likely have no mechanical problems.
Are the features of these analogous vehicles (the ones reported on) sufficiently numerous and relevant for helping our prospective truck buyers decide whether to purchase the truck in question (the one on the lot)? Since we have some idea that the type of engine and transmission in a vehicle contribute to its mechanical reliability, Buyer 2 may have some relevant features on which to draw a reliable analogy. Fuel mileage and cab size are not obviously relevant, but engine specifications seem to be. Are these specifications numerous enough? That depends on whether anything else that we are not aware of contributes to overall reliability. Of course, if the trucks that have the features we know about also have all the other relevant features we do not know about (if there are any), then Buyer 2 may still be able to draw a reliable inference from analogy. But we do not currently know this.
Alternatively, Buyer 1 seems to have very few relevant features on which to draw a reliable analogy. The features listed are not obviously related to safety. Are there safety options a buyer may choose but that are not included in the list? For example, can a buyer choose side-curtain airbags, or do such airbags come standard in this model? Does cab size contribute to overall safety? Although there are a number of similarities between the trucks, it is not obvious that we have identified features relevant to safety or whether there are enough of them. Further, reports of “feeling safe” are not equivalent to a truck actually being safe. Better evidence would be crash test data or data from actual accidents involving this truck. This information is not likely to be on a consumer reports website.
A further difficulty is that, in many cases, when the similarities are relevant, many similarities may not be necessary. For instance, if having lots of room for passengers is your primary concern, then any other features are relevant only insofar as they affect cab size, and the features that affect cab size may be relatively few.
This example shows that arguments from analogy are difficult to formulate well. Arguments from analogy can be good arguments when critical thinkers identify a sufficient number of features of known objects that are also relevant to the feature inferred to be shared by the object in question. If a rock is shaped like a cutting tool, has marks consistent with shaping and sharpening, and has wear marks consistent with being held in a human hand, it is likely that rock is an artifact. But not all cases are as clear.
It is often difficult to determine whether the features we have identified are sufficiently numerous or relevant to our interests. To determine whether an argument from analogy is good, a person may need to identify a causal relationship between those features and the one in which she is interested (as in the case with a vehicle’s mechanical reliability). This usually takes the form of an experiment, which we explore below (§4c).
Difficulties with constructing reliable generalizations and analogies have led critical thinkers to develop sophisticated methods for controlling for the ways these arguments can go wrong. The most common way to avoid the pitfalls of these arguments is to identify the causal structures in the world that account for or underwrite successful generalizations and analogies. Causal arguments are the primary method of controlling for extraneous causal influences and identifying relevant causes. Their development and complexity warrant regarding them as a distinct form of informal reasoning.
c. Causal Reasoning
Causal arguments attempt to draw causal conclusions (that is, statements that express propositions about causes: x causes y) from premises about relationships among events or objects. Though it is not always possible to construct a causal argument, when available, they have an advantage over other types of inductive arguments in that they can employ mechanisms (experiments) that reduce the risks involved in generalizations and analogies.
The interest in identifying causal relationships often begins with the desire to explain correlations among events (as pollen levels increase, so do allergy symptoms) or with the desire to replicate an event (building muscle, starting a fire) or to eliminate an event (polio, head trauma in football).
Correlations among events may be positive (where each event increases at roughly the same rate) or negative (where one event decreases in proportion to another’s increase). Correlations suggest a causal relationship among the events correlated.
But we must be careful; correlations are merely suggestive—other forces may be at work. Imagine a chart whose y-axis represents the number of millionaires in the U.S. and whose x-axis represents the amount of money U.S. citizens pay for healthcare each year. Without further analysis, a positive correlation between these two may lead someone to conclude that increasing wealth causes people to be more health conscious and to seek medical treatment more often. A negative correlation may lead someone to conclude that wealth makes people healthier and, therefore, that they need to seek medical care less frequently.
Unfortunately, correlations can occur without any causal structures (mere coincidence) or because of a third, as-yet-unidentified event (a cause common to both events, or “common cause”), or the causal relationship may flow in an unexpected direction (what seems like the cause is really the effect). In order to determine precisely which event (if any) is responsible for the correlation, reasoners must eliminate possible influences on the correlation by “controlling” for possible influences on the relationship (variables).
Critical thinking about causes begins by constructing hypotheses about the origins of particular events. A hypothesis is an explanation or event that would account for the event in question. For example, if the question is how to account for increased acne during adolescence, and we are not aware of the existence of hormones, we might formulate a number of hypotheses about why this happens: during adolescence, people’s diets change (parents no longer dictate their meals), so perhaps some types of food cause acne; during adolescence, people become increasingly anxious about how they appear to others, so perhaps anxiety or stress causes acne; and so on.
After we have formulated a hypothesis, we identify a test implication that will help us determine whether our hypothesis is correct. For instance, if some types of food cause acne, we might choose a particular food, say, chocolate, and say: if chocolate causes acne (hypothesis), then decreasing chocolate will decrease acne (test implication). We then conduct an experiment to see whether our test implication occurs.
Reasoning about our experiment would then look like one of the following arguments:
Confirming Experiment

1. If H, then TI.
2. TI.
3. Therefore, probably H.

Disconfirming Experiment

1. If H, then TI.
2. Not-TI.
3. Therefore, probably Not-H.
There are a couple of important things to note about these arguments. First, despite appearances, both are inductive arguments. The confirming argument commits the formal fallacy of affirming the consequent, so, at best, the premises confer only some degree of probability on the conclusion. The disconfirming argument looks to be deductive (on the face of it, it has the valid form modus tollens), but it would be inappropriate to regard it deductively. This is because we are not evaluating a logical connection between H and TI, we are evaluating a causal connection—TI might be true or false regardless of H (we might have chosen an inappropriate test implication or simply gotten lucky), and therefore, we cannot conclude with certainty that H does not causally influence TI. Therefore, “If…, then…” statements in experiments must be read as causal conditionals and not material conditionals (the term for how we used conditionals above).
Second, experiments can go wrong in many ways, so no single experiment will grant a high degree of probability to its causal conclusion. Experiments may be biased by hidden variables (causes we did not consider or detect, such as age, diet, medical history, or lifestyle), auxiliary assumptions (the theoretical assumptions by which evaluating the results may be faulty), or underdetermination (there may be a number of hypotheses consistent with those results; for example, if it is actually sugar that causes acne, then chocolate bars, ice cream, candy, and sodas would yield the same test results). Because of this, experiments either confirm or disconfirm a hypothesis; that is, they give us some reason (but not a particularly strong reason) to believe our hypothesized causes are or are not the causes of our test implications, and therefore, of our observations (see Quine and Ullian, 1978). Because of this, experiments must be conducted many times, and only after we have a number of confirming or disconfirming results can we draw a strong inductive conclusion. (For more, see “Confirmation and Induction.”)
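One standard way to model how confirmations accumulate across repeated experiments (a sketch, not the article’s own framework, and the numbers below are invented) is Bayes’ theorem: each confirming result raises the probability of the hypothesis only modestly, which is why many trials are needed before a strong conclusion is warranted.

```python
# H: "chocolate causes acne"; TI: "reducing chocolate reduces acne."
prior = 0.2              # initial credence in H (invented)
p_ti_given_h = 0.9       # chance TI occurs if H is true (invented)
p_ti_given_not_h = 0.4   # chance TI occurs anyway: hidden variables, luck (invented)

credence = prior
for trial in range(1, 4):                # three confirming experiments in a row
    # Total probability of observing TI, then Bayes' theorem.
    p_ti = p_ti_given_h * credence + p_ti_given_not_h * (1 - credence)
    credence = p_ti_given_h * credence / p_ti
    print(trial, round(credence, 3))     # 0.36, then 0.559, then 0.74
```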
Experiments may be formal or informal. In formal experiments, critical thinkers exert explicit control over experimental conditions: experimenters choose participants, include or exclude certain variables, and identify or introduce hypothesized events. Test subjects are selected according to control criteria (criteria that may affect the results and, therefore, that we want to mitigate, such as age, diet, and lifestyle) and divided into control groups (groups where the hypothesized cause is absent) and experimental groups (groups where the hypothesized cause is present, either because it is introduced or selected for).
Subjects are then placed in experimental conditions. For instance, in a randomized study, the control group receives a placebo (an inert medium) whereas the experimental group receives the hypothesized cause—the putative cause is introduced, the groups are observed, and the results are recorded and compared. When a hypothesized cause is dangerous (such as smoking) or its effects potentially irreversible (for instance, post-traumatic stress disorder), the experimental design must be restricted to selecting for the hypothesized cause already present in subjects, for example, in retrospective (backward-looking) and prospective (forward-looking) studies. In all types of formal experiments, subjects are observed under exposure to the test or placebo conditions for a specified time, and results are recorded and compared.
In informal experiments, critical thinkers do not have access to sophisticated equipment or facilities and, therefore, cannot exert explicit control over experimental conditions. They are left to make considered judgments about variables. The most common informal experiments are John Stuart Mill’s five methods of inductive reasoning, called Mill’s Methods, which he first formulated in A System of Logic (1843). Here is a very brief summary of Mill’s five methods, with a small programmatic sketch of the first two following the examples:
(1) The Method of Agreement
If all conditions containing the event y also contain x, x is probably the cause of y.
For example:
“I’ve eaten from the same box of cereal every day this week, but all the times I got sick after eating cereal were times when I added strawberries. Therefore, the strawberries must be bad.”
(2) The Method of Difference
If all conditions lacking y also lack x, x is probably the cause of y.
For example:
“The organization turned all its tax forms in on time for years, that is, until our comptroller, George, left; after that, we were always late. Only after George left were we late. Therefore, George was probably responsible for getting our tax forms in on time.”
(3) The Joint Method of Agreement and Difference
If all conditions containing event y also contain event x, and all events lacking y also lack x, x is probably the cause of y.
For example:
“The conditions at the animal shelter have been pretty regular, except we had a string of about four months last year when the dogs barked all night, every night. But at the beginning of those four months we sheltered a redbone coonhound, and the barking stopped right after a family adopted her. All the times the redbone hound wasn’t present, there was no barking. Only the time she was present was there barking. Therefore, she probably incited all the other dogs to bark.”
(4) The Method of Concomitant Variation
If the frequency of event y increases and decreases as event x increases and decreases, respectively, x is probably the cause of y.
For example:
“We can predict the amount of alcohol sales by the rate of unemployment. As unemployment rises, so do alcohol sales. As unemployment drops, so do alcohol sales. Last quarter marked the highest unemployment in three years, and our sales last quarter are the highest they had been in those three years. Therefore, unemployment probably causes people to buy alcohol.”
(5) The Method of Residues
If a number of factors x, y, and z, may be responsible for a set of events A, B, and C, and if we discover reasons for thinking that x is the cause of A and y is the cause of B, then we have reason to believe z is the cause of C.
For example:
“The people who come through this medical facility are usually starving and have malaria, and a few have polio. We are particularly interested in treating the polio. Take this patient here: she is emaciated, which is caused by starvation; and she has a fever, which is caused by malaria. But notice that her muscles are deteriorating, and her bones are sore. This suggests she also has polio.”
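As promised above, here is an illustrative Python sketch of the first two methods (the observation data are invented). Each record pairs the conditions present with whether the effect (getting sick) occurred; applying the Method of Agreement and then the Method of Difference in sequence is, in effect, the joint method.

```python
# Invented observations for the cereal example.
observations = [
    ({"cereal", "strawberries"}, True),   # got sick
    ({"cereal", "strawberries"}, True),   # got sick
    ({"cereal"}, False),                  # did not get sick
    ({"cereal", "milk"}, False),          # did not get sick
]

# Method of Agreement: keep factors present in every case where the effect occurred.
agreeing = set.intersection(*(conds for conds, sick in observations if sick))

# Method of Difference: discard factors that also appear when the effect is absent.
present_without_effect = set.union(*(conds for conds, sick in observations if not sick))
probable_causes = agreeing - present_without_effect

print(probable_causes)   # {'strawberries'}: the probable cause, per the joint method
```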
d. Abduction
Not all inductive reasoning is inferential. In some cases, an explanation is needed before we can even begin drawing inferences. Consider Darwin’s idea of natural selection. Natural selection is not an object, like a blood vessel or a cellular wall, and it is not, strictly speaking, a single event. It cannot be detected in individual organisms or observed in a generation of offspring. Natural selection is an explanation of biodiversity that combines the process of heritable variation and environmental pressures to account for biomorphic change over long periods of time. With this explanation in hand, we can begin to draw some inferences. For instance, we can separate members of a single species of fruit flies, allow them to reproduce for several generations, and then observe whether the offspring of the two groups can reproduce. If we discover they cannot reproduce, this is likely due to certain mutations in their body types that prevent them from procreating. And since this is something we would expect if natural selection were true, we have one piece of confirming evidence for natural selection. But how do we know the explanations we come up with are worth our time?
Coined by C. S. Peirce (1839-1914), abduction, also called retroduction, or inference to the best explanation, refers to a way of reasoning informally that provides guidelines for evaluating explanations. Rather than appealing to types of arguments (generalization, analogy, causation), the value of an explanation depends on the theoretical virtues it exemplifies. A theoretical virtue is a quality that renders an explanation more or less fitting as an account of some event. What constitutes fittingness (or “loveliness,” as Peter Lipton (2004) calls it) is controversial, but many of the virtues are intuitively compelling, and abduction is a widely accepted tool of critical thinking.
The most widely recognized theoretical virtue is probably simplicity, historically associated with William of Ockham (1288-1347) and known as Ockham’s Razor. Legend has it that Ockham was asked whether his arguments for God’s existence prove that only one God exists or whether they allow for the possibility that many gods exist. He supposedly responded, “Do not multiply entities beyond necessity.” Though this claim is not found in his writings, Ockham is now famous for advocating that we restrict our beliefs about what is true to only what is absolutely necessary for explaining what we observe.
In contemporary theoretical use, the virtue of simplicity is invoked to encourage caution in how many mechanisms we introduce to explain an event. For example, if natural selection can explain the origin of biological diversity by itself, there is no need to hypothesize both natural selection and a divine designer. But if natural selection cannot explain the origin of, say, the duck-billed platypus, then some other mechanism must be introduced. Of course, not just any mechanism will do. It would not suffice to say the duck-billed platypus is explained by natural selection plus gremlins. Just why this is the case depends on other theoretical virtues; ideally, the virtues work together to help critical thinkers decide among competing hypotheses to test. Here is a brief sketch of some other theoretical virtues or ideals, followed by a toy illustration of how they might be weighed together:
Conservatism – a good explanation does not contradict well-established views in a field.
Independent Testability – a good explanation is successful on different occasions under similar circumstances.
Fecundity – a good explanation leads to results that make even more research possible.
Explanatory Depth – a good explanation provides details of how an event occurs.
Explanatory Breadth – a good explanation also explains other, similar events.
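Here is the toy illustration promised above: a Python sketch that scores two competing explanations of biodiversity against equally weighted virtues. Every rating and weight is invented, and real abductive judgment is qualitative rather than arithmetical; the sketch only makes the trade-offs explicit.

```python
# Toy scoring of competing explanations against theoretical virtues.
# All ratings (0-10) and weights are invented for illustration.
virtues = ["simplicity", "conservatism", "testability",
           "fecundity", "depth", "breadth"]
weights = {v: 1.0 for v in virtues}  # equal weights, for simplicity

ratings = {
    "natural selection":
        {"simplicity": 8, "conservatism": 9, "testability": 9,
         "fecundity": 10, "depth": 8, "breadth": 9},
    "natural selection + gremlins":
        {"simplicity": 2, "conservatism": 1, "testability": 2,
         "fecundity": 2, "depth": 3, "breadth": 3},
}

for hypothesis, scores in ratings.items():
    total = sum(weights[v] * scores[v] for v in virtues)
    print(f"{hypothesis}: {total:.0f}")
# The gremlin hypothesis loses on nearly every virtue, which is why
# "not just any mechanism will do."
```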
Though abduction is structurally distinct from other inductive arguments, it functions similarly in practice: a good explanation provides a probabilistic reason to believe a proposition. This is why it is included here as a species of inductive reasoning. It might be thought that explanations only function to help critical thinkers formulate hypotheses, and do not, strictly speaking, support propositions. But there are intuitive examples of explanations that support propositions independently of however else they may be used. For example, a critical thinker may argue that the hypothesis that material objects exist outside our minds better explains why we perceive what we do (and is, therefore, a reason to believe it) than the hypothesis that an evil demon is deceiving us, even if there is no inductive or deductive argument sufficient for believing that the latter hypothesis is false. (For more, see “Charles Sanders Peirce: Logic.”)
5. Detecting Poor Reasoning
Our attempts at thinking critically often go wrong, whether we are formulating our own arguments or evaluating the arguments of others. Sometimes it is in our interests for our reasoning to go wrong, such as when we would prefer someone to agree with us than to discover the truth value of a proposition. Other times it is not in our interests; we are genuinely interested in the truth, but we have unwittingly made a mistake in inferring one proposition from others. Whether our errors in reasoning are intentional or unintentional, such errors are called fallacies (from the Latin, fallax, which means “deceptive”). Recognizing and avoiding fallacies helps prevent critical thinkers from forming or maintaining defective beliefs.
Fallacies occur in a number of ways. An argument’s form may seem to us valid when it is not, resulting in a formal fallacy. Alternatively, an argument’s premises may seem to support its conclusion strongly but, due to some subtlety of meaning, do not, resulting in an informal fallacy. Additionally, some of our errors may be due to unconscious reasoning processes that may have been helpful in our evolutionary history, but do not function reliably in higher order reasoning. These unconscious reasoning processes are now widely known as heuristics and biases. Each type is briefly explained below.
a. Formal Fallacies
Formal fallacies occur when the form of an argument is presumed or seems to be valid (whether intentionally or unintentionally) when it is not. Formal fallacies are usually invalid variations of valid argument forms. Consider, for example, the valid argument form modus ponens (this is one of the rules of inference mentioned in §3b):
modus ponens (valid argument form)
1. p → q (If it is a cat, then it is a mammal.)
2. p (It is a cat.)
3. /.: q (Therefore, it is a mammal.)
In modus ponens, we assume or “affirm” both the conditional and the left half of the conditional (called the antecedent): (p → q) and p. From these, we can infer that q, the second half or consequent, is true. This is a valid argument form: if the premises are true, the conclusion cannot be false.
Sometimes, however, we invert the conclusion and the second premise, affirming that the conditional, (p → q), and the right half of the conditional, q (the consequent), are true, and then inferring that the left half, p (the antecedent), is true. Note in the example below how the conclusion and second premise are switched. Switching them in this way creates a problem.
modus ponens (valid argument form)
1. p → q
2. p
3. /.: q

affirming the consequent (formal fallacy)
1. p → q
2. q (q, the consequent of the conditional in premise 1, has been “affirmed” in premise 2)
3. /.: p (?)
To get an intuitive sense of why “affirming the consequent” is a problem, consider this simple example:
affirming the consequent
If it is a cat, then it is a mammal.
It is a mammal.
Therefore, it is a cat. (?)
From the fact that something is a mammal, we cannot conclude that it is a cat. It may be a dog or a mouse or a whale. The premises can be true and yet the conclusion can still be false. Therefore, this is not a valid argument form. But since it is an easy mistake to make, it is included in the set of common formal fallacies.
Here is a second example with the rule of inference called modus tollens. Modus tollens involves affirming a conditional, (p → q), and denying that conditional’s consequent: ~q. From these two premises, we can validly infer the denial of the antecedent: ~p. But if we switch the conclusion and the second premise, we get another fallacy, called denying the antecedent.
modus tollens (valid argument form)
1. p → q
2. ~q
3. /.: ~p

1. If it is a cat, then it is a mammal.
2. It is not a mammal.
3. Therefore, it is not a cat.

denying the antecedent (formal fallacy)
1. p → q
2. ~p (p, the antecedent of the conditional in premise 1, has been “denied” in premise 2)
3. /.: ~q (?)

1. If it is a cat, then it is a mammal.
2. It is not a cat.
3. Therefore, it is not a mammal. (?)
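The validity of all four forms can be checked mechanically by enumerating every assignment of truth values, which is what the following Python sketch does. It is the ordinary truth-table method written from scratch, not a call to any particular logic library.

```python
from itertools import product

def implies(p, q):
    # The material conditional p -> q is false only when p is true and q is false.
    return (not p) or q

# Each form maps an assignment (p, q) to (premises, conclusion).
forms = {
    "modus ponens":             lambda p, q: ([implies(p, q), p], q),
    "modus tollens":            lambda p, q: ([implies(p, q), not q], not p),
    "affirming the consequent": lambda p, q: ([implies(p, q), q], p),
    "denying the antecedent":   lambda p, q: ([implies(p, q), not p], not q),
}

# A form is valid if no assignment makes all premises true and the conclusion false.
for name, form in forms.items():
    counterexamples = [(p, q) for p, q in product([True, False], repeat=2)
                       if all(form(p, q)[0]) and not form(p, q)[1]]
    print(name, "- valid" if not counterexamples else f"- INVALID at {counterexamples}")
# Both fallacies fail at p=False, q=True: "it is a mammal" and "it is not a
# cat" can both be true while the fallacious conclusion is false.
```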
Technically, all inductive reasoning is formally fallacious: every inductive argument is deductively invalid. Nevertheless, since those who offer inductive arguments rarely presume they are valid, we do not regard them as reasoning fallaciously.
b. Informal Fallacies
Informal fallacies occur when the meaning of the terms used in the premises of an argument suggest a conclusion that does not actually follow from them (the conclusion either follows weakly or with no strength at all). Consider an example of the informal fallacy of equivocation, in which a word with two distinct meanings is used in both of its meanings:
Any law can be repealed by Congress.
Gravity is a law.
Therefore, gravity can be repealed by Congress.
In this case, the argument’s premises are true when the word “law” is rightly interpreted in each, but the conclusion does not follow because “law” has a different referent in premise 1 (political laws) than in premise 2 (a law of nature). This argument equivocates on the meaning of “law” and is, therefore, fallacious.
Consider, also, the informal fallacy of ad hominem, abusive, which occurs when an arguer appeals to a person’s character as a reason to reject her proposition:
“Elizabeth argues that humans do not have souls; they are simply material beings. But Elizabeth is a terrible person and often talks down to children and the elderly. Therefore, she could not be right that humans do not have souls.”
The argument might look like this:
Elizabeth is a terrible person and often talks down to children and the elderly.
Therefore, Elizabeth is not right that humans do not have souls.
The conclusion does not follow because whether Elizabeth is a terrible person is irrelevant to the truth of the proposition that humans do not have souls. Elizabeth’s argument for this statement is relevant, but her character is not.
Another way to evaluate this fallacy is to note that, as the argument stands, it is an enthymeme (see §2); it is missing a crucial premise, namely: If anyone is a terrible person, that person makes false statements. But this premise is clearly false. There are many ways in which one can be a terrible person, and not all of them imply that someone makes false statements. (In fact, someone could be terrible precisely because they are viciously honest.) Once we fill in the missing premise, we see the argument is not cogent because at least one premise is false.
Importantly, we face a number of informal fallacies on a daily basis, and without the ability to recognize them, their regularity can make them seem legitimate. Here are three others that only scratch the surface:
Appeal to the People: We are often encouraged to believe or do something just because everyone else does. We are encouraged to believe what our political party believes, what the people in our churches or synagogues or mosques believe, what people in our family believe, and so on. We are encouraged to buy things because they are “bestsellers” (lots of people buy them). But the fact that lots of people believe or do something is not, on its own, a reason to believe or do what they do.
Tu Quoque (You, too!): We are often discouraged from pursuing a conclusion or action if our own beliefs or actions are inconsistent with them. For instance, if someone attempts to argue that everyone should stop smoking, but that person smokes, their argument is often given less weight: “Well, you smoke! Why should everyone else quit?” But the fact that someone believes or does something inconsistent with what they advocate does not, by itself, discredit the argument. Hypocrites may have very strong arguments despite their personal inconsistencies.
Base Rate Neglect: It is easy to look at what happens after we do something or enact a policy and conclude that the act or policy caused those effects. Consider a law reducing speed limits from 75 mph to 55 mph in order to reduce highway accidents. And, in fact, in the three years after the reduction, highway accidents dropped 30%! This seems like a direct effect of the reduction. However, this is not the whole story. Imagine you looked back at the three years prior to the law and discovered that accidents had dropped 30% over that time, too. If that happened, it might not actually be the law that caused the reduction in accidents. The law did not change the trend in accident reduction. If we only look at the evidence after the law, we are neglecting the rate at which the event occurred without the law. The base rate of an event is the rate that the event occurs without the potential cause under consideration. To take another example, imagine you start taking cold medicine, and your cold goes away in a week. Did the cold medicine cause your cold to go away? That depends on how long colds normally last and when you took the medicine. In order to determine whether a potential cause had the effect you suspect, do not neglect to compare its putative effects with the effects observed without that cause.
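The comparison at the heart of base rate neglect takes only a few lines of arithmetic. In the following Python sketch, the accident counts are invented to match the speed-limit example above:

```python
# Hypothetical accident counts for the three years before and the three
# years after a speed-limit reduction; all numbers are invented.
before = [1430, 1200, 1000]  # accidents were already falling ~30%
after = [1000, 840, 700]     # and fell ~30% again after the law passed

def pct_change(series):
    return (series[-1] - series[0]) / series[0] * 100

print(f"pre-law trend:  {pct_change(before):+.0f}%")
print(f"post-law trend: {pct_change(after):+.0f}%")
# The post-law drop merely continues the pre-existing trend (the base rate),
# so by itself it is weak evidence that the law caused the reduction.
```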
For more on formal and informal fallacies and over 200 different types with examples, see “Fallacies.”
c. Heuristics and Biases
In the 1960s, psychologists began to suspect there is more to human reasoning than conscious inference. Daniel Kahneman and Amos Tversky confirmed these suspicions with their discoveries that many of the standard assumptions about how humans reason in practice are unjustified. In fact, humans regularly violate these standard assumptions, the most significant for philosophers and economists being the assumption that humans are fairly good at calculating the costs and benefits of their behavior, that is, that they naturally reason according to the dictates of Expected Utility Theory. Kahneman and Tversky showed that, in practice, reasoning is affected by many non-rational influences, such as the wording used to frame scenarios (framing bias) and the information most vividly available to the reasoner (the availability heuristic).
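For readers unfamiliar with Expected Utility Theory, its core calculation is simply a probability-weighted sum of the utilities of outcomes. A minimal sketch with invented numbers:

```python
# Expected utility: weigh each outcome's utility by its probability and sum.
# The options below are invented purely to illustrate the calculation.
def expected_utility(outcomes):
    """outcomes: (probability, utility) pairs whose probabilities sum to 1."""
    return sum(p * u for p, u in outcomes)

sure_thing = [(1.0, 900)]             # receive 900 for certain
gamble = [(0.9, 1000), (0.1, 0)]      # 90% chance of 1000, otherwise nothing

print(expected_utility(sure_thing))   # 900.0
print(expected_utility(gamble))       # 900.0
# The options are equivalent by this standard, yet Kahneman and Tversky
# found that people reliably prefer the sure gain, and that reframing the
# same options as losses reverses the preference.
```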
Consider the difference in your belief about the likelihood of getting robbed before and after seeing a news report about a recent robbery, or the difference in your belief about whether you will be bitten by a shark the week before and after Discovery Channel’s “Shark Week.” Most of us regard these events as more likely after we have seen them on television than before. Objectively, they are no more or less likely to happen regardless of whether we see them on television, but we perceive them as more likely because their possibility is more vivid to us. These are examples of the availability heuristic.
Since the 1960s, experimental psychologists and economists have conducted extensive research revealing dozens of these unconscious reasoning processes, including ordering bias, the representativeness heuristic, confirmation bias, attentional bias, and the anchoring effect. The field of behavioral economics, made popular by Dan Ariely (2008; 2010; 2012) and Richard Thaler and Cass Sunstein (2009), emerged from and contributes to heuristics and biases research and applies its insights to social and economic behaviors.
Ideally, recognizing and understanding these unconscious, non-rational reasoning processes will help us mitigate their undermining influence on our reasoning abilities (Gigerenzer, 2003). However, it is unclear whether we can simply choose to overcome them or whether we have to construct mechanisms that mitigate their influence (for instance, using double-blind experiments to prevent confirmation bias).
6. The Scope and Virtues of Good Reasoning
Whether the process of critical thinking is productive for reasoners—that is, whether it actually answers the questions they are interested in answering—often depends on a number of linguistic, psychological, and social factors. We encountered some of the linguistic factors in §1. In closing, let us consider some of the psychological and social factors that affect the success of applying the tools of critical thinking.
a. Context
Not all psychological and social contexts are conducive for effective critical thinking. When reasoners are depressed or sad or otherwise emotionally overwhelmed, critical thinking can often be unproductive or counterproductive. For instance, if someone’s child has just died, it would be unproductive (not to mention cruel) to press the philosophical question of why a good God would permit innocents to suffer or whether the child might possibly have a soul that could persist beyond death. Other instances need not be so extreme to make the same point: your company’s holiday party (where most people would rather remain cordial and superficial) is probably not the most productive context in which to debate the president’s domestic policy or the morality of abortion.
The process of critical thinking is primarily about detecting truth, and truth may not always be of paramount value. In some cases, comfort or usefulness may take precedence over truth. The case of the loss of a child is a case where comfort seems to take precedence over truth. Similarly, consider the case of determining what the speed limit should be on interstate highways. Imagine we are trying to decide whether it is better to allow drivers to travel at 75 mph or to restrict them to 65. To be sure, there may be no fact of the matter as to which is morally better, and there may not be any difference in the rate of interstate deaths between states that set the limit at 65 and those that set it at 75. But given the nature of the law, a decision about which speed limit to set must be made. If there is no relevant difference between setting the limit at 65 and setting it at 75, critical thinking can only tell us that, not which speed limit to set. This shows that, in some cases, concern with truth gives way to practical or preferential concerns (for example, Should I make this decision on the basis of what will make citizens happy? Should I base it on whether I will receive more campaign contributions from the business community?). All of this suggests that critical thinking is most productive in contexts where participants are already interested in truth.
b. The Principle of Charity/Humility
Critical thinking is also most productive when people in the conversation regard themselves as fallible, subject to error, misinformation, and deception. The desire to be “right” has a powerful influence on our reasoning behavior. It is so strong that our minds bias us in favor of the beliefs we already hold even in the face of disconfirming evidence (a phenomenon known as “confirmation bias”). In his famous article, “The Ethics of Belief” (1877), W. K. Clifford notes that, “We feel much happier and more secure when we think we know precisely what to do, no matter what happens, than when we have lost our way and do not know where to turn. … It is the sense of power attached to a sense of knowing that makes men desirous of believing, and afraid of doubting” (2010: 354).
Nevertheless, when we are open to the possibility that we are wrong, that is, if we are humble about our conclusions and we interpret others charitably, we have a better chance at having rational beliefs in two senses. First, if we are genuinely willing to consider evidence that we are wrong—and we demonstrate that humility—then we are more likely to listen to others when they raise arguments against our beliefs. If we are certain we are right, there would be little reason to consider contrary evidence. But if we are willing to hear it, we may discover that we really are wrong and give up faulty beliefs for more reasonable ones.
Second, if we are willing to be charitable to arguments against our beliefs, then if our beliefs are unreasonable, we have an opportunity to see the ways in which they are unreasonable. On the other hand, if our beliefs are reasonable, then we can explain more effectively just how well they stand against the criticism. This is weakly analogous to competition in certain types of sporting events, such as basketball. If you only play teams that are far inferior to your own, you do not know how good your team really is. But if you can beat a well-respected team on fair terms, any confidence you have is justified.
c. The Principle of Caution
In our excitement over good arguments, it is easy to overextend our conclusions, that is, to infer statements that are not really warranted by our evidence. From an argument for a first, uncaused cause of the universe, it is tempting to infer the existence of a sophisticated deity such as that of the Judeo-Christian tradition. From an argument for the compatibilism of the free will necessary for moral responsibility and determinism, it is tempting to infer that we are actually morally responsible for our behaviors. From an argument for negative natural rights, it is tempting to infer that no violation of a natural right is justifiable. Therefore, it is prudent to continually check our conclusions to be sure they do not include more content than our premises allow us to infer.
Of course, the principle of caution must itself be used with caution. If applied too strictly, it may lead reasoners to suspend all belief, and refrain from interacting with one another and their world. This is not, strictly speaking, problematic; ancient skeptics, such as the Pyrrhonians, advocated suspending all judgments except those about appearances in hopes of experiencing tranquility. However, at least some judgments about the long-term benefits and harms seem indispensable even for tranquility, for instance, whether we should retaliate in self-defense against an attacker or whether we should try to help a loved one who is addicted to drugs or alcohol.
d. The Expansiveness of Critical Thinking
The importance of critical thinking cannot be overstated because its relevance extends into every area of life, from politics, to science, to religion, to ethics. Not only does critical thinking help us draw inferences for ourselves, it helps us identify and evaluate the assumptions behind statements, the moral implications of statements, and the ideologies to which some statements commit us. This can be a disquieting and difficult process because it forces us to wrestle with preconceptions that might not be accurate. Nevertheless, if the process is conducted well, it can open new opportunities for dialogue, sometimes called “critical spaces,” that allow people who might otherwise disagree to find beliefs in common from which to engage in a more productive conversation.
It is this possibility of creating critical spaces that allows philosophical approaches like Critical Theory to effectively challenge the way social, political, and philosophical debates are framed. For example, if a discussion about race or gender or sexuality is framed in terms that, because of the origins of those terms or the way they have functioned socially, alienate or disproportionately exclude certain members of the population, then critical space is necessary for being able to evaluate that framing so that a more productive dialogue can occur (see Foresman, Fosl, and Watson, 2017, ch. 10 for more on how critical thinking and Critical Theory can be mutually supportive).
e. Productivity and the Limits of Rationality
Despite the fact that critical thinking extends into every area of life, not every important aspect of our lives is easily or productively subjected to the tools of language and logic. Thinkers who are tempted to subject everything to the cold light of reason may discover they miss some of what is deeply enjoyable about living. The psychologist Abraham Maslow writes, “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail” (1966: 16). But it is helpful to remember that language and logic are tools, not the projects themselves. Even formal reasoning systems depend on axioms that cannot be justified from within those systems themselves (consider Euclidean geometry or Peano arithmetic). We must make some decisions about what beliefs to accept and how to live our lives on the basis of considerations outside of critical thinking.
Borrowing an example from William James (1896), consider the statement, “Religion X is true.” James says that, while some people find this statement interesting, and therefore, worth thinking critically about, others may not be able to consider the truth of the statement. For any particular religious tradition, we might not know enough about it to form a belief one way or the other, and even suspending judgment may be difficult, since it is not obvious what we are suspending judgment about.
If I say to you: ‘Be a theosophist or be a Mohammedan,’ it is probably a dead option, because for you neither hypothesis is likely to be alive. But if I say: ‘Be an agnostic or be a Christian,’ it is otherwise: trained as you are, each hypothesis makes some appeal, however small, to your belief (2010: 357).
Ignoring the circularity in his definition of “dead option,” James’s point seems to be that if you know nothing about a view or what statements it entails, no amount of logic or evidence could help you form a reasonable belief about that position.
We might criticize James at this point because his conclusion seems to imply that we have no duty to investigate dead options, that is, to discover if there is anything worth considering in them. If we are concerned with truth, the simple fact that we are not familiar with a proposition does not mean it is not true or potentially significant for us. But James’s argument is subtler than this criticism suggests. Even if you came to learn about a particularly foreign religious tradition, its tenets may be so contrary to your understanding of the world that you could not entertain them as possible beliefs of yours. For instance, you know perfectly well that, if some events had been different, Hitler would not have existed: his parents might have had no children, or his parents’ parents might have had no children. You know roughly what it would mean for Hitler not to have existed and the sort of events that could have made it true that he did not exist. But how much evidence would it take to convince you that, in fact, Hitler did not exist, that is, that your belief that Hitler did exist is false? Could there be an argument strong enough? Not obviously. Since all the information we have about Hitler unequivocally points to his existence, any arguments against that belief would have to affect a very broad range of statements; they would have to be strong enough to make us skeptical of large parts of reality.
7. Approaches to Improving Reasoning through Critical Thinking
Recall that the goal of critical thinking is not just to study what makes reasons and statements good, but to help us improve our ability to reason, that is, to improve our ability to form, hold, and discard beliefs according to whether they meet the standards of good thinking. Some ways of approaching this latter goal are more effective than others. While the classical approach focuses on technical reasoning skills, the Paul/Elder model encourages us to think in terms of critical concepts, and rationality approaches use empirical research on instances of poor reasoning to help us improve reasoning where it is least obvious that we need it, and where we need it most. Which approach or combination of approaches is most effective depends, as noted above, on the context and limits of critical thinking, but also on scientific evidence of their effectiveness. Those who teach critical thinking, of all people, should be engaged with the evidence relevant to determining which approaches are most effective.
a. Classical Approaches
The classic approach to critical thinking follows roughly the structure of this article: critical thinkers attempt to interpret statements or arguments clearly and charitably, and then they apply the tools of formal and informal logic and science, while carefully attempting to avoid fallacious inferences (see Weston, 2008; Walton, 2008; Watson and Arp, 2015). This approach requires spending extensive time learning and practicing technical reasoning strategies. It presupposes that reasoning is primarily a conscious activity, and that enhancing our skills in these areas will improve our ability to reason well in ordinary situations.
There are at least two concerns about this approach. First, it is highly time intensive relative to its payoff. Learning the terminology of systems like propositional and categorical logic and the names of the fallacies, and practicing applying these tools to hypothetical cases requires significant time and energy. And it is not obvious, given the problems with heuristics and biases, whether this practice alone makes us better reasoners in ordinary contexts. Second, many of the ways we reason poorly are not consciously accessible (recall the heuristics and biases discussion in §5c). Our biases, combined with the heuristics we rely on in ordinary situations, can only be detected in experimental settings, and addressing them requires restructuring the ways in which we engage with evidence (see Thaler and Sunstein, 2009).
b. The Paul/Elder Model
Richard Paul and Linda Elder (Paul and Elder, 2006; Paul, 2012) developed an alternative to the classical approach on the assumption that critical thinking is not something that is limited to academic study or to the discipline of philosophy. On their account, critical thinking is a broad set of conceptual skills and habits aimed at a set of standards that are widely regarded as virtues of thinking: clarity, accuracy, depth, fairness, and others. They define it simply as “the art of analyzing and evaluating thinking with a view to improving it” (2006: 4). Their approach, then, is to focus on the elements of thought and intellectual virtues that help us form beliefs that meet these standards.
The Paul/Elder model is made up of three sets of concepts: elements of thought, intellectual standards, and intellectual traits. In this model, we begin by identifying the features present in every act of thought. They use “thought” to mean critical thought aimed at forming beliefs, not just any act of thinking (musing, wishing, hoping, remembering). According to the model, every act of thought involves:
point of view
concepts
purpose
interpretation and inference
implications and consequences
information
assumptions
question at issue
These comprise the subject matter of critical thinking; that is, they are what we evaluate when we think critically. We then engage with this subject matter by subjecting these elements to what Paul and Elder call universal intellectual standards. These are evaluative goals we should be aiming at with our thinking:
clarity
breadth
accuracy
logic
precision
significance
relevance
fairness
depth
While in classical approaches, logic is the predominant means of thinking critically, in the Paul/Elder model, it is put on equal footing with eight other standards. Finally, Paul and Elder argue that it is helpful to approach the critical thinking process with a set of intellectual traits or virtues that dispose us to using elements and standards well.
intellectual humility
intellectual perseverance
intellectual autonomy
confidence in reason
intellectual integrity
intellectual empathy
intellectual courage
fairmindedness
To remind us that these are virtues of thought relevant to critical thinking, they use “intellectual” to distinguish these traits from their moral counterparts (moral integrity, moral courage, and so on).
The aim is that, as we become familiar with these three sets of concepts and apply them in everyday contexts, we become better at analyzing and evaluating statements and arguments in ordinary situations.
Like the classical approach, this approach presupposes that reasoning is primarily a conscious activity, and that enhancing our skills will improve our reasoning. This means that it still lacks the ability to address the empirical evidence that many of our reasoning errors cannot be consciously detected or corrected. It differs from the classical approach in that it gives the technical tools of logic a much less prominent role and places emphasis on a broader, and perhaps more intuitive, set of conceptual tools. Learning these concepts and learning to apply them still requires a great deal of time and energy, though perhaps less than learning formal and informal logic. And these concepts are easier to translate into disciplines outside philosophy. Students of history, psychology, and economics can more easily recognize the relevance of asking questions about an author’s point of view and assumptions than of determining whether the author is making a deductive or inductive argument. The question, then, is whether this approach improves our ability to think better than the classical approach does.
c. Other Approaches
A third approach that is becoming popular is to focus on the ways we commonly reason poorly and then attempt to correct them. This can be called the Rationality Approach, and it takes seriously the empirical evidence (§5c) that many of our errors in reasoning are not due to a lack of conscious competence with technical skills or misusing those skills, but are due to subconscious dispositions to ignore or dismiss relevant information or to rely on irrelevant information.
One way to pursue this approach is to focus on beliefs that are statistically rare or “weird.” These include beliefs of fringe groups, such as conspiracy theorists, religious extremists, paranormal psychologists, and proponents of New Age metaphysics (see Gilovich, 1992; Vaughn and Schick, 2010; Coady, 2012). If we recognize the sorts of tendencies that lead to these controversial beliefs, we might be able to recognize and avoid similar tendencies in our own reasoning about less extreme beliefs, such as beliefs about financial investing, how statistics are used to justify business decisions, and beliefs about which public policies to vote for.
Another way to pursue this approach is to focus directly on the research on error, that is, on the ordinary beliefs about which psychologists and behavioral economists have discovered we reason poorly, and to explore ways of changing how we frame decisions about what to believe (see Nisbett and Ross, 1980; Gilovich, 1992; Ariely, 2008; Kahneman, 2011). For example, in one study, psychologists found that judges issue more convictions just before lunch and the end of the day than in the morning or just after lunch (Danziger, et al., 2011). Given that dockets do not typically organize cases from less significant crimes to more significant crimes, this evidence suggests that something as irrelevant as hunger can bias judicial decisions. Even though hunger has nothing to do with the truth of a belief, knowing that it can affect how we evaluate a belief can help us avoid that effect. This study might suggest something as simple as that we should avoid being hungry when making important decisions. The more we learn about the ways in which our brains use irrelevant information, the better we can organize our reasoning to avoid these mistakes. For more on how decisions can be improved by restructuring them, see Thaler and Sunstein, 2009.
A fourth approach is to take more seriously the role that language plays in our reasoning. Arguments involve complex patterns of expression, and we have already seen how vagueness and ambiguity can undermine good reasoning (§1). The pragma-dialectics approach (or pragma-dialectical theory) is the view that the quality of an argument is not solely or even primarily a matter of its logical structure, but is more fundamentally a matter of whether it is a form of reasonable discourse (Van Eemeren and Grootendorst, 1992). The proponents of this view contend that, “The study of argumentation should … be construed as a special branch of linguistic pragmatics in which descriptive and normative perspectives on argumentative discourse are methodically integrated” (Van Eemeren and Grootendorst, 1995: 130).
The pragma-dialectics approach is a highly technical approach that uses insights from speech act theory, H. P. Grice’s philosophy of language, and the study of discourse analysis. Its use, therefore, requires a great deal of background in philosophy and linguistics. It has an advantage over other approaches in that it highlights social and practical dimensions of arguments that other approaches largely ignore. For example, argument is often public (external), in that it creates an opportunity for opposition, which influences people’s motives and psychological attitudes toward their arguments. Argument is also social in that it is part of a discourse in which two or more people try to arrive at an agreement. Argument is also functional; it aims at a resolution that can only be accommodated by addressing all the aspects of disagreement or anticipated disagreement, which can include public and social elements. Argument also has a rhetorical role (dialectical) in that it is aimed at actually convincing others, which may have different requirements than simply identifying the conditions under which they should be convinced.
These four approaches are not mutually exclusive. All of them presuppose, for example, the importance of inductive reasoning and scientific evidence. Their distinctions turn largely on which aspects of statements and arguments should take precedence in the critical thinking process and on what information will help us have better beliefs.
8. References and Further Reading
Ariely, Dan. 2008. Predictably Irrational: The Hidden Forces that Shape Our Decisions. New York: Harper Perennial.
Ariely, Dan. 2010. The Upside of Irrationality. New York: Harper Perennial.
Ariely, Dan. 2012. The (Honest) Truth about Dishonesty. New York: Harper Perennial.
Aristotle. 2002. Categories and De Interpretatione, J. L. Ackrill, editor. Oxford: Oxford University Press.
Clifford, W. K. 1877/2010. “The Ethics of Belief.” In Nils Ch. Rauhut and Robert Bass, eds., Readings on the Ultimate Questions: An Introduction to Philosophy, 3rd ed. Boston: Prentice Hall, 351-356.
Chomsky, Noam. 1957/2002. Syntactic Structures. Berlin: Mouton de Gruyter.
Coady, David. 2012. What To Believe Now: Applying Epistemology to Contemporary Issues. Malden, MA: Wiley-Blackwell.
Danziger, Shai, Jonathan Levav, and Liora Avnaim-Pesso. 2011. “Extraneous Factors in Judicial Decisions.” Proceedings of the National Academy of Sciences of the United States of America. Vol. 108, No. 17, 6889-6892. doi: 10.1073/pnas.1018033108.
Foresman, Galen, Peter Fosl, and Jamie Carlin Watson. 2017. The Critical Thinking Toolkit. Malden, MA: Wiley-Blackwell.
Fogelin, Robert J. and Walter Sinnott-Armstrong. 2009. Understanding Arguments: An Introduction to Informal Logic, 8th ed. Belmont, CA: Wadsworth Cengage Learning.
Gigerenzer, Gerd. 2003. Calculated Risks: How To Know When Numbers Deceive You. New York: Simon and Schuster.
Gigerenzer, Gerd, Peter Todd, and the ABC Research Group. 2000. Simple Heuristics that Make Us Smart. Oxford University Press.
Gilovich, Thomas. 1992. How We Know What Isn’t So. New York: Free Press.
James, William. 1896/2010. “The Will to Believe.” In Nils Ch. Rauhut and Robert Bass, eds., Readings on the Ultimate Questions: An Introduction to Philosophy, 3rd ed. Boston: Prentice Hall, 356-364.
Kahneman, Daniel. 2011. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell.
Lipton, Peter. 2004. Inference to the Best Explanation, 2nd ed. London: Routledge.
Maslow, Abraham. 1966. The Psychology of Science: A Reconnaissance. New York: Harper & Row.
Mill, John Stuart. 2011. A System of Logic, Ratiocinative and Inductive. New York: Cambridge University Press.
Nisbett, Richard and Lee Ross. 1980. Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice Hall.
Paul, Richard. 2012. Critical Thinking: What Every Person Needs to Survive in a Rapidly Changing World. Tomales, CA: The Foundation for Critical Thinking.
Paul, Richard and Linda Elder. 2006. The Miniature Guide to Critical Thinking Concepts and Tools, 4th ed. Tomales, CA: The Foundation for Critical Thinking.
Plantinga, Alvin. 1974. The Nature of Necessity. Oxford: Clarendon Press.
Prior, Arthur. 1957. Time and Modality. Oxford, UK: Oxford University Press.
Prior, Arthur. 1967. Past, Present and Future. Oxford, UK: Oxford University Press.
Prior, Arthur. 1968. Papers on Time and Tense. Oxford, UK: Oxford University Press.
Quine, W. V. O. and J. S. Ullian. 1978. The Web of Belief, 2nd ed. New York: McGraw-Hill.
Russell, Bertrand. 1940/1996. An Inquiry into Meaning and Truth, 2nd ed. London: Routledge.
Thaler, Richard and Cass Sunstein. 2009. Nudge: Improving Decisions about Health, Wealth, and Happiness. New York: Penguin Books.
van Eemeren, Frans H. and Rob Grootendorst. 1992. Argumentation, Communication, and Fallacies: A Pragma-Dialectical Perspective. London: Routledge.
van Eemeren, Frans H. and Rob Grootendorst. 1995. “The Pragma-Dialectical Approach to Fallacies.” In Hans V. Hansen and Robert C. Pinto, eds. Fallacies: Classical and Contemporary Readings. Penn State University Press, 130-144.
Vaughn, Lewis and Theodore Schick. 2010. How To Think About Weird Things: Critical Thinking for a New Age, 6th ed. McGraw-Hill.
Walton, Douglas. 2008. Informal Logic: A Pragmatic Approach, 2nd ed. New York: Cambridge University Press.
Watson, Jamie Carlin and Robert Arp. 2015. Critical Thinking: An Introduction to Reasoning Well, 2nd ed. London: Bloomsbury Academic.
Weston, Anthony. 2008. A Rulebook for Arguments, 4th ed. Indianapolis: Hackett.
Zadeh, Lotfi. 1965. “Fuzzy Sets and Systems.” In J. Fox, ed., System Theory. Brooklyn, NY: Polytechnic Press, 29-39.
Author Information
Jamie Carlin Watson
Email: jamie.c.watson@gmail.com
University of Arkansas for Medical Sciences
U. S. A.
Empirical Aesthetics
Empirical aesthetics is a research area at the intersection of psychology and neuroscience that aims to understand how people experience, evaluate, and create objects aesthetically. Its two central questions are: How do we experience beauty? How do we experience art? In practice, this means that empirical aesthetics studies (1) prototypically aesthetic responses, such as beauty or chills, and (2) responses to prototypically aesthetic objects, such as paintings and music. Empirical aesthetics also encompasses broader questions about other aesthetic experiences, such as ugliness and the sublime, and about how we create art. The field of empirical aesthetics aims to understand how such aesthetic experiences and behaviors emerge and unfold. To do so, researchers in the field link the observer’s characteristics to her responses, link the object’s properties to the observer’s responses, or describe an interaction between them. As a science, empirical aesthetics relies heavily on the analysis and interpretation of data. Data are primarily generated from experiments: Researchers conduct studies in which they manipulate one or more independent variables in order to observe the effect of those manipulations on one or more dependent variables. In addition, empirical aesthetics relies on observational data, where people’s behavior is observed or surveyed without the introduction of manipulations.
Empirical aesthetics is as old as empirical psychology. The first thorough written account dates back to Gustav Fechner, who published Vorschule der Aesthetik in 1876. Nonetheless, the modern field of empirical aesthetics can be considered rather young. Its gain in popularity in the 21st century can be linked to the emergence of neuroaesthetics—the study of brain responses associated with aesthetic experiences—in the late 1990s. Contemporary empirical aesthetics studies aesthetic experiences with a variety of methods, including brain-imaging and measures of other physiological responses, such as the movements of the eyes and facial muscles.
1. History of Empirical Aesthetics
a. Gustav Fechner’s Vorschule der Aesthetik
The first comprehensive treatise on what has become known as “empirical aesthetics” is Gustav Fechner’s two-volume Vorschule der Aesthetik (VdA), published in 1876. The first volume primarily contains descriptions of six aesthetic principles, as posited by Fechner himself; notable, too, is the last chapter, on taste. The second volume contains a large section on art as well as descriptions of a further seven aesthetic principles.
The main purpose of the book is to demonstrate that aesthetic experiences, primarily aesthetic pleasure or beauty, can be studied empirically just like any other form of perception. Fechner called this empirical approach to aesthetics one “from below” and distinguished it clearly from the philosophical approach “from above.” The basic distinction made is the following: Aesthetics from below observes individual cases of aesthetic responses and infers the laws that govern all of these responses from the pattern that crystallizes across individual cases. Aesthetics from above, in contrast, posits general laws and infers from those what an individual aesthetic response should look like. While the VdA itself only contains data and descriptions of a few experiments, Fechner’s descriptions of the proposed laws clearly focus on their observable effects, implying that they can be documented in an experiment.
The direct impact of the VdA on modern empirical aesthetics remains modest. This may be because it has never been published in translation, or it may reflect a general reluctance to cite early work in empirical aesthetics. It is, however, well known and cited as the first major work on aesthetics by an empirical psychologist. Of its contents, only one experiment commonly serves as a reference, probably because the associated report is available in English. This experiment examined the effect of a rectangle’s aspect ratio on aesthetic preference. Famously, Fechner found that his participants most often named the rectangle with an aspect ratio equivalent to the golden section (1:1.618) as the one they liked best. What is less well known is that Fechner himself was critical of this finding and reported an equal preference for the square ratio (1:1) among a population of blue-collar workers. His main worry about the findings concerned the potential influence of associations on the result, specifically that participants did not merely judge the rectangular form but also its resemblance to the familiar shapes of envelopes, books, and so on.
b. Empirical Aesthetics in the 20th Century
After the pioneering days of Gustav Fechner and his colleagues, psychology (and philosophy) went through an era known as Behaviorism. Behaviorism effectively claimed that psychology, as a science, can only study observable behavior. Research on inner states and subjective experiences, which form the core interest of aesthetics, was shunned. This did not deter researchers like Edgar Pierce, Oswald Külpe, Lillien Jane Martin, Robert Ogden, Kate Gordon, and many others from continuing the study of people’s preferences for visual design, art, color, and particularly individual differences in such preferences.
Most of the work on empirical aesthetics in the early and mid-20th century has not had a remarkable impact on the field. Worth mentioning, however, is the early work of Edward Thorndike, and later of Hans Eysenck, on individual differences in aesthetic responsiveness and creativity. Most other studies during this time period focused on determining what kinds of object properties—specifically consonance and dissonance of tones, as well as colors—are most rewarding to specific groups of people.
Another notable exception to the mostly forgotten early research on aesthetics is Rudolf Arnheim’s work. He looked at aesthetic experiences through the lens of Gestalt psychology’s principles of organization: balance, symmetry, composition, and dynamical complexity (the trade-off between order and complexity). Arnheim saw aesthetic experiences as a product of the integration of two sources of information: the structural configuration of the image and the particular focus of the viewer, which depends on her experience and disposition. One should also note that the writings of the art historian and critic Ernst Gombrich during the same time period have informed modern empirical aesthetics.
A look at the institutional level also reveals that empirical aesthetics continued to evolve during the 20th century. A division of Esthetics was among the American Psychological Association’s (APA) 19 charter divisions. This 10th division of the APA was renamed “Psychology and the Arts” in 1965. Its size was modest then, relative to other divisions, and has stayed so throughout the years.
c. First Renaissance: Berlyne’s Aesthetics and Psychobiology
After what in retrospect appears like a relative drought during behaviorism, empirical aesthetics re-emerged with Daniel Berlyne and the foundation of the International Association of Empirical Aesthetics (IAEA). The IAEA was founded at the first international congress in Paris in 1965 by Daniel Berlyne (University of Toronto, Canada), Robert Francès (Université de Paris, France), Carmelo Genovese (Università di Bologna, Italy), and Albert Wellek (Johann-Gutenberg-Universität Mainz, Germany).
The most visible effort to establish the new experimental aesthetics is the book Studies in the New Experimental Aesthetics, edited by Berlyne (1974), which contains a collection of study reports, many of them conducted by Berlyne himself. In addition, Berlyne had earlier published Aesthetics and Psychobiology (1971), which is often cited as the main reference for his hypotheses on the relationship between object properties and hedonic responses.
Central to Daniel Berlyne’s own ideas on aesthetic experiences is the concept of arousal. Berlyne postulated that arousal is relevant to aesthetics in that an intermediate level of arousal leads to the greatest hedonic response, an inverted-U relationship. Arousal itself is conceptualized as the result of “collative,” psychophysical, and ecological variables. The best-known and most-investigated determinants of arousal are an object’s complexity and novelty. Berlyne’s theory thus links an object’s properties, such as complexity, to their effects on the observer (arousal) and then to the aesthetic response (pleasantness, liking). The concreteness of the proposed links and variables has led many researchers to test his theory. The results have been mixed at best, and Berlyne’s arousal theory of aesthetic appreciation has therefore been mostly abandoned.
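Berlyne’s hypothesized inverted-U can be pictured with a toy function in which hedonic value peaks at an intermediate arousal level. The quadratic below is an arbitrary stand-in chosen only for its shape; Berlyne proposed the shape of the curve, not this particular equation.

```python
# A toy inverted-U: hedonic value peaks at an intermediate arousal level.
# The quadratic form and the optimum of 5 are arbitrary illustrative choices.
def hedonic_value(arousal, optimum=5.0):
    return -(arousal - optimum) ** 2

for arousal in [1, 3, 5, 7, 9]:
    print(arousal, hedonic_value(arousal))
# Values rise toward arousal = 5 and then fall: too little arousal is
# boring, too much is unpleasant.
```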
d. Early Modern Empirical Aesthetics
After Berlyne’s work had again highlighted that aesthetic responses can be studied with the methods and rigor of modern experimental psychology, research and theory development in the field of empirical aesthetics continued slowly but steadily for about another 20 years. This phase of empirical aesthetics was primarily concerned with linking certain stimulus properties (mostly of images) to preference or liking judgments.
One notable theoretical step forward after Berlyne’s Aesthetics and Psychobiology was Colin Martindale’s connectionist model, which viewed aesthetic pleasure as a function of the activation strength of so-called “cognitive units.” Martindale (1988) maintained that “the apprehension of a work of art of any sort will involve activation of cognitive units … the pleasure engendered by a work of art will be a positive monotonic function of how activated this entire ensemble of cognitive units is. The more activated the ensemble of units, the more pleasurable an observer will find the stimulus to be.” Combined with the assumption that more prototypical objects activate their cognitive units more strongly, this led to the hypothesis that more prototypical, meaningful objects are aesthetically preferred. The results of Martindale’s experiments were in line with this view and foreshadowed the development of contemporary theories that emphasize processing fluency and meaningfulness as sources of aesthetic pleasure.
e. Second Renaissance: The Emergence of Neuroaesthetics
The introduction of modern brain-imaging techniques has changed the face of psychology forever. The introduction of functional magnetic resonance imaging (fMRI) to empirical aesthetics was no exception. The first fMRI experiments that focused on aesthetic experiences were conducted in the early 2000s, and the term “neuroaesthetics” subsequently emerged. The boundary between neuroaesthetics and empirical aesthetics has since been blurred, and even studies that are, strictly speaking, not “neuro”-aesthetic (because they do not measure brain activity) are often labeled as such.
Neuroaesthetics in its initial phase asked a simple question: Which area or areas of the brain respond to experiences of beauty? The answer, across a variety of stimuli such as paintings, music, and math equations, seemed to be: the orbitofrontal cortex (OFC). This brain area at the bottom of the front-most part of the brain had previously been associated with the anticipation of various other pleasurable things, such as food and money.
Findings like these spurred one of the questions that still lies at the core of empirical aesthetics: What, if anything, makes aesthetic experiences special? Some scholars, like Martin Skov and Marcos Nadal, are skeptical that they are special at all. They base their view on the findings from neuroscience mentioned above: The signature of brain activity that is linked to intensely pleasurable aesthetic experiences does not seem to differ from the one that is linked to other pleasures, such as food or winning money. Others continue to make the case for aesthetic experiences being special. For instance, Winfried Menninghaus and his colleagues argue that “aesthetic emotions” are distinct from everyday emotions in that they always relate to an aesthetic evaluation, an aesthetic virtue, and a notion of pleasure, and in that they predict liking. This debate about whether and how aesthetic experiences are special persists and has been spurred by the first findings of studies in neuroaesthetics. At the same time, the debate is not a new one; it is present in the writings of intellectuals such as William James and George Santayana.
f. Empirical Aesthetics in the 21st Century
Empirical aesthetics embraces all the different approaches that have shaped its history. Both theoretical and empirical work follow a multi-methodological approach that takes stimulus properties, observer characteristics, environmental conditions, and neurological processes into account. The amount of empirical data and reports is rapidly growing.
Empirical aesthetics in the 21st century continues to work on and clarify research questions present since its beginnings. For instance, Marco Bertamini and his colleagues clarified in 2016 that the preference for objects with curved contours is, in fact, due to an increased liking of roundness and not merely a dislike of angularity. At the same time, the field also adds new research questions to its agenda, notably the question of the generalizability of previous findings beyond the laboratory setting. The emergence of tablets, portable eye trackers, and EEG systems has greatly facilitated data collection in natural environments, such as museums. In addition, virtual reality enables more controlled experiments in settings that at least simulate environments more naturalistic than the isolated laboratory cubicle.
On an institutional level, the importance of empirical aesthetics has been acknowledged in the form of new institutions with an explicit focus on the field. Among them are the Max Planck Institute for Empirical Aesthetics in Frankfurt, Germany, founded in 2012; the Penn Center for Neuroaesthetics in Pennsylvania, USA, founded in 2019; and the MSc program in Arts, Neuroaesthetics and Creativity at Goldsmiths, University of London, which started in 2018.
On the level of theory development, several models of art appreciation have emerged in the 21st century. One of the most cited models was developed by Helmut Leder in 2004 and later modified by him and Marcos Nadal (see Further Readings). The most comprehensive model developed so far is the Vienna integrated model of top-down and bottom-up processes in art perception (VIMAP) that was mainly proposed by Matthew Pelowski and also co-authored by Helmut Leder as well as other members of their research group. It is worth noting that both of these, as well as many other theoretical models, focus on visual art.
Theories about the aesthetic appreciation of music have been developed independently from those about visual arts. Since the late 2010s, the idea that music is liked because it affords the right balance between predictability and surprise has become popular. It relies on the notion of predictive coding, the view that our perceptual system constantly tries to predict what it will encounter next, and that it updates its predictions based on the observed differences between prediction and reality. This difference is called prediction error. The main thesis of the predictive coding view is that small prediction errors and/or a decrease of prediction errors over time are rewarding. In other words, we are hard-wired to enjoy the process of learning, and aesthetic experiences are but one kind of experience that enables us to do so.
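In its simplest form, this story can be rendered as an error-driven update rule in which prediction error shrinks as a pattern becomes familiar. The Python sketch below is a strong simplification: the pattern, the learning rate, and the identification of reward with error reduction are illustrative assumptions, not any published model.

```python
# Minimal predictive-coding sketch: predict each note of a repeating
# pattern, measure the prediction error, and update toward what was heard.
pattern = [0.0, 0.5, 1.0, 0.5, 0.0, 0.5, 1.0, 0.5]  # one cycle of a contour
predictions = [0.0] * len(pattern)                  # initial expectations
rate = 0.4
errors = []

for repetition in range(4):                # hear the pattern four times
    for i, observed in enumerate(pattern):
        error = observed - predictions[i]
        errors.append(abs(error))
        predictions[i] += rate * error     # nudge expectation toward input

n = len(pattern)
print(sum(errors[:n]) / n, sum(errors[-n:]) / n)  # mean error, first vs. last pass
# Prediction error shrinks as the pattern becomes familiar; on this view,
# that decrease is itself rewarding -- we enjoy becoming better predictors.
```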
A predictive coding account of visual aesthetic experiences has likewise been formulated by Sander Van de Cruys and Johan Wagemans. Similar ideas have also long dominated views on creativity, art, and the experience of beauty in the computer sciences, based on a model developed by Juergen Schmidhuber in the 1990s. To date, however, Schmidhuber’s ideas are little known to psychologists and neuroscientists, and his theory remains largely uncited in the empirical aesthetics literature. This may change as interdisciplinary collaborations between psychologists, neuroscientists, and computer scientists become more frequent.
2. Subfields of Empirical Aesthetics
a. Perceptual (Visual) Aesthetics
Empirical aesthetics was pioneered by the psychophysicist Gustav Fechner. Psychophysics is the study of the relation between stimulus properties and human perception. Whilst applicable to all senses, most of the psychophysics research (in humans) has focused on vision. True to its roots, most of the past research on empirical aesthetics has also focused on which stimulus properties lead to particular aesthetic perceptions and judgments, and most of it concerns visual object properties.
Most work on perceptual aesthetics aims to uncover which stimulus properties are, on average, liked most. The best-supported findings along these lines are that curvature is preferred to angularity and that symmetry is preferred over asymmetry. In addition, people show a preference for average as opposed to unusual objects, in particular for faces. In the realm of color, green-blue hues are liked better than yellow ones in the great majority of the world. Contrary to popular belief, a preference for the golden ratio has not found empirical support.
Apart from the above-listed widely supported findings, researchers in empirical aesthetics are studying a diverse range of other visual object properties that are hypothesized to be linked to aesthetic preference. Among these, the spatial frequency distribution has been of particular interest. The spatial frequency distribution of an image measures the relative presence of sharp versus blurry contours: high spatial frequencies correspond to sharp edges and low spatial frequencies to blurry ones. Some evidence shows that art images whose spatial frequency distribution mimics the one found in nature photography are preferred.
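One common way to summarize an image’s spatial frequency distribution is the slope of its radially averaged Fourier power spectrum on log-log axes; natural scenes typically yield slopes near -2. The following is a minimal sketch of that computation, assuming a grayscale image stored as a 2-D NumPy array; it illustrates the general approach rather than any specific published pipeline.

```python
# Minimal sketch: slope of the radially averaged Fourier power spectrum.
# Assumes a 2-D grayscale NumPy array; natural scenes give slopes near -2.
import numpy as np

def spectral_slope(image):
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    h, w = image.shape
    y, x = np.indices((h, w))
    radius = np.hypot(x - w // 2, y - h // 2).astype(int)
    # Average power within each integer frequency band (skip DC at radius 0).
    n_bands = radius.max()
    band_power = np.bincount(radius.ravel(), weights=power.ravel())
    band_counts = np.bincount(radius.ravel())
    radial_mean = band_power[1:n_bands] / band_counts[1:n_bands]
    freqs = np.arange(1, n_bands)
    slope, _ = np.polyfit(np.log(freqs), np.log(radial_mean), 1)
    return slope

rng = np.random.default_rng(0)
print(spectral_slope(rng.standard_normal((128, 128))))  # white noise: slope near 0
```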
Researchers also investigate how fractal dimensionality influences aesthetic preferences. Fractal dimensionality refers to the degree to which the same pattern repeats itself on a smaller scale within the same image. An image of tree branches, for instance, has a high fractal dimensionality because the same pattern of twigs crossing one another is repeated in a similar way, no matter how far one ‘zooms into’ the overall image. In contrast, an image of differently shaped clouds in the sky has a lower fractal dimensionality because the visible pattern changes considerably depending on how far one ‘zooms into’ the image. Fractal dimensionality studies have revealed a certain intra-individual stability in preference for relatively high or low fractal dimensionality across different stimuli and even sensory domains.
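Fractal dimensionality is commonly estimated by box counting: cover the image with grids of ever-smaller boxes and track how the number of occupied boxes grows as box size shrinks. Below is a minimal sketch for a binarized image (a NumPy boolean mask); published studies use refinements of this basic idea.

```python
# Minimal box-counting sketch. Assumes a non-empty boolean mask in which
# True marks foreground pixels (e.g., the edges or ink of an image).
import numpy as np

def box_counting_dimension(mask):
    """Fractal dimension as the slope of log N(s) versus log (1/s)."""
    size = min(mask.shape)
    scales = [2 ** k for k in range(1, int(np.log2(size)))]
    counts = []
    for s in scales:
        h, w = mask.shape[0] // s * s, mask.shape[1] // s * s
        # Tile the image into s-by-s boxes and count occupied boxes.
        boxes = mask[:h, :w].reshape(h // s, s, w // s, s).any(axis=(1, 3))
        counts.append(boxes.sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(scales)), np.log(counts), 1)
    return slope

# A filled square is not fractal: its dimension comes out near 2.
print(box_counting_dimension(np.ones((256, 256), dtype=bool)))
```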
Another quantifiable image property that is linked to people’s preferences is so-called image self-similarity. Self-similarity and fractal dimensionality are related constructs but are computed differently. They follow a similar logic in that both compare a cut-out portion of a reference image to that reference image and then take the cut-out portion as the next reference image. Self-similarity can be conceived of as an objective measure of complexity. In that sense, this line of research walks in the footsteps of Berlyne’s ideas. However, it also faces the same problem that Berlyne did. On the one hand, it aims to measure object properties objectively and relate them to people’s aesthetic responses. On the other hand, it also wants to relate these objective measures to people’s immediate subjective impressions. In the case of self-similarity, researchers are interested in how well self-similarity metrics map onto subjectively perceived complexity, and at the same time they use self-similarity as a measure of complexity in order to relate this ‘objective’ complexity metric to subjective aesthetic evaluations. Neither of these relationships—self-similarity to complexity; self-similarity to aesthetic ratings—is a perfect one. Thus, the question of how all possible associations—self-similarity to subjective complexity, self-similarity to subjective rating, or subjective complexity to subjective rating—work together, and what portion of the aesthetic response can truly be attributed to the objective measure alone, remains open.
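The recursive comparison logic described above can be made concrete. The sketch below scores self-similarity as the average histogram overlap between each region and its four quadrants, recursing into each quadrant in turn; it is an illustrative simplification, not one of the published metrics, and assumes pixel intensities normalized to the range [0, 1].

```python
# Illustrative self-similarity sketch (not a published metric). Compares
# each region's intensity histogram to its quadrants, then recurses.
import numpy as np

def histogram_overlap(a, b, bins=16):
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 1.0), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 1.0), density=True)
    return float(np.minimum(ha, hb).sum() / np.maximum(ha, hb).sum())

def self_similarity(image, depth=3):
    """Average overlap between regions and their sub-regions, in [0, 1]."""
    scores = []
    def recurse(region, d):
        if d == 0 or min(region.shape) < 4:
            return
        h, w = region.shape[0] // 2, region.shape[1] // 2
        for q in (region[:h, :w], region[:h, w:], region[h:, :w], region[h:, w:]):
            scores.append(histogram_overlap(region, q))
            recurse(q, d - 1)
    recurse(image, depth)
    return float(np.mean(scores))

rng = np.random.default_rng(0)
print(self_similarity(rng.uniform(size=(128, 128))))  # uniform noise scores high
```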
Other scholars are less concerned with objective stimulus properties and, instead, focus on the relation between different subjective impressions of the same stimulus. Coming back to the example of Berlyne’s hypothesis: Stimulus complexity is omitted (or merely introduced as a means of plausibly altering arousal), and the main relation of interest is the one between subjective arousal and subjective pleasure or liking. Studies that investigate exactly this relation between subjective arousal and aesthetic pleasure have overall been unable to support Berlyne’s claim that intermediate arousal causes the greatest aesthetic pleasure.
However, other relationships between aesthetic ratings have proven stable. Pleasure, liking, and beauty ratings are so closely related to one another that differentiating between them is, empirically speaking, close to impossible. Research on people with a self-reported impairment in the ability to feel pleasure (anhedonia) additionally shows that people who cannot feel pleasure in general, or in response to music, are also much less likely to experience beauty from images or music, respectively. This strong link between pleasure and beauty has also been taken as a further argument for the claim that (hedonic) aesthetic responses are indistinguishable from other hedonic responses (see Neuroaesthetics below).
Study results like the ones above draw a picture of the population average. However, researchers are increasingly aware of and interested in documenting individual differences in aesthetic judgments. They quantify the relative contribution of individual versus shared taste by asking a large number of observers to rate the same set of images, at least some of which are rated several times by the same observer. In this way, they determine what fraction of the rating can be predicted at all (factoring out inconsistencies of the same observer rating the same image) and what fraction can, in turn, be predicted by others’ ratings (shared taste) or not (individual taste). So far, the contribution of shared taste appears smaller than that of individual taste, amounting to about 50% for face attractiveness but a mere 10% for the liking of abstract art.
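The logic of this decomposition can be sketched as follows, assuming a hypothetical data array of shape (observers, images, 2 repetitions); the published analyses use related but more elaborate variance estimators.

```python
# Sketch of the shared-versus-individual taste decomposition described above.
# ratings[o, i, r] = rating by observer o of image i on repetition r.
import numpy as np

def taste_decomposition(ratings):
    n_obs = ratings.shape[0]
    # Test-retest consistency: how predictable is an observer from herself?
    within = np.mean([np.corrcoef(ratings[o, :, 0], ratings[o, :, 1])[0, 1]
                      for o in range(n_obs)])
    # Shared taste: how predictable is an observer from all other observers?
    means = ratings.mean(axis=2)  # observers x images
    others = [np.delete(means, o, axis=0).mean(axis=0) for o in range(n_obs)]
    across = np.mean([np.corrcoef(means[o], others[o])[0, 1]
                      for o in range(n_obs)])
    # Fraction of the predictable portion of ratings that is shared taste.
    return within, across / within
```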
Thus, inter-individual differences in aesthetic responses are prominent. The different explanations for their occurrence can all be summarized under one common concept: prior exposure. The effects of prior exposure to a certain kind of stimulus are sometimes studied in the form of expertise within a certain art realm. Such studies compare the responses of, for example, architects and laypersons to photographs of buildings. Another way of studying effects of prior knowledge is the comparison of how people perceive and evaluate art from their own culture versus a foreign culture. Prior knowledge can also be experimentally introduced by showing the same observer the same image(s) repeatedly and thus making her more and more familiar with the stimulus set. Like the popular claim that the golden ratio is preferred, the notion of a “mere exposure” effect—that repeatedly presented stimuli are liked better—has not found consistent empirical support. A meta-analysis pooling results from more than 250 studies suggests that exposure increases positive evaluations, among them liking and beauty, up to a point after which further exposure becomes detrimental. Across all studies, this point seems to occur after about 30-40 exposures but the number varies depending on the kind of evaluation, other experimental details, and the kind of population studied.
Beyond the concern of understanding inter-individual differences, one of the big goals of empirical aesthetics remains to find processing principles that do generalize across all observers. To some extent, this kind of thinking was already present in Berlyne’s early writings when he posited that intermediate subjective arousal leads to the highest pleasure. Whilst this connection between subjective arousal and pleasure has not found consistent support, the notion of pleasure is still a central one in the quest for a general processing principle in empirical aesthetics. Intense pleasure is associated with intense beauty, high liking, and greater preference. This is true not only in the visual but also in the auditory, gustatory, and tentatively even the olfactory domain. Studies that assess anhedonia, the inability to experience pleasure, also find that this absence of pleasure leads to impoverished beauty judgments.
The great majority of these findings (as well as the ones reported below) were obtained in a laboratory setting or online. That means that people experienced images, sounds, and other objects in a highly controlled setting or on their own devices, almost always via a screen. For those scholars who are primarily interested in people’s responses to art, these settings pose considerable concerns about ecological validity: Can we really infer how someone will experience seeing a grand master’s work in real life from her responses to a miniature replica on a screen in a little cubicle? An entire line of research tries to answer this question and identify the similarities and differences between how people perceive and evaluate art in the laboratory versus in museums or at live performances.
b. Neuroaesthetics
Neuroaesthetics is different from other subfields of empirical aesthetics in terms of its methodology, not its subject. Neuroaesthetics is the science of the neural correlates of aesthetic experiences and behavior—that is, the brain activity and structures associated with them. Researchers use a variety of tools to measure brain activity: functional magnetic resonance imaging (fMRI); electroencephalography (EEG); magnetoencephalography (MEG); and more. In addition, diffusion tensor imaging (DTI) can provide insights into the strength of the anatomical connections between different areas of the brain. Due to fMRI’s relatively poor temporal resolution compared to EEG and MEG, fMRI experiments predominantly use static objects, like images. In contrast, EEG and MEG are popular amongst researchers who are interested in stimuli that dynamically change over time, such as music and film. Neuroaesthetics has also begun to use non-invasive methods for stimulating and suppressing brain activity, such as transcranial direct-current stimulation (tDCS).
One of the best-supported findings from neuroimaging studies in aesthetics is that the experience of intensely pleasurable or beautiful objects increases activity in the reward circuitry of the brain, most notably the orbitofrontal cortex (OFC). Even though different studies with varying kinds of objects presented—such as music, paintings, or stock images—find slightly different patterns of brain activation, increased activation in the OFC is observed in the vast majority of studies. This finding is of great significance because the same network of brain regions is active during the anticipation or reception of various other kinds of rewards, such as food and money.
There is one line of studies that does point towards a difference between intensely pleasurable art experiences and other kinds of rewarding experiences. Edward Vessel and his colleagues find that when people view the most aesthetically moving art images, areas of the brain that are part of the “default mode network” (DMN) are activated. The DMN is usually associated with not being engaged with a stimulus or task, and hence, in the absence of another object, with self-reflection. The co-activation of perceptual-, reward-, and default-mode-regions is therefore unusual. According to these researchers, it is the best contender for explaining what makes aesthetic experiences special. This claim has to be taken with a grain of salt; they have yet to show that this co-activation does not occur during highly moving, non-aesthetic experiences.
Neuroaesthetics is, in principle, also concerned with changes in the different chemical substances involved in brain function, so-called neurotransmitters, associated with aesthetic experiences. In practice, inferences about the contribution of neurotransmitters are only rarely possible from the data, and direct manipulations of their concentrations are even rarer.
c. Music Science
The study of music (and other sounds) in empirical aesthetics deserves separate mention from the research that concerns vision and other sensory modalities. While research on aesthetics in all but the auditory domain is often published and discussed in general outlets, research on music has a number of dedicated journals, such as Psychomusicology. Psychological theories of aesthetics also tend to focus on static stimuli, neglecting many of the variables of interest for those primarily interested in music, specifically those related to dynamic changes of the stimulus over time.
It is most likely because music lends itself to studying changes of percepts over time that the idea of prediction and prediction errors is most prominently present in music science compared to other specialty fields of empirical aesthetics. The intuition is the following: A sequence of tones is liked if the next tone sounds subjectively better than the tone the listener had anticipated. The discrepancy between the expected and actually perceived pleasure (or reward) of an event has been termed “reward prediction error” in reward learning theories. However, this reward prediction error is not the only one being discussed in music science. Some researchers have shown that ‘factual’ prediction errors can also predict how much one likes a sequence of tones. Here, the intuition is that a sequence of tones is liked if the listener is able to make a reasonable prediction about the next tone but is nonetheless surprised by the actually presented tone. From this point of view, people like musical sequences that elicit low uncertainty but at the same time relatively high surprise. Of note for this line of research is that the quantification of its core measures—uncertainty and surprise—can be automated by an algorithm first introduced in the mid-2000s: The Information Dynamics of Music (IDyOM) system provides a statistical learning model that can calculate both uncertainty and surprise scores for each note in a series of standardized musical notes. Its application in contemporary studies has provided results that are in line with a prediction-error account of aesthetic pleasure.
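In information-theoretic terms, surprise is the negative log probability of the note that actually occurs, and uncertainty is the entropy of the model’s predictive distribution just before it occurs. The toy sketch below computes both from a first-order Markov model with add-one smoothing; IDyOM itself uses far more sophisticated variable-order models, so this is only an illustration of the two quantities.

```python
# Toy illustration of per-note surprise and uncertainty (not IDyOM itself):
# a first-order Markov model over pitches, with add-one smoothing.
from collections import Counter, defaultdict
import math

def surprise_and_uncertainty(sequence, alphabet):
    transitions = defaultdict(Counter)
    results = []
    for prev, note in zip(sequence, sequence[1:]):
        counts = transitions[prev]
        total = sum(counts.values()) + len(alphabet)  # add-one smoothing
        surprise = -math.log2((counts[note] + 1) / total)  # information content
        # Uncertainty = entropy of the predictive distribution before the note.
        probs = [(counts[a] + 1) / total for a in alphabet]
        uncertainty = -sum(p * math.log2(p) for p in probs)
        results.append((surprise, uncertainty))
        transitions[prev][note] += 1  # the model learns as the piece unfolds
    return results

melody = ["C", "D", "E", "C", "D", "E", "F", "E", "D", "C"]
for s, u in surprise_and_uncertainty(melody, ["C", "D", "E", "F", "G"]):
    print(f"surprise={s:.2f} bits, uncertainty={u:.2f} bits")
```

Running the toy model on the repeating melody shows surprise dropping for transitions the model has already seen, mirroring the intuition that predictable continuations become less surprising as a piece unfolds.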
Music science is also a special field because there is a unique condition called musical anhedonia. People with musical anhedonia do not feel pleasure from music even though they are able to enjoy other experiences, like food, sex, or visual art, and have normal hearing abilities. This very circumscribed condition has enabled researchers to tackle a number of questions about the development, neural basis, and purpose of the human ability to produce and enjoy music. Contemporary insights from this line of research suggest that the functional connection between brain regions that guide auditory perception and regions that are associated with processing rewards of all kinds is key for music enjoyment. The picture is complicated by the fact that a disruption of this connectivity can lead not only to musical anhedonia but also to an extremely high craving for music, so-called “musicophilia.”
Music science is marked not only by its interest in understanding and predicting what kind of music people like. A considerable research effort also goes into understanding whether and how music can communicate and elicit a wide range of emotions from happiness to anger to sadness. It is a matter of debate whether music elicits the same kind of emotions in the listener as the emotional events that typically provoke them. An additional point of controversy is whether certain structural properties of music, such as the key it is played in, are consistently associated with the identification or experience of a certain emotion. What does seem to be clear, however, is that certain pieces of music are consistently rated as sounding like they express a particular emotion or even a particular narrative. At the same time, playing specific musical pieces has also been shown to be an effective means of changing people’s self-reported happiness, sadness, or anger. The idea that music can serve as a tool for emotion regulation—either by evoking or mitigating emotions in the listener—forms the core of many practical applications of music science.
A last phenomenon that deserves special mention within the realms of music and emotion is the so-called “sad-music paradox”: the phenomenon that people report liking to listen to sad music when they are already in a sad mood themselves. This apparent contradiction between the hedonic tone the music expresses (sad, negative) and the judgment of the listener (liked, positive) poses a problem for those claiming that music elicits in its listener the same genuine emotion that it expresses. The question of why people report liking to listen to sad music when sad has yet to be answered. It is worth noting, though, that little research has to date recorded the actual emotional response of sad people listening to sad versus neutral or happy music.
d. Other Subfields
Even though the three areas of concentration mentioned above represent the majority of the research on aesthetics, a few more distinct areas deserve to be mentioned.
One art form that should not go unmentioned is literature. Aesthetic responses to text are less frequently studied by empirical aesthetics than images or music, even though some scholars occasionally use short poems—often haikus—in their studies. The bias towards very short poetic forms of literature reveals one of the main reasons why the study of aesthetic responses to literature is not as common: It takes time to read a longer passage of text, and empirical scientists ideally want their participants to experience and respond to a large number of different objects. Overall, there is little data available on who likes what kind of written words and why. Arthur M. Jacobs has nonetheless developed a theory of aesthetic responses to literature, called the “Neurocognitive Poetics Model.” It suggests that literary texts can be analyzed along a 4 x 4 matrix, crossing four levels of the text dimension (metric, phonological, morpho-syntactical, and semantic) with four levels of author- and reader-related dimensions (sub-lexical, lexical, inter-lexical, and supra-lexical). Scholars who focus on the investigation of literature are also the ones who, in empirical aesthetics, come closest to addressing the paradox of fiction. The most common way of thinking about it in the field is that readers (or listeners or viewers) empathize with fictive characters and do experience genuine emotions during their aesthetic experience, much as if they witnessed the characters having the depicted experiences in real life. Evidence from neuroimaging studies at least shows that brain structures involved in genuine emotional responses are also involved in processing fictive emotional content. At the same time, it is often presumed that people are nonetheless aware of the fact that they are emotionally responding to fiction, not reality, and that this allows them to distance themselves from any negative responses. Winfried Menninghaus and his colleagues have developed this line of thought into the “distancing-embracing model of the enjoyment of negative emotions in art reception.”
The second art form that deserves mentioning but suffers from being less frequently studied is dance. There have been relatively few studies that experimentally investigated what kinds of dance movements elicit what kinds of aesthetic responses, potentially due to the fact that the production of well-controlled variations of dance sequences is labor-intensive and does not offer the same amount of experimental control as, for instance, the manipulation of static images. There are, however, efforts to link the relatively small and isolated study of dance to the rest of empirical aesthetics by, for instance, linking the variability of the velocity of a dancer’s movements as a measure of complexity to the aesthetic pleasure the dance sequence elicits.
Finally, a related, complex, and dynamic class of experiences that accordingly suffers from the same scarcity of data as dance is movies. Even though some scholars have used sections of movies to study correlations between people’s neural responses to identical but complex stimuli, we know relatively little about people’s aesthetic responses to movies. Self-report studies indicate that aesthetic evaluations of movies are highly idiosyncratic and that laypeople do not seem to agree with critics.
A separate area of research that can fall under the umbrella of empirical aesthetics, broadly conceived, is that of creativity and, closely related to it, the study of art production. This field is interested in creativity both as a personality trait and as an acquired skill or temporary act. How far creativity can be viewed as a stable trait of a person is one of the field’s open questions.
3. Relations to Other Fields
a. Relations to Other Fields of Psychology and Neuroscience
Like most areas of psychology, the boundaries of empirical aesthetics are porous. Aesthetic judgments become a subject of social psychology when they concern people. Aesthetic experiences become a subject of affective psychology when conceived as emotional responses. Evolutionary psychology perspectives have been used to explain various aesthetic preferences and why people create art. Aesthetic preferences have an undeniable effect on decision making. The list can be extended to include any number of subsections of psychology.
One connection between empirical aesthetics and other fields that has received particular emphasis is the one to general neuroscience research on reward processing, learning, and decision making. The link to this area of neuroscience has become apparent starting with the first fMRI studies on aesthetic appreciation, which showed that the same brain regions involved in processing various other kinds of rewards are active while experiencing aesthetically pleasing, beautiful stimuli. Nonetheless, it is rare that decision-making studies use aesthetically pleasing objects as rewards. One major hurdle that seems to prevent the integration of aesthetically pleasing experiences as one kind of reward into the fields of decision-making and reward-learning is the fact that the reward obtained from an aesthetic experience has yet to be expressed in a mathematically tractable way.
Consumer psychology is a field that has established close ties with empirical aesthetics. Product appearance and other aesthetic qualities matter when it comes to our decisions about which products to buy. At first glance, consumer research also seems to tie empirical aesthetics to the field of neuroeconomics and decision-making. In practice, however, there is a marked difference between the application-oriented experimental research that dominates consumer psychology and the theory- and model-oriented research in neuroeconomics and decision-making. Consumer psychology aims to document the effects of certain aesthetic variables in well-defined environments and has thus provided evidence that aesthetic variables do influence decision-making in environments that have relatively high ecological validity. Neuroeconomics and the study of decision-making from the cognitive and computational psychology perspective, in contrast, aim to develop mathematically precise models of decision-making strategies with a particular interest in optimal strategies. This focus necessitates tightly controlled laboratory settings in which aesthetic variables are omitted.
Empirical aesthetics also has a natural connection to social psychology when it comes to the aesthetic evaluation of people, specifically in terms of attractiveness. The implications of such subjective impressions about a person were well documented even before the re-emergence of aesthetics as a part of psychology. The “what is beautiful is good” heuristic and the halo effect are the best-known findings from this line of research. Attractiveness research has partially been conducted without much regard to empirical aesthetics and rather as a natural extension of person evaluation and face perception. However, attractiveness is also increasingly studied by scholars primarily interested in empirical aesthetics. In this sense, the study of facial attractiveness represents a two-way connection between empirical aesthetics and social-cognitive psychology.
Empirical aesthetics connects with evolutionary psychology in at least two major ways. One, evolutionary theories are popular for explaining aesthetic preferences for natural environments, human bodies, and faces. This line of investigation ties together empirical aesthetics, social psychology, and evolutionary psychology with regard to human and non-human mate choice. Two, arguments against the existence of a specialized ‘aesthetic circuitry’ in the brain also rest on an argument of evolutionary implausibility: It seems unlikely that such a circuit should have evolved when existing reward systems would be able to perform the same functions.
b. Relationship to Computational Sciences
Even though it has become increasingly hard to draw a line between computer science, computational (cognitive) psychology, and neuroscience, it is worth mentioning computational approaches separately from other frameworks in psychology and neuroscience when it comes to empirical aesthetics.
For one, attempts to use machine learning algorithms to classify objects (again, mostly images) based on their aesthetic appeal to humans are by no means always related to psychological research. In a broader sense, algorithms that create aesthetically appealing images have been developed for commercial purposes long before scientists started to systematically harness the tools of machine learning to study aesthetics.
Second, a dominant framework for how to think about the engagement with art and beauty emerged in computational sciences in the 1990s—arguably before mainstream psychology and neuroscience had turned computational, too. As briefly discussed in the History section, this framework was most prominently popularized by Juergen Schmidhuber.
Third, computer scientists have an additional interest not only in understanding the appreciation and creation of aesthetic experiences by humans, but also in striving to ‘artificially’ create them. Deep-learning algorithms have become famous for beating master players of chess and Go. Now, computer scientists also try to use them to create art or apply certain art styles onto other images.
c. Relationship to Aesthetics in Philosophy
Like any science, empirical aesthetics is deeply rooted in its philosophical precedents. Even more so than in other sciences, scholars in empirical aesthetics acknowledge this in their writing. Theories and experiments are sometimes partially based on classic philosophical perspectives on aesthetics. Contemporary aesthetic philosophy, however, is rarely mentioned. This disconnect between modern empirical and philosophical aesthetics is mostly due to the fact that the scope of empirical aesthetics remains close to the narrow, older definition of aesthetics as “theory of beauty” and art, whereas aesthetics in modern philosophy has shifted its focus towards new definitions of art, expression, and representation.
During the early days of empirical aesthetics, and psychology as a science in general, the divide between psychology and philosophy used to be less pronounced. Ernst Gombrich and Rudolf Arnheim, for instance, have influenced both fields.
Psychologists and neuroscientists are uninterested in what counts as “art” and what does not count as “art,” in stark contrast to their philosopher colleagues. They are also much less concerned about the normative definition of the aesthetic responses they intend to measure. This core difference between the psychological and philosophical approach to aesthetics is rooted in the diverging goals of the two fields. Empirical aesthetics aims to describe what people deem an aesthetic experience, in contrast to most philosophers who seek to prescribe what should be a proper aesthetic experience.
When it comes to aesthetic emotional responses, the divide between philosophy and psychology lies on a different level. While the paradox of fictional emotions leaves many philosophers doubting that one does indeed show true emotional responses to fictional texts or music, psychologists rarely doubt that people experience emotions when experiencing art in whatever form. On the contrary, a lot of psychological research on emotions relies on the presumption that texts, music, movies, and so on do elicit genuine emotions, since those media are used to manipulate an observer’s emotional state and then study it. And while philosophers question the rationality of these emotions, presuming they exist, psychologists do not ask whether an emotional response is rational. Again, the contrast between philosophy and psychology seems to originate from their different approaches towards people’s self-reports. Psychologists take those self-reports, along with neural and physiological data, as evidence of the state of the world, as their subject of study. Philosophers question how far these self-reports reflect the concept in question. Importantly, however, both philosophers and psychologists still ponder a related question: Are aesthetic emotions special? Or are they the same emotions as any others, which just happen to have been elicited by an aesthetic experience?
The historically important link between morality and aesthetics, especially beauty, in philosophy is rarely made in psychology. Apart from the above-mentioned “what is beautiful is good” phenomenon, psychologists do not usually link aesthetic and moral judgments. However, there is evidence that facial and moral beauty judgments are linked to activation in overlapping brain regions, specifically the orbitofrontal cortex (OFC). It should be noted, though, that this orbitofrontal region is the same one that is more generally implicated in encoding the experienced or anticipated pleasure of objects of many different kinds. In addition, the so-called “Trait Appreciation of Beauty” framework proposed by Rhett Diessner and his colleagues explicitly contains a morality dimension. This framework is, however, not widely used.
The topic of the aesthetic attitude is viewed from very different angles by psychologists and philosophers. In experiments, psychologists use instructions to elicit a change in their participants’ attitudes in order to study the effect of such a change on how objects are perceived. They, for instance, tell people to judge an image’s positivity either as part of a brochure on hygiene behavior or as a piece of abstract art. Research along these lines has found that people do indeed change their evaluations of images depending on whether an image is presented as art or non-art. Neuroaesthetics studies have also investigated whether neural activation during image viewing changes depending on the instructions that are given to people, such as to look at images neutrally or with a detached, aesthetic stance. These researchers have indeed uncovered differences in how and when brain activity changes due to these different instructions.
Psychology has so far stayed silent on the issue of whether aesthetic representations, art, can contribute to knowledge. With the exception of some research on the effect of different modes of representing graphs in the context of data visualization, there does not seem to be an interest in exploring the potential contribution of aesthetic factors to learning. However, the inverse idea—that the potential for learning may be a driving force for seeking out aesthetic experiences—seems to be gaining some traction in empirical aesthetics of the 21st century.
It is worth noting, too, that some philosophers combine theories that were developed in the philosophical tradition with experimental methods. Some of these philosophers conduct this empirical philosophy of aesthetics in collaboration with psychologists. This kind of collaboration is in its infancy but shows promise similar to that seen in the collaboration between moral philosophy and psychology.
4. Controversies
a. Is Aesthetics Special?
The major controversy in empirical aesthetics concerns the very core of its existence: Is there anything special about aesthetic experiences and behaviors that distinguishes them from others? For example: Is the pleasure from looking at the Mona Lisa any different from the pleasure of eating a piece of chocolate? Some scholars argue that the current data show that the same neural reward circuit underlies both experiences. In addition, they argue that it would be evolutionarily implausible for a special aesthetic appreciation network to have evolved, and they point to evidence that even non-mammalian animals exhibit behavior that can be classified as signaling aesthetic preferences. Scholars who take the opposing view argue that aesthetic experiences do have properties that distinguish them from other pleasant experiences, especially in that they can also include unpleasant sensations such as fear or disgust. They also point out the symbolic and communicative function of art, which goes beyond the mere evocation of aesthetic pleasure.
b. What Should Empirical Aesthetics Mainly be About?
Empirical aesthetics as a field is far from having found a consensus about its identity. At the center of an ongoing tension within the field is the relative balance between a focus on the arts, including all kinds of responses associated with them, versus a focus on aesthetic appreciation, including all kinds of objects that can be aesthetically appreciated. It is therefore unsurprising that most research in the past has occurred at the intersection of both topics, which is to say that it has dealt with aesthetic preferences for artworks or for sensory properties that can at least be considered fundamental properties of artworks.
At the two extremes, scholars criticize each other for presenting data that are irrelevant for the field. Proponents of an empirical aesthetics of the arts criticize studies that use stock photographs or image databases like the International Affective Picture System because these kinds of objects supposedly cannot elicit genuine aesthetic responses. Proponents of an empirical aesthetics of appreciation in general criticize studies that use only a narrow selection of certain artworks because these supposedly cannot generalize to a broad enough range of experiences to yield significant insights.
c. Population Averages vs. Individual Differences
Another big controversy in the field has accompanied it since its early beginnings. Should we study population averages or individual differences? This question arises within almost any field in psychology, but it has created a particularly marked division of research approaches within empirical aesthetics. The ones studying population averages and object properties criticize the other side by saying that their subjective measures are fundamentally flawed. The ones focusing on individual differences point out that object properties can often only account for a small proportion of the observed responses.
Most contemporary researchers still operate on the level of understanding and predicting average responses across a pre-defined population, mostly Western-educated, rich populations in industrialized democracies. In contrast to Berlyne, however, this choice is often not based on the conviction that this approach is the best one. It is, instead, often the only feasible one based on the amount of data that can be obtained from a single participant. The fewer data points per participant, the less feasible it is to make substantial claims about individual participants. The very nature of aesthetic experiences and responses—that is, that an object needs to be experienced for a certain time; that judgments may not always be made instantaneously; that one cannot make repeated, independent judgments of the same object; that aesthetic judgments may be compromised as the experiment progresses due to boredom and fatigue; that one cannot assume stability of aesthetic responses over longer delays of days or even hours—complicates the collection of many data points for a single participant.
Still, a paradigm shift seems to be taking place, slowly. In the early 21st century, more and more studies have at least reported to what extent their overall findings generalize across participants, or to what extent aesthetic judgments were driven by individual differences versus common taste. In addition, some have reported stable preferences for certain stimulus characteristics across modalities or object kinds within a given participant.
5. References and Further Reading
Berlyne, D. E. (Ed.). (1974). Studies in the New Experimental Aesthetics: Steps toward an Objective Psychology of Aesthetic Appreciation. Washington, D.C.: Hemisphere Publishing Corporation.
Brielmann, A. A., and Pelli, D. G. (2018). Aesthetics. Current Biology, 28(16), R859-R863.
Brown, S., Gao, X., Tisdelle, L., Eickhoff, S. B., and Liotti, M. (2011). Naturalizing aesthetics: brain areas for aesthetic appraisal across sensory modalities. Neuroimage, 58(1), 250-258.
Chatterjee, A., and Vartanian, O. (2014). Neuroaesthetics. Trends Cogn. Sci., 18, 370–375.
Fechner, G. T. (1876). Vorschule der Aesthetik. Leipzig: Breitkopf & Härtel.
Graf, L. K. M., and Landwehr, J. R. (2015). A dual-process perspective on fluency-based aesthetics: the pleasure-interest model of aesthetic liking. Pers. Soc. Psychol. Rev., 19, 395–410.
Ishizu, T., and Zeki, S. (2011). Toward a brain-based theory of beauty. Plos ONE, 6, e21852.
Leder, H., and Nadal, M. (2014). Ten years of a model of aesthetic appreciation and aesthetic judgments: the aesthetic episode — developments and challenges in empirical aesthetics. Br. J. Psychol., 105, 443–446.
Montoya, R. M., Horton, R. S., Vevea, J. L., Citkowicz, M., and Lauber, E. A. (2017). A re-examination of the mere exposure effect: The influence of repeated exposure on recognition, familiarity, and liking. Psychological Bulletin, 143(5), 459.
Nadal, M., and Ureña, E. (2021). One hundred years of Empirical Aesthetics: Fechner to Berlyne (1876–1976). In M. Nadal and O. Vartanian (Eds.), The Oxford Handbook of Empirical Aesthetics. New York: Oxford University Press. https://psyarxiv.com/c92y7/
Palmer, S. E., Schloss, K. B., and Sammartino, J. (2013). Visual aesthetics and human preference. Annu. Rev. Psychol., 64, 77–107.
Pelowski, M., Markey, P. S., Forster, M., Gerger, G., and Leder, H. (2017). Move me, astonish me… delight my eyes and brain: The Vienna integrated model of top-down and bottom-up processes in art perception (VIMAP) and corresponding affective, evaluative, and neurophysiological correlates. Physics of Life Reviews, 21, 80-125.
We are creatures with clear cognitive limitations. Our memories are finite and there is a limit to the kinds of things we can store and retrieve. We cannot, for example, remember the justification or evidence for many of our beliefs. Moreover, in response to our limited cognitive resources, we generally tend to maintain our beliefs and are reluctant to change them. A clear case of this psychological tendency to preserve beliefs obtains when people are informed of the inadequacy of the original grounds of their beliefs. Their reluctance to change their beliefs shows that they are sensitive to the fact that changing them incurs cognitive costs, thus straining their limited resources.
Certain views in epistemology have sought to put a rational gloss on this phenomenon of belief perseverance by suggesting the thesis of doxastic conservatism, according to which the fact that one believes a proposition provides some measure of justification for that belief. This initial picture has, however, become more complicated by further claims made on behalf of the thesis to the effect that it also has the potential to resolve certain outstanding problems in epistemology (such as how perception is a source of reasons, skeptical worries about induction, and the problem of easy knowledge). Examination of these claims reveals that they involve more than one thesis of conservatism. Moreover, it appears that the epistemic role that is attributed to the conservative thesis is often played by superficially similar claims which derive their epistemic significance not from what the thesis regards as the source of justification but from other substantial properties that are attributed to beliefs. This article presents and examines some of the main accounts of the thesis of doxastic conservatism as well as the arguments that are suggested in their support.
Doxastic conservatism refers to a variety of theses which, in different ways, emphasize the stability of one’s belief system by requiring the subject to refrain from revising his or her beliefs when there are no good reasons for a revision. We are all too familiar with the fact that our undeniable cognitive limitations restrict the set of things we can store or retrieve. Often, we lose track of the justification relations among our beliefs and the reasons behind them, which is why, as well documented experiments have shown, we tend to retain many of our beliefs despite being informed of the inadequacy of their original grounds. Against this background of limited cognitive resources and the costs that the changing of one’s mind incurs, doxastic conservatism (DC) presents itself as a viable blueprint for regulating our belief-forming processes by recommending the adoption of those hypotheses that minimize the revision of our belief system, thereby ensuring its stability. The advocates of DC, however, do not limit its virtues to the minimization of such cognitive costs; they sometimes make ambitious claims on its behalf, highlighting its ability to serve a number of epistemological projects such as the justification of memory beliefs and the resolution of various skeptical problems (McGrath 2007, McCain 2008, and Poston 2012).
However, before proceeding to delineate the contours of the conservatism thesis as well as its purported virtues, an important terminological remark is in order. The conservative thesis that is the subject of this article is generally known in the literature under the rubric of epistemic conservatism. However, since there are several different theses in epistemology all using the same or a similar label, it is best to call it doxastic rather than epistemic conservatism to distinguish it from such theses as phenomenal conservatism and epistemic conservatism, the latter of which is found in the context of the liberalism (dogmatism)/conservatism debate. According to phenomenal conservatism (Huemer 2001), if it seems to you that p, then, in the absence of defeaters, you thereby have at least some degree of justification for believing p. When phenomenal conservatism is restricted to perceptual experience, the thesis is known as dogmatism (Pryor 2000). According to dogmatism (liberalism), perceptual experience (e) gives one justification to believe its content if it appears to one that p and one has no reason to suspect that any skeptical alternative to p is true. So, in this liberal theory, experience on its own can confer justification on the belief in its content. Against this liberal view stands the conservative view, notably defended by Crispin Wright (2004), according to which to be warranted or justified in holding a perceptual belief, we must have some antecedent justification or entitlement to believe certain fundamental presuppositions such as the existence of the world, the reliability of our perceptual system, and so on. With this word of caution out of the way, discussion of conservatism can proceed without the risk of confusing it with similarly labeled doctrines.
Doxastic conservatism has informed philosophical views as diverse as Quine’s and Chisholm’s. According to Quine (1951), belief revision must be subject to a number of conditions, most notably the overall simplicity of the resulting belief system and the need to preserve as many earlier beliefs as possible. For Chisholm (1980), however, conservative principles play an important role in his defense of epistemological foundationalism. Despite the popularity of the conservatism thesis, however, it is difficult to identify a single thesis as representing its content. Sometimes DC is presented as the claim that the holding of a belief confers some positive epistemic status on its content, sometimes it is said to regulate our decision to continue to hold a belief, and sometimes it is said to help us to decide between a number of evidentially equivalent alternative hypotheses. Although it is easy to see a common motivation behind all these different versions of DC, one should not lose sight of their differences, because the considerations that are usually offered in their defense often involve different concerns. This article begins by distinguishing between three main varieties of doxastic conservatism, namely, differential, perseverance, and generation conservatism. It then examines the plausibility of the arguments given in their support. This investigation pays special attention to the alleged payoffs of DC. In particular, it will inquire whether it is DC, on its own, that has such epistemic potentials or whether the latter are the result of other claims that superficially resemble DC.
2. Varieties of Doxastic Conservatism
As noted above, the thesis of doxastic conservatism actually covers a family of views that are all presented as conservative theses. These non-equivalent versions of conservatism differ from each other not only because their advocates often reject one version while upholding another, but also because the arguments that are put forward in their favor are actually tailored to defend one particular version to the exclusion of another. For example, Lawrence Sklar (1975) defends what he calls methodological conservatism, which guides a cognizer who comes to know of a hypothesis that is evidentially equivalent to the one that she has already adopted. However, Sklar rejects another version of conservatism, according to which holding a belief confers some measure of justification on that belief, for being too strong. It is, however, this latter version of conservatism, defended by Chisholm (1980), that is upheld as the standard version of doxastic conservatism.
Another exponent of conservatism, Gilbert Harman (1986), is more concerned with uncovering the principles that regulate our continuing to believe a proposition in the absence of contrary reasons. Although Harman sometimes appeals to the standard version of conservatism, on his official account conservatism is the view according to which “one is justified in continuing fully to accept something in the absence of a special reason not to” (1986, p. 46). Accordingly, to evaluate the thesis of epistemic conservatism, it would be more appropriate to begin by distinguishing the following types of the conservative theses (Vahid 2004):
Differential Conservatism (DiC)
One is justified in holding to a hypothesis (belief) despite coming to know of evidentially equivalent alternatives.
Perseverance Conservatism (PC)
One is justified in continuing to hold a belief as long as there is no special reason to give it up.
Generation Conservatism (GC)
Holding a belief confers some measure of justification on the belief.
When evaluating these theses for their epistemic worth, it should be borne in mind that some of the virtues mentioned in connection with their plausibility are pragmatic in nature. We are told that, due to our cognitive limitations, changing our minds for no good reason is a waste of time, energy, and resources, and that, therefore, following the conservative principles would minimize such costs and prevent our resources from dwindling. Whatever the practical merits of conservatism, such virtues, on their own, due to their pragmatic nature, fail to ensure that epistemically warranted beliefs would result from adherence to the conservative principles as canons of theory choice. Thus, we need to see what exactly makes conservatism an epistemic, rather than a pragmatic, thesis, and this is best achieved by examining the merits of each conservative principle on its own.
3. Differential Conservatism
According to Sklar, “[a]ll [differential conservatism] commits us to is the decision not to reject a hypothesis once believed simply because we become aware of alternative, incompatible hypotheses which are just as good as, but no better than, that believed” (1975, 378). Sklar’s defense of DiC is a sort of transcendental argument involving what he calls an anti-foundationalist “local theory of justification.” Unlike the foundationalist theory of justification, in which basic beliefs depend on no other beliefs for their justification, the local theory takes epistemic justification to be relative to a body of assumed, unchallenged background beliefs. These background beliefs are supposed to play the role of evidence. They can play such a role, says Sklar, because their own status is not at the time under scrutiny. Such an account, however, is consistent with the existence of incompatible total belief structures all being locally justified. The only way to rule out this possibility is to invoke differential conservatism and hold on to what we already believe despite becoming aware of competing belief structures.
This argument, however, leaves one with a lacuna about the status of the background beliefs in the relevant belief structures. In order to confer justification on the target belief, these background beliefs must themselves be justified. However, with Sklar’s rejection of foundationalism, such beliefs can only acquire their epistemic worth either by cohering with the rest of one’s belief system or by relying on GC, according to which the mere holding of a belief confers a positive epistemic status on it. The latter option is not available to Sklar because he rejects GC for being too strong. The only alternative seems to be to adopt a coherence theory of justification and claim that the unchallenged background beliefs acquire their justified status by belonging to a coherent belief system. Collapsing the local theory into a holistic coherence theory of justification, though, renders DiC redundant. It is indeed no accident that, in defending the local theory, Sklar is forced to respond to the alternative coherent systems objection that usually arises for the coherentist accounts of justification. Either way, the transcendental argument is unsuccessful.
Considering the following scenario highlights another problem with DiC. Suppose two subjects, S1 and S2, faced with the task of explaining some body of data (e), come up with two incompatible hypotheses, H1 and H2 respectively, that can equally account for e. Both can be said to be justified in believing their respective hypotheses. Suppose, however, that S2 also learns or independently discovers that H1 equally accounts for e, and for some non-evidential, perhaps aesthetic, reason gives up his previous belief and instead comes to believe H1. By DiC, S2 should have stuck with H2, and his belief in H1 is therefore not justified. If so, we have a case here where two tokens of the same belief (H1) are based on the same grounds, but while one is justified, the other is not, and this undermines the thesis of epistemic supervenience, according to which the justification of a belief supervenes on certain non-epistemic properties of that belief, such as being reliably produced, being adequately grounded, or being part of a coherent belief system. It is worth noting that this problem only affects DiC on non-permissivist views, according to which, for any body of evidence e and any proposition p, there is at most one kind of doxastic attitude towards p that is permitted by e.
One might object that since e equally justifies both H1 and H2, and the two hypotheses cohere equally well with S2’s beliefs, S2 is neither justified in believing H1 nor justified in believing H2 (McCain 2008). Of course, with these further stipulations about the justificatory role of coherence and the strength of evidence, S2 will not be justified. This would be a different conservative thesis (see below for discussion), however, and not the thesis that Sklar defends, which clearly states that one can retain one’s justification for believing a hypothesis despite coming to know of incompatible but evidentially equivalent hypotheses.
Finally, DiC may lack intuitive plausibility. Suppose a subject S comes to believe H1 on the ground that it explains some data e. It is plausible to think that S’s awareness of a competing but equally explanatory hypothesis H2 should require some adjustment of her doxastic attitude towards H1. The thought is that by finding out that H2 equally accounts for e and that H1 and H2 cannot both be correct, S, being a rational agent, should conclude that she may have assessed e inappropriately. Of course, this does not mean that S’s belief that H1 is false. However, given her fallibility, awareness of the second-order evidence regarding the possibility of her inappropriate assessment of e should prompt S to be more circumspect in her attitude towards H1. It follows that the rational credibility of this belief is thereby eroded to some extent. If S continues to come across further competing explanations of e, such as H3, H4, and so forth, the collective evidential impact would become significant enough to substantially undermine the epistemic status of S’s original belief that H1. This conclusion is clearly at odds with DiC’s recommendation that S ought to stick with H1 regardless of the competing hypotheses that she may come across.
Marc Moffett (2007) has argued, however, that awareness of competing but equally explanatory hypotheses—underdetermination—does not constitute a defeater for holding on to our beliefs if we help ourselves to something like GC. In other words, if we accept that merely having a belief constitutes prima facie justification for that belief, then it follows that if one believes that p at t, one should not abandon this belief at t unless one has adequate reason to do so. Accordingly, Moffett denies that knowledge that one’s belief is rationally underdetermined by the evidence undermines our entitlement to that belief, for, given that underdetermination is a widespread phenomenon, we would otherwise be forced to adopt a theoretically neutral psychological standpoint in a great portion of our cognitive endeavors, which is implausible. Apart from the problem that this maneuver is at odds with Sklar’s rejection of GC, it is unclear whether such considerations constitute epistemic, rather than prudential or moral, reasons for holding on to beliefs.
4. Perseverance Conservatism
The driving force behind this version of conservatism, defended most notably by Harman (1986), has been the phenomenon of “lost justification” or “lost evidence.” As noted before, failure to keep track of our evidence, itself the result of our cognitive limitations, is usually taken to explain the so-called phenomenon of “belief perseverance in the face of evidential discrediting.” Experimental results have shown that people exhibit a psychological tendency towards the perseverance of their beliefs when apprised of the unreliability of the original source of those beliefs because they fail to recall that it was the discredited evidence that was initially responsible for their beliefs.
Harman discerns two competing theories of the rationality of belief perseverance, the foundations theory and the coherence theory. The former requires that one have a special reason to continue to hold a particular belief if one is to be justified in that belief, while the latter, by contrast, only requires the absence of any special reason to revise the belief in question for one to be justified in continuing to hold it. Since the foundations theory requires that one keep track of one’s original reasons, Harman concludes that, in the face of the phenomenon of lost evidence, it is the coherence theory that is normatively correct, and the conservative thesis is simply the expression of its normative import. Although Harman sometimes understands conservatism as the thesis that “a proposition can acquire justification simply by being believed” (1986, p. 30), it is obvious that it is not GC that he has in mind. For if having a bare belief were sufficient for its justification, the phenomenon of lost evidence or justification would be rendered impossible, since the belief itself would ground its justification.
Harman’s official account of conservatism, along the lines of PC, maintains that one is justified in continuing to hold a belief as long as there are no good reasons against it. Although PC can account for the rationality of belief perseverance, alternative explanations of such rationality can undermine its credibility. Before addressing this issue, it is worth considering an objection made to Harman’s argument from lost justification by David Christensen (1994), as it would further clarify what PC really involves. Christensen thinks that one can explain the phenomenon of belief perseverance without appealing to any conservative principle. Suppose, for example, that I currently hold the belief that the population of India exceeds that of the United States, though I have forgotten what the source of my belief was. To show that conservatism need not play any role in explaining the rationality of this belief, Christensen offers what he takes to be a similar case, where it is completely implausible to invoke any conservative principle. Suppose I flip a coin which lands out of my sight, and I decide to believe that it has landed tails up without checking to see whether it has. It is obviously implausible to take that belief as justified, but it seems this is what conservatism invites us to do.
Despite their structural similarity, the two cases, says Christensen, differ in the following respect: “In both cases, I have a belief for which I am unable to cite any grounds…Yet in one case, maintaining the belief seems quite reasonable; while in the other… unreasonable” (1994, 74). Christensen claims that one cannot explain this difference in terms of the applicability of conservatism in one case and its inapplicability in another. Rather, it is to be explained by the role that background beliefs play in the two cases. In the India example, I have some background beliefs—for example, that I have acquired the belief from a reliable source, that despite India and the United States being favorite topics in my family, I have never been contradicted, and so forth—that convince me that my belief is correct. Moreover, it is precisely such beliefs that maintain the rationality of my continuing to hold the belief about India’s population. However, no similar beliefs are present in the coin example, which is why I am not justified in holding the belief that I do.
Vahid (2004) has, however, criticized this argument on several grounds. It could be, for instance, that the coin example and the India example are only superficially similar. It is true that, in both cases, I have a belief for which I can no longer recall any evidence, but this is true for different reasons in the two cases. In the India case, I have forgotten the original source of my belief, whereas in the coin case there simply never was any evidence to forget. So the coin case is not really an instance of the phenomenon of forgotten evidence. More damagingly, even the India example does not seem to be a case of forgotten evidence. For what seems to be forgotten in that example is merely the name or identity of the source of my belief, and that is irrelevant to the question of the rationality of continuing to hold that belief. After all, Christensen himself maintains that among the things I know in this case is that "I was once told [about India's population] by… some… reliable source, and I (quite rationally) accepted the source's word for it" (1994, p.73). To put it differently, the evidence that I have forgotten in this example concerns only the identity of the source of my belief that India is more populous than the United States. As far as the belief itself is concerned, I have enough information to render it justified.
As noted above, PC’s credibility in accounting for the rationality of belief perseverance can be undermined if there are alternative explanations of the phenomenon of lost justification. Here is one alternative explanation (Vahid 2004). Suppose we take the property of justification to be an objective property that beliefs possess when they are adequately grounded. It is also customary in the epistemology literature to distinguish between “being justified” or “having justification” and the “activity of justification” (Alston 1989). The idea is that just as one can be good, say, without being able to show that one possesses that virtue, one’s belief can have (and retain) the property of justification if it was initially based on adequate grounds, and there are currently no defeaters, without one being able to show that it is justified. With these distinctions in force, one could say that what the phenomenon of lost justification threatens is not the justification of one’s belief, but one’s ability to show that one is justified in holding that belief. The plausibility of this explanation depends, however, on the fate of some of the contentious issues in the internalism-externalism debate in epistemology.
5. Generation Conservatism
This section presents what is generally regarded as the standard version of conservatism, namely, GC. Given its mainstream status, GC has been discussed more extensively than the other versions of conservatism. Unlike those versions, where what is at issue is the epistemic status of a belief when one has lost track of its grounds or has been apprised of evidentially equivalent competing hypotheses, generation conservatism (GC) is concerned with whether the very formation of a belief bears on its epistemic status. As Chisholm characterizes GC, the principle says that "anything we find ourselves believing may be said to have some presumption in its favor—provided it is not explicitly contradicted by the set of other things that we believe" (1980, pp. 551-552). Despite its mainstream status, GC also turns out to be the most controversial version of conservatism.
There is no doubt that GC, if true, is a powerful epistemic tool for resolving a number of standing problems in epistemology, such as the problem of skepticism, the puzzle over the epistemic status of memory beliefs, and others. It has also been put to use to address a number of other challenges, like the problems facing internalism. For example, Smithies (2019) has argued for what he calls “phenomenal mentalism” according to which epistemic justification is determined only by our phenomenally individuated mental states, which include not only our conscious experiences but also our consciously accessible beliefs. More specifically, he defends a synchronic version of phenomenal mentalism, which takes epistemic justification to be determined by the phenomenal states you have now. This particular version of internalism is, however, vulnerable to the problem of forgotten evidence, as when a subject no longer remembers the ground of her justified belief (Goldman 1999). Along with other authors (McGrath 2007), Smithies thinks that GC provides a neat solution to this problem.
GC has also been the subject of many criticisms, including the "boost" and "conversion" objections as well as arbitrariness worries (for the latter, see below). The conversion problem says that when one has adequate evidence for two contrary hypotheses H1 and H2, one should withhold judgment. According to GC, however, by believing either of the hypotheses, the subject can convert her epistemic situation from unjustified to justified, which is unacceptable. According to the boost problem, GC allows a subject to boost her justification for believing a proposition p by simply forming the belief. Suppose that S has evidence that supports p to some degree n. If GC is true, S can strengthen this support relation simply by believing that p. Following Feldman (2014), McCain (2020) has responded to this objection by rejecting the "additivity of evidence" principle, which says that if an agent S acquires new evidence that supports p while retaining any old evidence, then, in the absence of defeaters for p, S becomes better justified in believing p. The argument against this principle appeals to the possibility of redundant evidence, that is, evidence that makes no difference to levels of justification. Accordingly, there can be cases where S acquires new evidence for p without becoming better justified in believing p (Feldman 2014).
Another objection is that GC seems to conflict with causal accounts of the basing relation (Frise 2017). The obtaining of the basing relation is what marks the transition from propositional to doxastic justification, and it is widely believed that causation must play a role in any viable account of the basing relation. The problem is that since a belief cannot cause itself at a time, the beliefs that GC claims are justified fail to satisfy the causal requirements for the basing relation (see McCain 2020 for a response). Finally, there is also the argument that GC lacks intuitive plausibility. It is difficult to see how the bare fact of believing a proposition could confer justification on that belief. To get a sense of this unease, consider Christensen's coin example again. I flip a coin which lands out of my sight, and I form the belief that it has landed tails up without bothering to check whether this is the case. GC seems to rule that this belief is justified, which is implausible. A response is that belief makes an epistemic contribution only when the subject lacks evidence for or against it (Poston 2012). If this is correct, though, it follows that a subject who believes that p in the absence of evidence for or against p would be placed in a position by GC to assert the following Moore-paradoxical sentence: <p but I have no evidence for p>. The awkwardness of allowing mere belief to be a source of justification does not disappear so easily (see Poston 2014, ch. 2 for a response).
An important note, however, is that although constructing counterexamples to GC appears to be easy, it is equally easy to beg the question against the proponent of GC in doing so. Consider, for example, the possible scenario on the basis of which Foley (1982) rejects GC as being too strong. Consider a subject S who comes to believe a proposition H, which is not explicitly contradicted by anything else she believes, while, given her circumstances, it is more reasonable for S to believe not-H. By GC, S is justified in believing H. However, this scenario is under-described. Either S has no evidence for H or not-H, in which case GC can plausibly be applied to S's belief that H to ground her justification in believing that H, or, as Foley stipulates, S has better reasons for not-H than she has for H, in which case GC is no longer applicable, because it takes the bare fact of believing a proposition to endow it with justification only in cases where there are no reasons against it. In such circumstances, it is not clear how, on pain of begging the question against the proponent of GC, one is to take believing not-H as being more reasonable than believing H.
In response to the above problems, the proponents of conservatism usually try to introduce certain modifications in their theses to make them more appealing. Although their official view is still the claim that it is the mere belief that confers justification on its content, closer observation reveals that these modified accounts often rely on such external factors as “seemings,” coherence, and evidence about our “general reliability” in order to enable their conservative theses to discharge their epistemic role. Before turning to such versions, though, it is important to consider some of the arguments that have been suggested for and against GC.
a. Arguing for and Against GC
i. The Transcendental Argument
A rather common transcendental argument for GC starts with the methodological question of how we are supposed to conduct our inquiries. One policy is to follow the Cartesian advice of dispensing with all our beliefs except those that are certain. This would seem to ensure that our beliefs satisfy the epistemic goal of believing truth and avoiding falsehood. Abandoning all our beliefs and trying to rebuild them from scratch, though, is close to cognitive suicide. The only way forward, then, is to work with what we have got and rely on our perspective in order to achieve the epistemic goal. Our perspective, however, consists not only of our mundane beliefs but also of our beliefs about which methods or belief-forming processes are reliable for regulating the formation of those mundane beliefs. There is no Archimedean point from which we can determine which of our belief-forming processes are reliable. Accordingly, there is no way to regulate our belief-forming activities except by relying on our antecedent convictions that constitute our perspective on the world. For those convictions to discharge this epistemic role, however, they must possess some epistemic worth to start with. Otherwise, our beliefs would fail to be justified. This means that mere belief, as GC claims, can confer some measure of justification on its content.
This argument can be resisted. Even if one may now appeal, à la GC, to the justified status of one's background beliefs about the reliability of belief-forming processes in order to rationally and actively reaffirm a particular belief resulting from such processes, it is not clear that the belief in question was actually based on such background beliefs at its inception (Podgorski 2016). So the justification now associated with the reaffirmation of the belief may not be the justification that was lost in the process. Another worry is that the motivations behind this argument might result in absurd consequences if the argument is repeatedly applied. Suppose, having formed a belief that p by relying on my background beliefs, I take myself to have fulfilled the epistemic goal of truth and so regard the belief as justified. The type of conservatism that results from this argument would then seem to require that I hold on to the belief in the absence of a challenge from within my perspective. As Roger White argues, the same conservative motivations would also require me to avoid such challenges:
If I allow myself to critically rethink my commitment to p, there is a chance that I might conclude that I was mistaken. But from my perspective now, to change my mind as to whether p would be to be led into error. Of course I do not want that. So it is better for me to avoid all possible challenges and cling dogmatically to my current convictions. No one, I take it, wants to endorse this sort of attitude. (2007, pp. 125-26)
Finally, there is the problem of arbitrariness that seems to arise from applying GC to our day-to-day cognitive dealings (White 2007 and Podgorski 2016). The idea is that it is possible for two or more subjects to have radically different perspectives, including different habits of thought that enjoy the same degree of coherence. According to GC, all such subjects are justified in their beliefs. Any insistence on the privileged status of one particular belief system over others invites the charge that the norms of epistemic impartiality are being violated. Moreover, if all belief systems are equally rational, it is not clear why one should not feel free to adopt the perspective of others.
ii. The Argument from Charity
Another argument in defense of GC (Vahid 2004) takes its inspiration from Davidson's claim that belief ascription is constrained by the principle of charity. Very roughly, Davidson takes the evidence for a semantic theory to consist in the conditions under which speakers hold sentences true. The holding of a sentence true by a speaker is, however, a function both of what she means by that sentence and of what she believes. This means that belief cannot be inferred without prior knowledge of meaning, and meaning cannot be deduced without belief. This is where Davidson appeals to the principle of charity. The idea is that we can solve the problem of the interdependence of belief and meaning "by holding the belief constant as far as possible while solving for meaning. This is accomplished by assigning truth conditions to alien sentences that make native speakers right when plausibly possible, according, of course, to our own view of what is right" (Davidson 1984, p.137). Thus understood, we may view the application of the principle of charity as involving the maximization of truth by the interpreter's own lights.
Now, if belief ascription is to be constrained by charity and the latter is characterized by the aim of maximizing truth and minimizing falsity in the speaker’s belief system, then this would seem to endow the ascribed belief with some presumption of rationality, since justification (rationality) is also generally understood in terms of promoting the truth goal. A belief is justified in so far as it promotes the epistemic goal of believing what is true and not believing what is false. It should be noted, however, that since charity begins at home, the ascriber’s (interpreter’s) beliefs are as much subject to the constraint of charity as are the beliefs of the subject (interpretee) to whom beliefs are ascribed. One problem with this approach is that since charity requires the assignment of truth conditions to the interpretee’s sentences according to the interpreter’s view of what is right, the kind of rationality that emerges from this belief ascription process is one that is perspective-dependent and thus very weak. Although this seems to comport well with Chisholm’s own estimate of GC as constituting the “lowest” degree of epistemic justification (1980, 547), it will not be of much interest to those philosophers who think that conservatism is a substantial epistemic thesis.
The possible failure of the above arguments to fully substantiate the epistemic credentials of GC has not, however, deterred its proponents from coming up with modified versions of the thesis that no longer suffer from the problems that have afflicted the original version. Before considering these modified versions of GC, it pays to consider an argument against GC that denies that ordinary people are even capable of holding bare beliefs, and so concludes that, since the evaluation of GC requires the possibility of such beliefs, it is practically impossible to evaluate such a thesis.
Daniel Coren (2018) has claimed that the nature of bare belief makes it impossible to evaluate GC. He takes a bare belief to be a belief that is stripped of all personal memory and epistemic context. Although Coren regards such beliefs as logically conceivable, he thinks that for us, human agents, they are practically inconceivable. Even in the case of forgotten evidence, says Coren, it is not the case that one's beliefs stand entirely on their own without any support from what the agent can recall. There are always some epistemic contexts within which our beliefs can be located. Coren realizes that there are some seemingly plausible examples where beliefs seem to lack such contexts, as when, as a result of hypnosis or a bang on my head, I come to believe that, say, the number of stars in the universe is a prime number.
In response, he denies that such cases of belief-formation (involving hypnosis or brain injury) are practically possible. He says that he can imagine guessing or wanting the number of stars to be a prime number. However, being an ordinary agent, he cannot imagine being in such an extraordinary, non-human state of believing that the number of stars is prime while having no supporting reasons. To conclude, Coren’s main objection to conservatism is that since we are not “able to imagine having a belief in total isolation from other beliefs…[we cannot] evaluate the question of whether having a bare belief would have any positive epistemic status” (2018, p.10).
While Coren's skeptical observations about the possibility of bare beliefs are a useful antidote to the fast-and-loose way that conservatives often play with such beliefs, it may not be advisable to base a critique of conservatism entirely on empirical claims such as the practical impossibility of beliefs resulting from hypnosis or of the formation of beliefs that are not sensitive to epistemic reasons. After all, there are views that countenance forming beliefs on the basis of pragmatic reasons (Leary 2017 and Rinard 2018). Moreover, not all processes resulting in belief need be of a cognitive variety. Beliefs can come and go as a direct result of brain injury. Only beliefs that are the result of a cognitive process carry the distinction of being responsive to reasons.
More importantly, there are at least two ways of understanding Coren's analysis of a bare belief as a "belief in total isolation from other beliefs," depending on whether "isolation" is understood as conceptual or epistemic. It is surely true that beliefs cannot be held in conceptual isolation. For, being conceptually structured, beliefs are inferentially integrated with other beliefs such that their combination can yield further beliefs as consequences (Stich 1978). This is what lies behind Jerry Fodor's (1983) denial that cognitive systems, unlike perceptual systems, are modular and informationally encapsulated. A conservative, though, need not challenge these observations. What she claims is rather that beliefs can acquire their epistemic status in epistemic isolation from other beliefs. Although beliefs always appear in holistic networks with other beliefs, it is possible for some of them, says the conservative, to acquire their justification in epistemic isolation. On pain of begging the question against such conservatives, then, the impossibility of having beliefs in total epistemic isolation from other beliefs cannot serve as a premise in an argument against conservatism.
b. Modifying GC
It was pointed out that the problems arising from the attempts to substantiate the standard formulation of GC have prompted some conservatives to suggest alternative formulations of the conservative principle that are no longer vulnerable to those difficulties. These versions of GC either involve appending further conditions to GC or seek to radically revise some of its assumptions.
i. Conservatism as the Principle of Credulity
An early attempt to modify GC by adding further restrictions to it is due to William Lycan (1988). Lycan presents his version of GC as the "principle of credulity," according to which the "seeming" plausibility (truth) of beliefs is sufficient for their acceptance. He takes this principle to underlie the justification of what he calls "spontaneous beliefs," namely, beliefs that are directly produced by such sources as perception, memory, and so forth. Lycan's choice of the beliefs that are rendered justified by his principle of credulity, namely perceptual and memory beliefs, raises, however, the suspicion that what we are dealing with here is in fact an instance of the thesis of phenomenal conservatism (discussed earlier), according to which if it seems to you that p, then, in the absence of defeaters, you are justified in believing that p. For the phenomenal conservative, such seemings (perceptual, memorial, intuitive, and so forth) constitute the general source of justification for the beliefs to which they give rise, whereas for the upholders of GC it is the mere fact of believing a proposition that endows that proposition with some epistemic worth.
Leaving this point to one side, Lycan’s recognition of the over-permissiveness of his principle in justifying what he regards as “wild” spontaneous beliefs, such as religious and superstitious beliefs, prompts him to add a number of restrictions on it, such as consistency with previously justified explanatory beliefs as well as the availability of an explanation by the agent’s belief system of how his or her spontaneous beliefs are produced. He requires that:
[Our] total body of beliefs and theories yield an idea of how… spontaneous belief[s] [were] produced in us. Finally suppose that according to this idea or explanation, the mechanism that produced the belief was (as we may say) a reliable one, in good working order. Then, I submit, our spontaneous beliefs are fully justified. (1988, p.168)
Although such constraints might exclude Lycan's "wild" beliefs, they seem to be so stringent that it is unlikely that they can be satisfied even in the case of spontaneous perceptual and memory beliefs. With Lycan's principle of credulity, then, we have once more an example of a seemingly conservative principle whose epistemic engine is driven not by the mere holding of a belief itself but by factors external to that belief.
ii. Conservatism as Believing in the Absence of Defeaters
Lycan's principle of credulity is not the only way of evading the problems associated with GC. Kevin McCain (2008) suggests another way of getting around such problems, which involves supplementing GC with two defeating conditions along the following lines:
PEC
If S believes that p and p is not incoherent, then S is justified in retaining the belief that p and S remains justified in believing that p so long as p is not defeated for S.
Defeating Condition (C1)
If S has better reasons for believing that not-p than S’s reason for believing that p, then S is no longer justified in believing that p.
Defeating Condition (C2)
If S’s reasons for believing p and not-p are equally good and the belief that not-p coheres equally as well or better than the belief that p does with S’s other beliefs, then S is no longer justified in believing that p.
McCain's account of conservatism involves reference both to the fact of believing something and to the absence of defeaters. He is quite explicit, though, that while S's belief that p provides S with justification, the belief itself "is not counted among S's reasons" (2008, p.187). He claims that the role that the belief plays in its justification is akin to the role that the absence of defeaters plays in the justification of a belief. Perhaps what McCain has in mind here is that believing that p is merely an enabler for the reason that is supposed to justify the belief. It is not, however, clear what, once the belief itself is excluded from the realm of reasons, is supposed to play that role when one appeals to the conservative thesis. Closer examination of McCain's account reveals that this role is played by the notions of evidence and coherence in the account's defeating conditions. (C1) requires that evidence for not-p not be stronger than evidence for p, and (C2) requires an asymmetry in the coherence of p and not-p with the rest of the agent's beliefs. Thus, far from relying on mere belief to confer justification on its content, it is the strength of evidence for the belief as well as the coherence of the target belief p with the rest of the subject's belief repertoire that are doing the main epistemic work. This conclusion receives further support when we turn to McCain's claims, on behalf of his account, that it can resolve skeptical and other standing problems of epistemology.
To illustrate, consider some of the ambitious claims that McCain makes on behalf of PEC. He thinks, for example, that his version of conservatism is able to neutralize the challenge presented by Cartesian-style skeptical arguments that seek to undermine our beliefs in the external world by proposing alternative hypotheses, such as dreaming, being a brain in a vat, and so forth, that can also explain our evidence (perceptual experience). According to McCain, his conservative thesis explains why the skeptical argument fails, for none of the skeptical hypotheses provides a defeater that satisfies C1: evidence in favor of these hypotheses is no better than evidence in favor of the belief about the external world. Nor is C2 satisfied, because our belief in the external world coheres better with our overall set of beliefs, including commonsense beliefs such as "It is raining now," "I slept last night," and so forth.
The crucial premise in McCain's argument is that "[a]lthough these commonsense beliefs are closely related to the belief that there is an external world, they are not directly dependent upon the belief that there is an external world. We do not form the belief that there is an external world and then infer from them the belief that 'the sun is shining today,' etc." (2008, pp. 189-190). It is true that this is not how we form our commonsense beliefs. However, McCain makes an important assumption when clarifying his defeating condition C2. C2 prohibits not-p from cohering equally as well as, or better than, the belief that p does with S's other beliefs. This prompts the question of what sort of beliefs should be included in that repertoire when one is assessing whether the belief that p is justified. On pain of trivially guaranteeing that C2 would never be met, McCain thinks that, as a necessary condition, neither p nor "any belief q that is directly dependent upon the belief that p for its justification should…be included in S's set of beliefs in regard to [C2]" (2008, p.187). As this remark clearly indicates, the beliefs that are to be included in the agent's belief system are supposed not to depend on the target belief p in an epistemic, rather than an inferential, sense. They are supposed not to be "directly dependent upon the belief that p for [their] justification." In other words, McCain is assuming that commonsense beliefs can be justified independently of whether or not one is justified in believing that there is an external world, and this is where his conservatism is helping itself to an assumption from outside conservatism's sphere.
To explain: just as McCain's version of conservatism owed its epistemic bite to the involvement of such notions as evidence and coherence, its purported broader uses in resolving long-standing epistemic disputes equally derive their power not from the thesis itself but from a substantive epistemological theory, namely dogmatism (liberalism), in its background. According to dogmatism, absent defeaters, experience is sufficient, on its own, to confer justification on the belief in its content. This accounts for why McCain thinks that commonsense beliefs (in the subject's belief set) can be justified without being dependent for their justification on the belief in an external world. By contrast, a rival theory such as Crispin Wright's conservatism maintains that experiences can only justify one's commonsense beliefs provided one is already warranted in believing that there is an external world. It is clear, then, that what is really doing the epistemic work in McCain's response to the skeptical argument is not his conservatism as such but the dogmatist account of perceptual justification that is presupposed in that account. With such a view in place, there would be no need to appeal to doxastic conservatism.
iii. Conservatism as a Dynamic Strategy
Another approach, from Abelard Podgorski (2016), recommends a dynamic strategy in response to the problems discussed so far. Podgorski agrees with many of the objections that have been leveled against conservatism, such as bootstrapping and arbitrariness. Accordingly, he seeks to present an alternative conservative view that, while incorporating the basic motivations for GC, is not susceptible to its weaknesses. His proposal involves a dynamic interpretation of the rational relevance of two types of considerations: those that bear on the question of whether p and those that bear on the question of whether to make up one’s mind about p. He intends the dynamic slant to make it clear that the relevant norms governing our epistemic life are those that govern processes, rather than states, in particular the process of considering whether p, for some proposition p. Two such norms are distinguished: those that regulate when to initiate the process of considering whether p is true and those that govern the rational operation of that process.
Accordingly, what is distinctive of dynamic conservatism is that it appeals not to the norms that generate an agent’s mental states at particular times but to the norms that govern the process of considering whether p. To see how the dynamic approach intends to secure conservatism, Podgorski introduces the following norm for regulating the initiation of consideration.
Inconsiderate
One is not always rationally required to initiate consideration whether p when one believes that p and one’s evidence does not make p worth believing (from one’s perspective).
If Inconsiderate is true, one may permissibly fail to reconsider one's belief that p while one's evidence does not make p worth believing. Now, just as there are things that bear on whether something is worth believing, there are things that bear on whether a question is worth opening for consideration. For example, it is worth considering whether p if it is important that my belief that p is true, or if my evidential situation regarding p is significantly better now than it was before or will be in the future. On the other hand, it is less worth considering whether p if, say, it does not matter whether my belief that p is true, or if my evidential situation regarding p is worse than it was when I last considered p. Costs involving time and cognitive effort can also bear negatively on whether it is worth considering whether p. Since Podgorski takes not considering to have a default permissible status, he concludes that "[w]e are not required to consider a question until we have some special positive reason to do so. So agents will be rational in maintaining any given belief for at least as long as they do not encounter such a reason" (2016, pp. 366-67).
Podgorski claims that, like standard conservatism, dynamic conservatism is also sensitive to the fact that changing beliefs incurs cognitive costs. It also explains the phenomenon of lost justification or evidence, for as long as an agent lacks reasons to reconsider her belief, she may, even having lost her evidence, persist in her belief. However, by rejecting the claim that bare belief can confer justification on its content, dynamic conservatism differs from the standard version. On the dynamic view, the belief that is held by an agent need not be worth having, because “by rejecting state-oriented norms demanding the worthiness of our beliefs, we allow that there are periods of time where what a belief has going for it simply does not matter for an agent’s rationality” (2016, p. 372).
However, it may be that by rejecting the main tenet of GC, dynamic conservatism becomes too modest to have much epistemic impact when it comes to regulating our belief-forming processes. Consider, for example, a case of forgotten evidence where, having forgotten the original evidence, one has some reason to reconsider one's belief because its truth turns out to be important in that context. Here, evidentialists usually appeal to second-order evidence (about, say, the general reliability of memory, and so forth). As we have seen, however, this sort of evidence is legitimate only if the belief in question was originally based on that evidence. Therefore, while this evidence can be used to ground the rationality of one's active reaffirmation of that belief, it fails to explain one's rationality when one does not perform such reaffirmation. Dynamic conservatism, however, seems to fall short in the opposite direction.
It can explain the rationality of the agent’s holding on to his belief in the cases of forgotten evidence, where he lacks a special reason to reconsider that belief. It also explains such cases without locating the source of this rationality in the agent’s current or past evidence. It seems to fall short of accounting for the agent’s rationality in the sort of cases described above, where there is reason to reconsider one’s belief. As Podgorski admits, “[t]he cases the [dynamic] view does not endorse as rational are those where an agent actively reaffirms their belief without relevant second-order information. And these are the cases where it is least intuitive that doxastic inertia is rational. Nevertheless, if these are taken to be core cases, it must be admitted that this is a genuine disadvantage of the dynamic approach” (2016, p. 370).
6. Conclusion
The doxastic conservatism debate develops out of attempts to show that our tendency to maintain and preserve our beliefs beyond the evidence at our disposal is a rational phenomenon. Conservatism presents itself as a normative thesis with the potential to resolve a number of outstanding issues in epistemology. It turns out, however, that there is not one single conservative principle but a variety of such theses. Further discussions of doxastic conservatism may focus on these contenders and on how they bear on the epistemic evaluation of doxastic states when one encounters evidentially equivalent alternatives, the perseverance of doxastic states in the absence of specific reasons to change them, and the question of whether features of one's doxastic state can add to the justification of the beliefs that constitute it.
7. References and Further Reading
Adler, J. 1990. "Conservatism and Tacit Confirmation," Mind 99: 559-570.
Alston, W. 1989. Epistemic Justification, Cornell University Press.
Chisholm, R. 1980. "A Version of Foundationalism," in Wettstein et al. (eds.), Midwest Studies in Philosophy V, University of Minnesota Press.
Christensen, D. 1994. "Conservatism in Epistemology," Nous 28: 69-89.
Christensen, D. 2000. "Diachronic Coherence and Epistemic Impartiality," The Philosophical Review 109, 3: 349-371.
Davidson, D. 1984. "Radical Interpretation," reprinted in Inquiries into Truth and Interpretation, Oxford: Clarendon.
Feldman, R. 2014. "Evidence of Evidence Is Evidence," in Matheson, J. and R. Vitz (eds.), The Ethics of Belief, New York: Oxford University Press: 284-300.
Fodor, J. 1983. The Modularity of Mind, MIT Press.
Foley, R. 1982. "Epistemic Conservatism," Philosophical Studies 43: 165-182.
Foley, R. 1987. The Theory of Epistemic Rationality, Harvard University Press.
Frise, M. 2017. "Internalism and the Problem of Stored Beliefs," Erkenntnis 82: 285-304.
Fumerton, R. 2007. "Epistemic Conservatism: Theft or Honest Toil?" Oxford Studies in Epistemology 2: 63-86.
Goldman, A. 1979. "Varieties of Epistemic Appraisal," Nous 13: 23-38.
Goldman, A. 1986. Epistemology and Cognition, Harvard University Press.
Goldman, A. 1999. "Internalism Exposed," The Journal of Philosophy 96: 271-293.
Goldstick, D. 1971. "Methodological Conservatism," American Philosophical Quarterly 8: 186-191.
Harman, G. 1986. Change in View, MIT Press.
Huemer, M. 2001. Skepticism and the Veil of Perception, Rowman and Littlefield.
Kvanvig, J. 1989. "Conservatism and Its Virtues," Synthese 79: 143-163.
Lycan, W. 1988. Judgement and Justification, Cambridge University Press.
McCain, K. 2008. "The Virtues of Epistemic Conservatism," Synthese 164: 185-200.
McCain, K. 2020. "Epistemic Conservatism and the Basing Relation," in Carter, A. and P. Bondy (eds.), Well-Founded Belief, Routledge.
McGrath, M. 2007. "Memory and Epistemic Conservatism," Synthese 157: 1-24.
Moffett, M. 2007. "Reasonable Disagreement and Rational Group Inquiry," Episteme 4, 3: 352-367.
Podgorski, A. 2016. "Dynamic Conservatism," Ergo 3: 349-376.
Poston, T. 2012. "Is There an 'I' in Epistemology?" Dialectica 66, 4: 517-541.
Poston, T. 2014. Reason and Explanation: A Defense of Explanatory Coherentism, Palgrave Macmillan.
Pryor, J. 2000. "The Skeptic and the Dogmatist," Nous 34: 517-549.
Quine, W.V.O. 1951. "Two Dogmas of Empiricism," in From a Logical Point of View, 2nd ed., New York: Harper & Row.
Sklar, L. 1975. "Methodological Conservatism," Philosophical Review LXXIV: 186-191.
Smithies, D. 2019. The Epistemic Role of Consciousness, Oxford University Press.
Stich, S. 1978. "Beliefs and Subdoxastic States," Philosophy of Science 45: 499-518.
Vahid, H. 2004. "Varieties of Epistemic Conservatism," Synthese 141: 97-122.
Vogel, J. 1992. "Sklar on Methodological Conservatism," Philosophy and Phenomenological Research 52: 125-131.
White, R. 2007. "Epistemic Subjectivism," Episteme 4: 115-129.
Wright, C. 2004. "Warrant for Nothing (and Foundations for Free)?" Aristotelian Society Supplementary Volume 78, 1: 167-212.
Author Information
Hamid Vahid
Email: hamid36vahid@gmail.com
Institute for Research in Fundamental Sciences
Iran
The Indeterminacy of Translation and Radical Interpretation
The indeterminacy of translation is the thesis that translation, meaning, and reference are all indeterminate: there are always alternative translations of a given sentence or term, and nothing objective in the world can decide which translation is the right one. This is a skeptical conclusion because what it really implies is that there is no fact of the matter about the correct translation of a sentence or term. It would be an illusion to think that there is a unique meaning which each sentence possesses and a determinate object to which each term refers.
Arguments in favor of the indeterminacy thesis first appear in the influential works of W. V. O. Quine, especially in his discussion of radical translation. Radical translation focuses on a translator who has been assigned to translate the utterances of a speaker of a radically unknown language. She is required to accomplish this task solely by observing the behavior of the speaker and the happenings in the environment. Quine claims that a careful study of such a process reveals that there can be no determinate and uniquely correct translation, meaning, or reference for any linguistic expression. As a result, our traditional understanding of meaning and reference is to be thrown away. Quine's most famous student, Donald Davidson, develops this scenario under the title of "radical interpretation." Among other differences, radical interpretation is distinguished from Quine's radical translation by its focus on an interpreter who constructs a theory of meaning for the speaker's language. Such a theory is supposed to entail, systematically, the meaning of the speaker's sentences. Nonetheless, radical interpretation too cannot resist the emergence of indeterminacy. According to the thesis of the indeterminacy of interpretation, there will always be rival interpretations of the speaker's language, and no objective criterion can decide which interpretation is to be chosen as the right one.
These views of Quine and Davidson have been well received by analytic philosophers, particularly because of their anti-Cartesian approach to knowledge. On this approach, knowledge of what we mean by our sentences and what we believe about the external world, other minds, and even ourselves cannot be grounded in any infallible a priori knowledge; instead, we are bound to study this knowledge from a third-person point of view, that is, from the standpoint of others who are attempting to understand what we mean and believe. What the indeterminacy of translation/interpretation adds to this picture is that there can never be one unique, correct way of determining what these meanings and beliefs are.
The article begins with Quine’s arguments for the indeterminacy of translation, then introduces Davidson’s treatment of indeterminacy by focusing on his semantic project and the scenario of radical interpretation. Then the discussion turns to David Lewis’s version of radical interpretation, Daniel Dennett’s intentional stance, and the way Lewis and Dennett treat the indeterminacy of interpretation.
1. Quine’s Naturalized Epistemology and Physicalism
Quine has famously argued that the reference of any language's terms and the meaning of any language's sentences are indeterminate. When a speaker uses terms like "rabbit", "tree", and "rock", it can never be settled to what specific object she is referring. When she utters "that's a rabbit", "that's a tree", "tigers are fast", and the like, it will always remain indeterminate what she really means by them. These claims can be called the "skeptical conclusions" of Quine's arguments for the indeterminacy of translation.
The first preliminary point to note is that this sort of skepticism is not epistemological but constitutive. Quine’s claim will not be that it is difficult to know what someone means by her words, or that we may lack the sort of epistemic powers, skills, or tools required to ascertain such meanings. His claim is that there is no determinate meaning and reference to know at all: there is no fact as to what a sentence means and what a term refers to. This is what Quine means by the claim that meaning and reference are indeterminate.
Quine has two famous arguments for these conclusions: (1) the argument from below, which is also called the argument for the "inscrutability of reference", "indeterminacy of reference", and "ontological relativity" (Quine 1970), and (2) the "argument from above", which is also called the argument for the "indeterminacy of translation" (Quine 1970) or "holophrastic indeterminacy" (Quine 1990a). The two arguments are discussed below, after a review of the grounds on which Quine builds them, since they rely on a variety of important positions, among which Quine's version of naturalism is especially significant.
a. Quine’s Naturalism
According to Quinean naturalism, there is no such thing as first philosophy which, independently of natural science, can offer unquestionable knowledge of the world; rather, philosophy is to be viewed as continuous with science, especially physics (Quine 1981). On this view, we are bound to investigate the world, human beings included, from the standpoint of our best scientific theory. In our study of the world, we should take a "third-person" point of view rather than a foundationalist Cartesian one. The Cartesian position advocates a priori and infallible knowledge, on the basis of which our knowledge of the external world and other minds can be established. Something is knowable a priori if it can be known independently of any specific experience of the external world, and such knowledge is infallible if it is immune to doubt or uncertainty. For Descartes, such knowledge cannot be dependent on, or inferred from, science because science relies on what we can perceive via our senses, and we can never trust our senses: they can deceive us. The Cartesian view, therefore, looks for a source of knowledge that is free from such doubts. Quine, especially in his famous article "Two Dogmas of Empiricism" (Quine 1951), argues that any hope of finding such an a priori basis for knowledge is illusory because, among other reasons, the analytic/synthetic distinction cannot be preserved.
Analytic statements are traditionally held to be true in virtue of the meaning of their constituent parts. Anyone who knows English and thus knows what "bachelor" and "unmarried" mean would know that the sentence "bachelors are unmarried" is true. Synthetic statements (such as "it's raining") are those which are true not solely on the basis of the meaning of their terms, but also on the basis of what goes on in the world. Many philosophers believed that if a statement is analytic, it is also necessarily true, and what it expresses is knowable a priori. In "Two Dogmas of Empiricism", Quine argues that there is no non-circular way of defining the notion of analyticity. If so, what then forms the bedrock of our knowledge of the world? Quine's answer is natural science.
This is part of what provides Quine with enough reason to call his philosophy “naturalistic”. If epistemology is defined as the study of knowledge, then Quine insists that epistemology must be naturalized: it must follow the methods of science (Quine 1969b). However, what does a scientist do? A scientist investigates the connection between her theory and the (sensory) evidence or data she collects from the world. She makes observations, forms hypotheses about the future behavior of certain objects or the occurrence of future events, and checks whether they are supported by further evidence. Investigating the link between evidence and theory, and the support the latter can receive from the former, is the best we can do in our study of a subject matter. We can never stand outside of our theory and survey the world; we are bound to work from within (Quine 1981). Philosophers interested in the study of reality, knowledge, morality, mind, meaning, translation, and so forth, have no choice but to proceed in the same way, that is, to explore the link between the relevant flow of evidence and their best (scientific) theory about them. This explains why Quine is also called a “physicalist”.
b. Quine’s Physicalism
Quine's view of physicalism changed over the course of his philosophical career. The clearest characterization of it has been offered by Quine himself: "Nothing happens in the world … without some redistribution of microphysical states" (Quine 1981, 98). According to this view, in the absence of some relevant physical change, there can be no real change in any subject matter. Let us use the notion of "facts of the matter". Our scientific theory of the world works if the world can be viewed as consisting in specific things, that is, if there are certain facts of the matter about them. For instance, the theory works if there are molecules, electrons, trees, neutrinos, and so forth; it tells us that molecules have certain features, move in such and such a way, and are made of such and such elements. Quine's physicalism implies that facts about any subject matter are to be fixed by the totality of such facts about the world, and the totality of facts about the world is fixed by our choice of a total theory of the world. For example, if one claims that temperature is real, and thus that there are facts about temperature, such facts are to be determined once the relevant physical facts are fixed, which are, in this case, facts about molecules' average kinetic energy at a certain time. According to Quine's physicalism, we can legitimately talk about facts about temperature because once we know the molecules' average kinetic energy, we know all there is to know about temperature. In this sense, the physical facts have fixed the facts about temperature. Therefore, we can characterize Quine's physicalism as the view that either the totality of physical facts determines the facts about a subject matter, or there is simply no fact about that subject matter at all. This view plays a vital role in Quine's arguments for the indeterminacy of translation.
One of our most central questions in the philosophy of language concerns what determines the meaning of a linguistic expression. We can already guess that, for Quine, any answer to this question must be offered from a naturalistic point of view. We should see what science can tell us about our linguistic practices, especially that of meaning something by an expression. The indeterminacy of translation arises from such a Quinean way of treating the questions about meaning and reference.
2. Quine’s Arguments for the Indeterminacy of Translation
For the moment, assume that Quine can successfully establish the skeptical conclusion that there is no fact of the matter about the correct translation of any expression. When we talk about translating terms, we talk about pairing two terms which have the same reference. For instance, if you look at a standard German-English dictionary, you find that "snow" is the translation of the German word "Schnee". Both of these terms refer to a certain sort of thing: snow. Moreover, when we talk about translation in the case of sentences, we talk about the process of pairing two sentences, such as "snow is white" and "Der Schnee ist weiss", in terms of their having the same meaning. But if neither "snow is white" nor "Der Schnee ist weiss" can be said to have any determinate meaning, it follows that we cannot say that one is the correct translation of the other, simply because there is no such thing as one unique meaning that they share; and vice versa, if neither can be said to be the correct translation of the other, there is no unique meaning which can be claimed to be shared by them. This shows that meaning can be studied in terms of translation. If Quine can lead us to skepticism about the existence of correct translations, he has thereby led us to skepticism about the existence of determinate meanings.
Quine invites us to consider the way in which we learn our first language. Through such a process, we learn how to use our language's terms correctly, especially terms like "Mama", "Milk", and "Fire", which can be treated as one-word sentences. We gradually become competent in detecting different parts of sentences and understanding how these parts can be put together to form more complex expressions. We finally gain such mastery of the use of our language that others in our speech-community can treat us as reliable users of it. Quine thinks that, instead of talking about such a complex process of learning a first language, we can "less abstractly and more realistically" talk about translation (Quine 1960, 27). Imagine that a translator finds herself in the middle of the Amazon jungle and faces a member of a nearby tribe whose language is entirely unknown to her. In order to start communicating with this native speaker, she must start translating his utterances. For each expression in the native's language, she should find an expression in her own language which has the same meaning: she starts making a dictionary for that language. Since the language is radically unknown, our translator is called a "radical translator".
a. Quine’s Radical Translation Scenario
Imagine that where our radical translator and the native meet, a rabbit scurries by, and the native utters "Gavagai". The translator treats "Gavagai" as a one-word sentence. Considering the presence of the rabbit and the native's response, the translator writes down "Lo, a rabbit" as the hypothetical translation of "Gavagai". Her reason is this: in a similar situation, she would utter "Lo, a rabbit". This translation is only hypothetical at this stage because one observation alone would not be enough for the translator to decide whether "Lo, a rabbit" is the correct translation of "Gavagai". She continues checking this hypothetical translation against further evidence. For instance, suppose that she has been successful in detecting that the native's word "Evet" corresponds to "Yes" and "Yok" corresponds to "No". Suppose again that a different rabbit with a different color is observable, and the translator points to it and asks: "Gavagai?" Assume that the native responds with "Evet". In this situation, Quine says that the native assents to "Gavagai" in the presence of the rabbit. On a different occasion, an owl is present and the translator asks the same question, "Gavagai?" This time the native responds with "Yok". In this situation, the native dissents from "Gavagai".
The native's behavioral responses, that is, his assent to, or dissent from, a sentence on specific occasions, are pivotal for Quine's project because they form the "evidential basis" for translation. For two reasons, the translator cannot have access to anything more than this sort of evidence. First, the native's language is radically unknown to the translator: she has no prior information whatsoever about what the native's words mean or what the native believes. This by itself puts a considerable limitation on the sort of evidence available to her. Second, Quine is a physicalist. For Quine, physicalism, in the case of translation, manifests itself in a sort of behaviorism. The reason is that the relevant physical facts about translation are facts about the observable behavior of the speaker, that is, the native's assents and dissents. To be more precise, the translator can appeal only to the native's dispositions to verbal behavior. As Quine famously puts it, "there is nothing in linguistic meaning…beyond what is to be gleaned from overt behavior in observable circumstances" (Quine 1987, 5). Therefore, when Quine talks about "evidence", he talks about behavioral evidence, and when he talks about "facts", he talks about the native's observable behavior.
Suppose that the translator, after making several observations, has become confident that “Lo, a rabbit” is to be considered as the correct translation of “Gavagai”. Another important notion is introduced by Quine at this point. We can now say that “Gavagai” and “Lo, a rabbit” are stimulus synonymous, or have the same stimulus meaning (Quine 1990a). The claim that “Gavagai” and “Lo, a rabbit” have the same stimulus meaning is equivalent to the claim that what prompts the native to assent to (or dissent from) “Gavagai” also prompts the translator to assent to (or dissent from) “Lo, a rabbit”. What causes the native to assent to “Gavagai” and the translator to assent to “Lo, a rabbit” is the presence of a rabbit. Therefore, the stimulus meaning of “Gavagai” is the set of all the stimulations which prompt the native to assent to, or dissent from, “Gavagai”. Similarly, the stimulus meaning of “Lo, a rabbit” is the set of all the stimulations which cause the translator to assent to, or dissent from, “Lo, a rabbit”. Since the stimulations were the same in this case, that is, the presence of a rabbit, we can conclude that “Gavagai” and “Lo, a rabbit” have the same stimulus meaning. But why does Quine talk about stimulations, rather than objects? Instead of talking about rabbit stimulations, one may complain, he could simply say that rabbits prompt the native to assent to “Gavagai”.
Quine’s insistence on treating stimulations rather than objects as central has its roots in his adherence to naturalism. For him, what is scientifically worth considering about meaning and reference is the pattern of stimulations since, as Quine puts it, “it is a finding of natural science itself, however fallible, that our information about the world comes only through impacts on our sensory receptors” (Quine 1990a, 19). What science tells us, in this case, is that the native and the translator, upon observing the rabbit in view, would have a visual stimulation, or some “pattern of chromatic irradiation of the eye” (Quine 1960, 31). For Quine, we can assume that the native would be prompted to assent to “Gavagai” by the same irradiations which prompt the translator to assent to “Lo, a rabbit”. Even if we linguists wanted to talk about the rabbit itself, we would have no way of doing so except by relying on what our sensory receptors receive from touching it, seeing it, and the like.
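The notion of stimulus meaning lends itself to a simple set-theoretic picture. The following sketch is an illustrative toy only; the logs, labels, and helper function are invented for this article, and nothing like it appears in Quine. It models the stimulus meaning of a sentence as the set of stimulations prompting assent and tests for stimulus synonymy by comparing those sets:

```python
# Toy model of stimulus meaning: a sentence's stimulus meaning is modeled as
# the set of stimulations that prompt assent to it. Stimulations are plain
# labels here; in Quine's account they are patterns of sensory irradiation.

def stimulus_meaning(log, sentence):
    """The set of stimulations on which the speaker assented to `sentence`."""
    return {stim for (stim, s, response) in log
            if s == sentence and response == "assent"}

native_log = [
    ("rabbit-in-view", "Gavagai", "assent"),
    ("second-rabbit-in-view", "Gavagai", "assent"),
    ("owl-in-view", "Gavagai", "dissent"),
]
translator_log = [
    ("rabbit-in-view", "Lo, a rabbit", "assent"),
    ("second-rabbit-in-view", "Lo, a rabbit", "assent"),
    ("owl-in-view", "Lo, a rabbit", "dissent"),
]

# Stimulus synonymy: the same stimulations prompt assent to both sentences.
print(stimulus_meaning(native_log, "Gavagai")
      == stimulus_meaning(translator_log, "Lo, a rabbit"))  # True
```

On this toy model, stimulus synonymy is simply identity of the two sets, which matches the definition given above: the same stimulations prompt assent to, or dissent from, both sentences.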
Having reviewed the scenario of radical translation, consider Quine’s first argument for indeterminacy, that is, his argument for the inscrutability of reference.
b. The Argument from Below: The Inscrutability of Reference
With the notion of stimulus meaning at hand, we can introduce Quine’s more technical notion of “observation sentences”, which also has an important role to play in his arguments. Our radical translator starts her translation by focusing on the native’s sentences which are about the immediate happenings in the world. Quine calls sentences like “Lo, a rabbit”, “it’s raining”, “that’s a tree”, and the like, “observation sentences.” Observation sentences themselves belong to the category of “occasion sentences”, sentences that are true on some occasions and false on others. For instance, the sentence “it’s raining” as uttered by the speaker at time t is true if it is raining around her at t. The truth-value of occasion sentences, that is, their truth or falsity, depends on whether the speaker is prompted to assent to, or dissent from, them on specific occasions. Thus, the stimulus meaning of occasion sentences is highly sensitive to the occasion of speech and may change with regard to some additional information the speaker may receive. (By contrast, “standing sentences”, such as “rabbits are animals”, are much less sensitive to the occasion of speech.) Observation sentences are those occasion sentences which are more stable with regard to their stimulus meaning, in the sense that almost all members of a speech-community can be said to have more or less similar dispositions to assent to, or dissent from, them on specific occasions. Our translator is primarily concerned with translating the native’s observation sentences. Her aim is to match the native’s observation sentences, such as “Gavagai”, with the observation sentences of her own language, such as “Lo, a rabbit”, by way of discovering whether these sentences have the same stimulus meaning, that is, whether the native’s and the translator’s assents to, or dissents from, them are prompted by the same sort of stimulations. To keep the native’s sentence parallel to “Lo, a rabbit”, assume from now on that the native utters “Yo, gavagai”.
Quine’s principal question is this: Given that “Yo, gavagai” and “Lo, a rabbit” have the same stimulus meaning, would this fact justify claiming that the terms “gavagai” and “rabbit” are the correct translations of one another? Quine’s answer is negative. One term is the correct translation of another if both refer to the same thing, or if both have the same reference. But, as Quine argues, the fact that “Yo, gavagai” and “Lo, a rabbit” are stimulus synonymous cannot show that the native’s term “gavagai” and the translator’s term “rabbit” have the same referent. In order to see why, imagine that there is a second translator translating the observation sentences of another member of the native tribe. Suppose that when, for the first time, the native utters “Yo, gavagai” in the presence of a rabbit, our second translator, before writing down “Lo, a rabbit” as the translation of “Yo, gavagai”, hesitates for a moment. Having taken into account the cultural and other differences between him and the native, he decides to take “Lo, an undetached rabbit-part” as his hypothetical translation of “Yo, gavagai”, on the basis of the idea that, perhaps, the natives believe that there are only particulars in the world, not whole objects. The translator thinks that he would have assented to “Lo, an undetached rabbit-part” if he had had such a belief about the world. Our translator, however, does not need to be worried because if he is wrong, he will soon find some evidence to the contrary leading him to throw away such a hypothetical translation and replace it with “Lo, a rabbit”. He goes on, just like our first translator, and checks the native’s assents and dissents with regard to “Yo, gavagai” on different occasions.
The problem is that the same sort of evidence which led our first translator to translate “Yo, gavagai” into “Lo, a rabbit”, equally well supports the second translator’s translation, “Lo, an undetached rabbit-part”. The reason is simple: whenever a rabbit is present, an undetached rabbit-part (such as its ear) is also present. The problem becomes worse once we realize that there can be an infinite number of such alternative translations, such as “Lo, another manifestation of rabbithood”, “Lo, a rabbit time-slice”, and so forth. All such translations are mutually incompatible but are compatible with all evidence there is with regard to the native’s verbal behavior. Nothing in the native’s assents to “Yo, gavagai” in the presence of rabbits can discriminate between such rival translations. The two translators have come up with different dictionaries, that is, different sets of translations of the native’s terms, in each of which a different translation has been offered for the native’s term “gavagai”. In one, it has been suggested that “gavagai” refers to what “rabbit” refers to because, for the first translator, “Lo, a rabbit” and “Yo, gavagai” have the same stimulus meaning. In another, it has been suggested that “gavagai” stands for what “an undetached rabbit-part” refers to because “Lo, an undetached rabbit-part” and “Yo, gavagai” have the same stimulus meaning. Which of these translations is to be chosen as the correct one? To which object does “gavagai” refer after all?
Quine famously claims that there is no objective basis for deciding which translation is right and which is wrong. There are indefinitely many mutually different translations of a term, which are compatible with all possible facts about stimulus meaning. “Yo, gavagai”, “Lo, a rabbit”, “Lo, an undetached rabbit-part”, and so on, are all stimulus synonymous. And obviously such facts do not suffice to determine the reference of the term “gavagai”: all the stimulations which prompt the native to assent to “Yo, gavagai” prompt assent to “Lo, a rabbit”, “Lo, an undetached rabbit-part”, and so forth. This implies that for “gavagai” there can be indefinitely many referents, and there would be nothing objective on the basis of which we can determine which one is the real referent of “gavagai”. As a result, the reference of the native’s term “gavagai” becomes inscrutable. Also, since the same problem can potentially arise for any term in any language, reference is inscrutable in general.
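The evidential symmetry driving this conclusion can be displayed in a small sketch. The following toy is invented for this article (the occasion records and the list of manuals are hypothetical); it checks that every rival translation fits the behavioral evidence equally well, for the simple reason that a rabbit is present exactly when an undetached rabbit-part or a rabbit time-slice is present:

```python
# Rival translation manuals for the native's "Yo, gavagai". In any situation,
# a rabbit is in view exactly when an undetached rabbit-part (and a rabbit
# time-slice) is in view, so every manual predicts assent and dissent on
# exactly the same occasions.

manuals = [
    "Lo, a rabbit",
    "Lo, an undetached rabbit-part",
    "Lo, a rabbit time-slice",
]

# Observed occasions: (what is in view, whether the native assented).
evidence = [("rabbit", True), ("owl", False), ("rabbit", True)]

for translation in manuals:
    # Each translation predicts assent exactly when a rabbit is in view,
    # because rabbit, rabbit-part, and time-slice are co-present by necessity.
    fits = all((in_view == "rabbit") == assented
               for in_view, assented in evidence)
    print(f"{translation!r} fits all the evidence: {fits}")  # True throughout
```

No extension of the evidence list can break the tie, since the co-presence of rabbit, rabbit-part, and time-slice holds on every possible occasion; this is the formal core of the inscrutability claim.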
To see why this conclusion is skeptical, recall Quine’s physicalism: either the physical facts fix the semantic facts by picking out one unique translation of the native’s term as the correct one, or they fail to fix such semantic facts, in which case it should be concluded that there are no such facts at all. The physical facts, in the case of translation, were the facts about the native’s assents and dissents, and they failed to determine the reference of the term “gavagai”. There is, therefore, no fact as to what a term refers to. Again, this sort of skepticism is not epistemological: the claim is not that there is a hidden fact within the physical facts which, if we had the epistemic power to discover it, would solve our problem. Quine’s claim has rather an ontological consequence: since it remains forever indeterminate to what things in the world people refer by their terms, it is entirely indeterminate how they slice the world. This is the reason why the inscrutability of reference leads to “ontological relativity”: it would never be determinate whether, for the native, the world consists in whole enduring rabbits or only rabbit-parts.
This argument has been subject to various criticisms. Most of them target the “gavagai” example, but Quine does not think that such criticisms succeed. For instance, many may think that the solution to the above indeterminacy problem is simple: Why not simply ask the native? Assume that we have found out how to translate the native’s words for “is the same as”. The problem will be solved if the linguist points to the rabbit’s ear and simultaneously to the rabbit’s foot and asks the native, “Is this gavagai the same as that gavagai?” If the native responds positively with “Evet”, then “gavagai” refers to the whole rabbit because the rabbit’s ear and the rabbit’s foot are two different rabbit-parts. Quine’s response is that this move simply begs the question by presupposing that the translation of the native’s expression for “is the same as” (whatever it is in the native’s language) is determinate. But what if its translation is “is part of the same rabbit”? In this case, when we asked, “Is this gavagai the same as that gavagai?”, what we were asking was: “Is this gavagai part of the same rabbit as that gavagai?” The native’s previous positive response is now compatible with the assumption that by “gavagai” the native refers to an undetached rabbit-part because the ear and the foot are indeed parts of the same rabbit.
For Quine, the problem is deeper than this: the “gavagai” example is just a convenient way of putting it. Nonetheless, many philosophers of language have found this response unconvincing, and there is an interesting debate between the proponents and the opponents of Quine’s argument from below. To mention some of the most famous contributions, Gareth Evans (Evans 1975) and Jerry Fodor (Fodor 1993, 58-79) have attempted to modify and press the general sort of objection introduced above. Mark Richard (Richard 1997) and especially Christopher Hookway (Hookway 1988, 151-155) have argued that Quine is right in his claim that this strategy would inevitably fail because we can always offer alternative translations of the native’s terms which remain compatible with any such modifications. Although these alternative translations may seem too complex, odd, or unnatural, what would prevent us from taking the native to believe in them?
c. The Argument from Above: The Indeterminacy of Translation
Having been disappointed with such debates about his “gavagai” example, Quine claimed that, for those who have not been satisfied with the argument from below, he has a very different, broader, and deeper argument: the “argument from above”. It is this second argument that Quine prefers to call his argument for “the indeterminacy of translation” (Quine 1970). One reason is that his previous argument for the inscrutability of reference at most results in the conclusion that there are always alternative translations of the native’s sentences because facts about stimulus meaning cannot fix the reference of sub-sentential parts of the sentences. The truth-value of the sentences, however, remains the same since if “Lo, a rabbit” is true because of the dispositions to assent to, or dissent from, it in the presence of a rabbit, then “Lo, an undetached rabbit-part” would also be true on the same basis. Quine argues that there can be rival translations of the native’s whole sentences such that the same sentence can be true in one and false in another.
The argument from above rests on the thesis of the “underdetermination of theory by evidence” and its relation to the indeterminacy thesis. Quine’s argument can have a very simple characterization: insofar as a theory is underdetermined by evidence, the translation of the theory is also indeterminate. In an even simpler way, Quine’s claim is that underdetermination together with physicalism results in the indeterminacy of translation. Contrary to its simple characterization, however, the argument is more complex than the argument from below because it is not based on any interesting example by which the argument can be established step by step; it is rather based on much theoretical discussion. To begin with, what does Quine mean by “underdetermination of theory by evidence”?
i. Confirmational Holism and Underdetermination
Quine’s thesis of underdetermination of theory by evidence claims that different theories of the world can be empirically equivalent (Quine 1990b). This thesis stems from Quine’s famous “confirmational holism” (or, as it is sometimes called, “epistemological holism”). Confirmational holism appears more vividly in “Two Dogmas of Empiricism”, where Quine famously states that “our statements about the external world face the tribunal of sense experience not individually, but only as a corporate body” (Quine 1951, 38). Let’s see what this claim implies.
A scientific theory consists of a variety of sentences, from observation sentences to theoretical ones. Observation sentences are particularly important because their stimulus meaning is directly linked to immediate observables. There are, however, theoretical sentences whose stimulus meaning is less directly linked to observables, such as “neutrinos have mass” or “space-time is curved”. Another part of such a theory consists in what are sometimes called “auxiliary hypotheses or assumptions” (Quine and Ullian 1978, 79). These are statements about, for instance, the conditions of the experiments, the experimenters, the lab, when the observations have been made, and so forth. We can take total science, or our total theory of the world, as “a man-made fabric which impinges on experience only along the edges. … [T]otal science is like a field of force whose boundary conditions are experience” (Quine 1951, 39). Such a theory is like a web with observation sentences at its outer layers and logic and mathematics at its core.
Quine’s confirmational holism implies that a single statement in isolation cannot be confirmed by any observation, evidence, or data because there would always be other factors involved in making such a decision. Suppose that some newly found evidence contradicts your theory. According to confirmational holism, the emergence of such a conflict between the theory and the evidence does not necessarily force you to abandon your theory and start constructing a new one. Rather you always have a choice: you can hold onto any part of your theory, provided that you can make some complementary changes, or proper compensations, elsewhere in your theory so that the theory can preserve its consistency. In this way, the conflicting evidence can be handled by manipulating some of the auxiliary hypotheses. Compensations can potentially be made in many different ways and thus different parts of the theory can be saved. Each alteration, however, can result in a different theory. The important point to note is that although these theories are different, they are empirically equivalent because they are all compatible with the same body of evidence. In this case, your theory is underdetermined by that set of evidence. More generally, for any set of data, no matter how big it is, there can always be different theories which are compatible with that set.
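A deliberately simple numerical analogy may help fix ideas; it is not Quine’s, and the data and rival “theories” below are invented purely for illustration. Two theories can agree on every data point collected so far and still disagree elsewhere:

```python
# A numerical analogy for underdetermination: two rival "theories" agree on
# every data point collected so far yet diverge on unobserved cases.

observed_inputs = [0, 1, 2, 3]  # suppose the observed output is 2*x

def theory_a(x):
    return 2 * x

def theory_b(x):
    # The extra term vanishes at exactly the observed inputs 0, 1, 2, 3.
    return 2 * x + x * (x - 1) * (x - 2) * (x - 3)

# Empirically equivalent on all the evidence gathered so far...
assert all(theory_a(x) == theory_b(x) for x in observed_inputs)
# ...but genuinely different theories nonetheless.
print(theory_a(4), theory_b(4))  # 8 vs 32
```

For any finite body of evidence, one can always add a component that vanishes on exactly the observed cases; this is the formal kernel of the claim that, for any set of data, different theories remain compatible with it.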
There are different characterizations of underdetermination. Strong underdetermination, which Quine initially works with in his argument from above, states that our total theory is underdetermined even by the totality of all possible evidence. Quine also believed that there can be empirically equivalent theories which are logically incompatible (Quine 1970). Two theories are logically incompatible if the same sentence is true in one and false in another. But, in his later works, he almost gives up on this claim and takes all such theories to be empirically equivalent and logically compatible, though they are now counted as rival ones if they cannot be reduced to one another term by term and sentence by sentence (Quine 1990b). Moreover, your theory can be viewed as underdetermined by all data so far collected; or it may be taken to be underdetermined by all data collectable from the past to the future, though some factors may remain unnoticed by you. In all such cases, underdetermination survives. For suppose that your theory A is compatible with all data collected from the past to the present. Other theories can be made out of A by changing different parts of it (and making proper compensations). The result of (at least some of) such changes would be different theories. The theory A is thus underdetermined by such a set of data (Quine 1990c).
It is also important to note that the underdetermination thesis is an epistemological thesis, not a skeptical one with ontological consequences. Suppose that we have come up with a total theory of the world, within which the totality of truths about the world is now fixed. This theory too is naturally underdetermined by all possible data so that there will be rival theories which are compatible with all such possible data. This fact, however, does not result in the skeptical conclusion that there is thereby no fact of the matter about the world. It only implies that there are always different ways of describing it. The reason has its roots in Quine’s naturalism again, according to which there is no such thing as a theory-free stance from which you can compare such theories. You are always working from within a theory. Although there are always rival ones, once you choose one underdetermined theory, all facts of the matter about the world are considered fixed within it. From within your favored theory, you expect no additional underdetermination to emerge with regard to what your theory says there is in the world. Now, what is the relation between underdetermination and indeterminacy?
ii. Underdetermination and Indeterminacy
Quine’s claim was that insofar as a theory is underdetermined by evidence, its translation is indeterminate. The question is how we reach skepticism about translation from underdetermination of theory. This is an important question because underdetermination raised an epistemological problem only: even if all possible evidence is at hand, there always are rival theories which are compatible with such a set of evidence. For Quine, linguistics is part of natural science. Thus, it seems that, in the case of translation too, we should face nothing more serious than a similar epistemological problem, that is, the underdetermination of translation by evidence: even if we have the set of all behavioral evidence, there will always remain rival translations of the native’s sentences. This, by itself, does not yield the skeptical conclusion that there is no fact of the matter about correct translation. Thus, one may complain that Quine would not be justified in claiming that, in the case of translation, we have to deal with the skeptical problem of indeterminacy. This is the objection which Chomsky (Chomsky 1968) makes to Quine’s indeterminacy thesis.
For Chomsky, we all agree that, for any set of data, there can always be different theories implying it. But the underdetermination of our scientific theory does not lead to any skepticism about the world: we do not claim that there is no fact of the matter about, for instance, tables and trees. Why should there be any difference when the case of our study becomes that of translation? Quine famously replies that the distinction between underdetermination and indeterminacy is what “Chomsky did not dismiss … He missed it” (Quine 1968, 276). For Quine, indeterminacy and underdetermination are parallel, but only up to a certain point. It is true that, in the case of translation too, we have the problem of underdetermination since the translation of the native’s sentences is underdetermined by all possible observations of the native’s verbal behavior so that there will always remain rival translations which are compatible with such a set of evidence. To this extent, Quine agrees with Chomsky. Nonetheless, he believes that indeterminacy is parallel but additional to underdetermination. When do these two theses differ in the case of translation?
Quine’s answer has its roots in his naturalistic claim that our best scientific theory is all there is to work with: it is the ultimate parameter. Even our total theory of the world would be underdetermined by the totality of all evidence. Quine’s point is that once you favor an underdetermined theory, the totality of truths about the world is thereby fixed. Take such a theory to be A. According to Quine, even within A, translation still varies and thereby remains underdetermined. Translation is thus doubly underdetermined: an additional underdetermination recurs in the case of translation. But, as previously indicated, this recurrence of underdetermination cannot be accepted by Quine since within our theory, we expect no further underdetermination to emerge. Recall Quine’s physicalism: if no fact about correct translation can be found in the set of all the physical facts about the world, we should conclude that there is simply no such fact. Having chosen the theory A, all facts of the matter about the world are fixed, and if despite that, translation still varies, we should conclude that the totality of facts about the world has failed to fix the facts about correct translation. As Quine famously says, translation “withstands even … the whole truth about nature” (Quine 1968, 275). Therefore, there is no fact of the matter about correct translation, which establishes the skeptical conclusion that Quine was after. This is the reason why the argument from above was characterized as claiming that underdetermination plus physicalism results in indeterminacy. “Where indeterminacy of translation applies, there is no real question of right choice; there is no fact of the matter even to within the acknowledged under-determination of a theory of nature” (Quine 1968, 275).
The last question to answer is how it is that, within our total theory A, the totality of facts fails to fix the facts about correct translation. In order to see how Quine reaches this skeptical conclusion, imagine that our translator is given the task of translating the native’s total theory. The translator starts by translating the observation sentences of the native’s theory. Suppose that the translator’s theory is A and her aim is to match the observation sentences of the native’s theory with the observation sentences of A. What is the translator’s justification for deciding whether the observation sentences of her theory are matched with the observation sentences of the native’s theory? It is, as before, the fact that the observation sentences have the same stimulus meaning. Assume that the translator has matched up all such observation sentences. This is just to say that facts about translation are thereby fixed: the observation sentences are paired in terms of having the same stimulus meaning. Thus, it seems that our translator can now justifiably take A to be the unique, correct translation of the native’s theory: from the fact that all the observation sentences are matched up, she concludes that the native believes in the same theory as she does. But can she really make such a claim? Quine’s answer is negative.
The reason for Quine’s negative answer can be put as follows. Suppose that there is a second translator who, like the first translator, holds A for himself and aims to translate the native’s theory. As with our first translator, he matches the observation sentences of A with the observation sentences of the native’s theory in terms of the sentences’ having the same stimulus meaning. Having done that, however, he decides to attribute theory B to the native. The difference between A and B is this: they are different but empirically equivalent theories. Both theories share the same observation sentences but differ with regard to, for instance, some of their auxiliary assumptions, theoretical sentences, and the like. Neither the first nor the second translator really believes in B; they both find B to be too odd, complex, or unnatural to believe. Nonetheless, while the first translator attributes A to the native, the second translator, for whatever reason, attributes B to him. Quine’s crucial claim is that although the translators’ theory is A, that is, although they are both working from within one theory, they are still free to attribute either A or B to the native as the translation of his theory. There is no objective criterion on the basis of which they can decide which of A or B is the theory which the native, as a matter of fact, believes since both A and B are alike with regard to the totality of facts about stimulus meaning. Therefore, as Quine’s physicalism implied, we should conclude that there is no fact of the matter as to which of A or B is to be chosen as the correct translation of the native’s total theory. Despite the fact that the totality of facts is fixed within A, the translators still have freedom of choice between rival translations of the native’s theory. This underdetermination with regard to rival translations is additional to our old underdetermination of theory by evidence. The translation of the native’s theory is thereby indeterminate. This argument is called the “argument from above” since it does not start by investigating how the reference of sub-sentential parts of sentences is fixed; it rather deals with the whole theory and the translation of its whole sentences.
As with the argument from below, the argument from above too has been subject to a variety of objections. Chomsky’s objection (Chomsky 1968) has been reviewed, but it is worth briefly reviewing the general form of two further objections. Robert Kirk (Kirk 1986) objects that Quine’s argument from above is not successful because it has to rely on the conclusion of the argument from below. In other words, it faces a dilemma: either it presupposes the argument from below, in which case it would be a question-begging argument because the argument from above was supposed to be an independent argument, or it does not presuppose the argument from below, in which case it fails to establish Quine’s desired skeptical conclusion. The reason for the latter is that Quine’s claim that the translator’s only justification for matching the observation sentences is that they have the same stimulus meaning does not, in any combination with underdetermination, result in the indeterminacy of translation, unless we read this claim as implying that these matchings form the totality of facts about translation and that they fail to pin down one unique translation of the native’s theory. By doing so, however, we have already reached the indeterminacy of translation without even using the underdetermination thesis at all.
A different sort of objection has been made in (Blackburn 1984), (Searle 1987), and (Glock 2003), according to which the skeptical conclusions of Quine’s argument (that there is no fact of the matter about meaning and translation, and that indeterminacy arises at home too) lead to an entirely unacceptable conclusion: a denial of first-person authority. It can be intuitively conceded that speakers have first-person authority over the meaning of their own utterances and the content of their own mental states, such as their beliefs. They know what they mean and believe, and they know this differently from the way others, like the translators, know such meanings and beliefs. But, if radical translation starts at home, then indeterminacy arises at home, too. This means that, for the speaker too, it would be indeterminate what her own words mean. This implication is highly counterintuitive.
Let’s end our discussion of Quine by removing a potential confusion about the skeptical conclusions of Quine’s arguments. Although Quine claims that, as a matter of fact, translation is indeterminate, he does not claim that, in practice, translation is impossible. After all, we do translate other languages and understand what others mean by their words. This means that we should distinguish between two claims here. First, Quine has argued that the traditional conception of meaning and translation is to be abandoned: insofar as our concern is theoretical and philosophical, there is no such thing as the one correct translation of a sentence. But, from a practical and pragmatic point of view, translation is perfectly possible. The reason is that although there is no objective criterion on the basis of which we can pick out one correct translation of a sentence, we have good pragmatic reasons to choose between the rival ones. The translator translates the native’s utterances with “empathy”. She treats the native as a rational person who, like her, believes that there are rabbits, trees, and so forth, rather than only rabbit-parts or tree time-slices. This maxim can be called Quine’s version of “the principle of charity” (Quine 1973). Our translator would choose “rabbit” as the translation of “gavagai” simply because this translation makes the communication between her and the native smoother. But such pragmatic norms cannot tell us which translation is, as a matter of fact, correct. Although the maxim is known as “the principle of charity”, it was not Quine who coined the term (though he started using a version of it in Word and Object (Quine 1960), and its role gradually became more important in his later works). It was Neil L. Wilson (Wilson 1959, 532) who gave the name “the principle of charity” to a similar maxim, as Quine himself mentions. More or less in Wilson’s spirit, Quine used it to emphasize that if the translator notices that her translation of the native’s sentences is resulting in a beyond-normal range of strange or “silly” translations, it is more likely that something is wrong with her translation than with the native himself. The translator needs to choose those methods that lead to the attribution of the largest number of true sentences to the native. We are to maximize the agreement between us and the native with regard to which statements are held true. As we will see, however, Davidson’s use of this principle is more extensive and substantial than Quine’s.
3. Davidson’s Semantic Project
Although Donald Davidson was inspired by Quine’s project of radical translation, he preferred to focus on what he calls “radical interpretation” (Davidson 1973a), (Davidson 1974b). Radical interpretation manifests itself in Davidson’s endeavor to uncover, from a theoretical point of view, how speakers’ ability to speak, and to understand the speech of others, can best be modeled or described. While Quine was interested in studying how the process of translation can proceed and what can be extracted from it regarding meaning determination and linguistic understanding, Davidson’s interest is wider. He is concerned with how a theory of meaning can be constructed for a language, a theory which can systematically entail the truth-conditions of all sentences of that language. His view of meaning is thus truth-conditional, according to which we understand a sentence, or what it means, by knowing under what condition the sentence would be true (Davidson 1967). For instance, the sentence “it’s raining” is true if and only if it is raining and false if it is not. We say that the truth-condition of the sentence is that it is raining. Someone who understands this sentence knows under what condition it would be true. If we succeed in constructing such a theory of meaning, which correctly specifies the truth-conditions of all sentences of a language, we have interpreted it, and we can, theoretically speaking, treat the speakers of that language as if they know such a theory, as if they speak in accordance with it and understand each other on that basis.
There are important differences between translation and interpretation. One difference is that, in the process of translating, our aim is to pair sentences of our language with sentences of the native’s on the basis of having the same meaning. In interpretation, our aim is to give the truth-conditions of the native’s sentences by using sentences of our own language. Obviously, the concept of truth has an important role to play in such a view. It is supposed to help us to clarify the concept of meaning. Davidson takes Alfred Tarski’s Truth-Theory, or Tarski’s definition of truth, to be the best tool for building his truth-based theory of meaning (Davidson 1967), (Davidson 1973b).
a. Davidson’s Use of Tarski’s Truth-Theory
For Davidson, any adequate theory of truth, if it is supposed to work as a theory of meaning entailing the right sort of truth-conditions for all sentences of a language, must meet certain constraints. One of the most important ones is to satisfy Tarski’s Convention-T, according to which our theory must entail all and only true instances of what is called Tarski’s “T-schema”:
(T) “s” is true in L if and only if p.
Our theory must entail true sentences in the form of (T) for all sentences of the native’s language L. Here, the native’s language is called the “object-language”: the language for the sentences of which our theory entails truth-conditions. Our own language is called the “meta-language,” the language whose sentences are used to specify such truth-conditions. In (T), the sentence in the quotation marks, “s”, mentions a sentence in the native’s language, and “p” is a sentence from our language that is used to give the truth-condition of the mentioned sentence.
Suppose that the object-language is German and the mentioned sentence is “Der Schnee ist weiss”. Which sentence in our language should be chosen to replace “p” in order to give the truth-condition of the German sentence? An important point to note here is that Tarski’s intent was to define truth (or, the truth-predicate, “is true”) for the object-language. In order to do so, he used the notion of translation, or sameness in meaning, and claimed that what goes in place of “p” must be either “s” itself (if the object-language is part of the meta-language) or the translation of “s” (if the object-language and the meta-language are different). Thus, the sentence which is put in place of “p” should be “snow is white”. Having done that, we come up with the following instance of the T-schema, or the following “T-sentence”:
(T1) “Der Schnee ist weiss” is true in German if and only if snow is white.
Tarski believed that each of such T-sentences yields a particular definition of truth since it defines truth for a particular sentence. A conjunction of all such instances will provide us with a definition of the concept of truth for the object-language. As a historical point, we should note that Tarski’s own goal was to define truth for a formal language, “L” (that is, one wholly translated into first-order predicate logic), in a meta-language which contained L together with a few additional terms. He was very doubtful whether such a project could be consistently applied to the case of natural languages at all, mostly because natural languages can lead to a variety of paradoxes, such as the liar paradox. While admitting that Tarski was suspicious of extending such a project to the case of natural languages, Davidson nevertheless attempts to carry it out. He suggests that truth can potentially be defined for one natural language (as an object language) in another (as the meta-language). This is the reason why the examples used in this section are from natural languages, such as English, rather than from purely formal ones.
Tarski’s theory works recursively: it entails T-sentences like (T1) systematically from a finite set of axioms, as well as a finite set of rules for how different sub-sentential parts, or simple expressions, can be put together to form more complex expressions. The axioms’ job is to assign certain semantic properties to different parts of sentences: they assign reference to terms and satisfaction conditions to predicates. For instance, for (T1), we would have the following two axioms:
(A1) “Der Schnee” refers to snow.
(A2) “…ist weiss” is satisfied by white things.
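The recursive machinery can be pictured in miniature. The following sketch is a drastic simplification invented for this article: it handles only sentences of the hypothetical form “<term> ist <adjective>”, but it shows how a finite stock of reference and satisfaction axioms, plus one composition rule, entails a T-sentence for every sentence of that form:

```python
# A miniature, purely illustrative truth-theory for the one-pattern fragment
# "<term> ist <adjective>". Finitely many axioms assign reference to terms
# and satisfaction conditions to predicates; one composition rule then
# entails a T-sentence for every sentence of that form.

reference_axioms = {"Der Schnee": "snow"}           # (A1)-style axioms
satisfaction_axioms = {"ist weiss": "is white"}     # (A2)-style axioms

def t_sentence(sentence):
    """Entail a T-sentence for a sentence of the form '<term> ist <adj>'."""
    term, adjective = sentence.split(" ist ")
    subject = reference_axioms[term]
    condition = satisfaction_axioms["ist " + adjective]
    return f'"{sentence}" is true if and only if {subject} {condition}.'

print(t_sentence("Der Schnee ist weiss"))
# "Der Schnee ist weiss" is true if and only if snow is white.
```

The point of the recursion is economy: adding a single new axiom (say, one pairing a new term with its referent) immediately extends the theory to every further sentence in which that term occurs.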
Nonetheless, Davidson, who wants to specify meaning in terms of truth, cannot simply follow Tarski in claiming that the two sentences appearing in the T-sentences must have the same meaning; doing so presupposes the concept of meaning. Therefore, Davidson makes a weaker claim: our theory must produce all and only true T-sentences. That is to say, “p” should be replaced by a sentence that is true if and only if the object-language’s sentence is true. But it is easy to see why this constraint would not be strong enough to succeed in having the truth-theory entail the right sort of truth-conditions for the sentences of the object-language. Suppose that the object-language is English and so is the meta-language. In this case, the following T-sentences are both true:
(T2) “Snow is white” is true if and only if snow is white.
(T3) “Snow is white” is true if and only if grass is green.
The above T-sentences are true simply because both “snow is white” and “grass is green” are true. (The same would be true if we had “Der Schnee ist weiss”, rather than “snow is white”, on the left-hand side of (T2) and (T3) because our assumption is that “Der Schnee ist weiss” is a true sentence in German.) However, our theory must entail correct truth-conditions. Assume that the theory entailing (T2) is Φ and the theory entailing (T3) is Ψ. Both Φ and Ψ meet Davidson’s requirement of entailing only true T-sentences and would have to be counted as correct theories. But we know that Ψ is wrong: it certainly does not give the correct truth-condition of “snow is white”. The requirement itself, however, cannot take this knowledge for granted. Thus, we still need more constraints on our theory.
Insofar as the discussion of radical interpretation is concerned, the most important constraint which Davidson imposes on his theory is that the totality of the theory’s T-sentences must optimally fit the evidence with regard to the speaker’s responses (Davidson 1973a). This is an empirical constraint: the theory should be treated as an empirical theory which is employed by an interpreter to specify the meaning of the speaker’s utterances. It should be constructed, checked, and verified as an interpretive theory which produces interpretive T-sentences and axioms. By an “interpretive theory” Davidson means the theory that entails correct truth-conditions for the speaker’s sentences, considering the evidence the interpreter has access to in the process of radical interpretation.
b. Davidson’s Radical Interpretation Scenario
Imagine that someone is sent again to the jungle to meet our native speaker but, this time, he is given the task of interpreting the native’s utterances Davidson-style, that is, finding appropriate sentences in his own language that can be used to correctly specify the truth-conditions of the native’s sentences. In order to do so, he is required to construct a theory which entails the truth-conditions of the native’s sentences. Call him the “interpreter”. The Davidsonian interpreter, like the Quinean translator, starts his interpretation from scratch: he has no prior knowledge of the native’s language or her mental states.
Like radical translation, radical interpretation primarily focuses on the native’s observation sentences. A difference between Quine and Davidson emerges at this point. Although Quine took stimulations, or “proximal stimulation”, to be basic, Davidson takes the ordinary objects and events in the world, or “distal stimuli”, to be basic (Davidson 1999a). Another important difference between these two projects concerns the sort of evidence the interpreter is allowed to work with. Quine limited it to purely behavioral evidence, that is, the speaker’s assent or dissent. Davidson agrees that what the interpreter can ultimately rely on is nothing but observing the native’s verbal behavior, but since he rejects behaviorism, he claims that we can allow the interpreter to have access to what he calls the “holding-true attitudes” of the speaker (Davidson 1980). These are attitudes which the speaker possesses towards her own sentences; when the speaker utters, or assents to, a sentence on a specific occasion, she holds the sentence to be true on that occasion. For Davidson, the interpreter knows this much already, though he emphasizes that from this assumption it does not follow that the interpreter thereby has access to any detailed information about what the speaker means and believes.
Suppose that our native speaker, S, utters the sentence “Es regnet” at time t. The evidence the interpreter can work with would have the following form:
(E) S holds true “Es regnet” at time t if and only if it is raining at time t near S.
For Davidson, belief and meaning are interdependent. When a speaker utters a sentence, she expresses her thoughts, especially her beliefs. This interdependence between meaning and belief manifests itself in his emphasis on the role of the speaker’s holding-true attitudes in his project since, according to Davidson, a speaker holds a sentence to be true partly because of what those words mean in her language and partly because of the beliefs she has about the world. This means that if we know that the speaker holds a sentence to be true on an occasion and we know what she means by it, we would know what she believes, and if we know what she believes, we can infer what she means by her utterance.
Our radical interpreter, therefore, has a difficult job to do. He should determine the meaning of the native’s utterances and, at the same time, attribute suitable beliefs to her. This leads to what is called the “problem of interpretation”, according to which the interpreter, on the basis of the same sort of evidence, like (E), has to determine both meaning and belief. Obviously, one of these two variables must be fixed; otherwise, interpretation cannot take off. Davidson attempts to solve this problem by appealing to his version of the principle of charity (Davidson 1991). According to this principle, as employed by Davidson, the interpreter must do his best to make the native’s behavior as intelligible as possible: he ought to aim at maximizing the intelligibility (and not necessarily the truth) of the native’s responses in the process of interpreting them. The interpreter takes the native to be a rational agent whose behavior is intelligible and whose patterns of beliefs, desires, and other propositional attitudes are more or less similar to ours. Obeying such a rational norm does not necessarily result in attributing true beliefs to the subject all the time; sometimes attributing a false belief to the subject may make his behavior more intelligible and comprehensible. This reveals another difference between Davidson and Quine with regard to their use of such a maxim or principle. More is said about this in the next section.
For Davidson, when it is raining around the native and she utters “Es regnet”, the interpreter takes her to be expressing the belief that it is raining. Charity helps to fix one of the above two variables, that is, the belief part. On the basis of the evidence (E), and with the help of the principle of charity, the interpreter can come up with the following hypothetical T-sentence:
(T4) “Es regnet” is true in S’s language if and only if it is raining.
As with the case of radical translation, here too (T4) is to be treated as hypothetical for now because one observation would not be enough for the interpreter to confirm that (T4) gives the correct truth-condition of the native’s sentence. The process of interpretation is a holistic process: terms like “regnet” and “Schnee” appear in many different sentences of the native’s language. The interpreter must go on and check if (T4) remains true when the native utters “Es regnet” on different occasions. As the interpretation proceeds, the interpreter gradually comes to identify different sub-sentential parts of the native’s sentences and thereby constructs specific axioms which assign reference to the terms and satisfaction conditions to the predicates of the native’s language (such as (A1) and (A2) above). In this case, the interpreter would be able to verify whether “Schnee” in the native’s language refers to snow or grass. The interpreter would then be able to throw away the true-but-not-interpretive T-sentences like the following:
(T5) “Der Schnee ist weiss” is true if and only if grass is green.
The reason is that the interpreter has checked, in several cases, that the native uses “Schnee” where there is snow, not grass, and that “… ist weiss” is used by the native where there are white things, not green things. The correct truth-condition of the native’s sentence seems to be snow is white, not grass is green.
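This winnowing of true-but-not-interpretive T-sentences can be pictured as a simple cross-checking loop. In the sketch below, the occasion records and candidate axioms are hypothetical, invented for this article; a candidate reference axiom survives only if it fits every occasion on which the native uses the term:

```python
# A sketch of the holistic winnowing of axioms. Each occasion records the
# native's use of "Schnee" and what was present; a candidate reference axiom
# survives only if its referent is present on every such occasion.

occasions = [
    ("Schnee", {"snow"}),
    ("Schnee", {"snow", "trees"}),
    ("Schnee", {"snow"}),          # no occasion pairs "Schnee" with grass
]

candidates = {"snow": '"Schnee" refers to snow',
              "grass": '"Schnee" refers to grass'}

for referent, axiom in candidates.items():
    fits = all(referent in present for term, present in occasions
               if term == "Schnee")
    print(axiom, "->", "kept" if fits else "rejected")
# The grass axiom is rejected, and with it non-interpretive T-sentences
# such as (T5).
```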
At the end of this long process, the interpreter eventually comes up with a theory of interpretation, which correctly interprets the native’s verbal behavior. It systematically entails correct truth-conditions for all sentences of the native’s language. But, does this mean that indeterminacy has no chance to emerge in this process?
4. Davidson on the Indeterminacy of Interpretation
Davidson believes that some degree of indeterminacy survives in the process of radical interpretation. First of all, the inscrutability of reference cannot be avoided. Our words and things in the world can be connected in different ways, and we may never be able to tell which way of connecting words with things is right (Davidson 1997). If one way works, an infinite number of other ways will work as well, though some of them may seem strange or too complex to us. Davidson gives an example.
Predicates are said to have satisfaction conditions: they are satisfied, or are true of, certain things only. For instance, the predicate “… is red” is satisfied by red things, not blue things. Nonetheless, it seems that, for any predicate, we can find other predicates which have the same sort of satisfaction condition. In this case, the truth-value of the sentences in which such predicates appear would remain unchanged: if they are true (or false), they remain true (or false). But, since the totality of evidence is the same for all such cases, no evidence can help to decide which satisfaction condition is right. For example, suppose we have the following axioms:
(A3) “Rome” refers to Rome.
(A4) “… is a city in Italy” is satisfied by cities in Italy.
From these axioms, we can reach the following T-sentence:
(T6) “Rome is a city in Italy” is true if and only if Rome is a city in Italy.
This is a true T-sentence. Now consider an alternative set of axioms:
(A5) “Rome” refers to an area 100 miles to the south of Rome.
(A6) “… is a city in Italy” is satisfied by areas 100 miles south of cities in Italy.
From these, we can reach (T7):
(T7) “Rome is a city in Italy” is true if and only if the area 100 miles south of Rome is an area 100 miles south of a city in Italy.
The point is that if (T6) is true, (T7) is also true, and vice versa. Not only this, but there can be indefinitely many such mapping relations. The reference of “Rome” is thereby inscrutable: there is no way to determine which reference for “Rome”, and which satisfaction condition for “… is a city in Italy”, is to be chosen as the correct one. As before, such terms appear in a potentially indefinite number of sentences and thus, the inscrutability of reference affects the whole language. One interpreter may come up with a theory which takes “Rome” to be referring to Rome, while another may come up with a theory which takes it to be referring to an area 100 miles to the south of Rome. Both theories work equally well, provided that each interpreter sticks to her own theory. Obviously, we cannot freely switch between different methods of interpretation. Rather, once it is fixed within one theory that “Rome” refers to Rome, the term has this reference in any sentence in which it occurs.
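The point can be checked mechanically: remap every referent by a systematic function and compensate in the satisfaction conditions, and whole sentences keep their truth-values. The following toy sketch is invented for this article; its tiny “world” and shift function merely mimic Davidson’s geographical example:

```python
# Inscrutability via systematic remapping. Scheme 1 assigns each name its
# ordinary referent; scheme 2 shifts every referent and compensates in the
# predicate's satisfaction condition. Whole-sentence truth-values agree
# under both schemes.

cities_in_italy = {"Rome", "Milan"}
reference_1 = {"Rome": "Rome", "Paris": "Paris"}

def satisfied_1(obj):                 # "... is a city in Italy"
    return obj in cities_in_italy

shift = lambda obj: "area 100 miles south of " + obj   # the remapping
reference_2 = {name: shift(obj) for name, obj in reference_1.items()}

def satisfied_2(obj):                 # compensated satisfaction condition
    return obj in {shift(c) for c in cities_in_italy}

for name in reference_1:
    truth_1 = satisfied_1(reference_1[name])
    truth_2 = satisfied_2(reference_2[name])
    assert truth_1 == truth_2         # the schemes never disagree on truth
print("Truth-values coincide under both reference schemes.")
```

Since any systematic remapping with matching compensations will pass the same check, no behavioral evidence about whole-sentence truth can single out one scheme as the right one.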
Davidson takes this sort of indeterminacy to be harmless, leading to no skepticism about meaning, since it is similar to the familiar, innocuous fact that we can have different ways of measuring the temperature, height, or weight of an object (Davidson 1997). When we want to tell what the temperature of an object is, we face different scales for measuring it. What we should do is decide whether we want to use the Fahrenheit scale or the centigrade one. For Davidson, the inscrutability of reference should be understood in a similar way: there are always different methods of interpreting a speaker’s verbal behavior; what we must do is choose one of the rival methods and hold onto it. These different theories of interpretation are all compatible with the native’s behavioral evidence. But just as there is no contradiction in the existence of different scales for measuring temperature, there is no contradiction in the existence of different methods of interpreting the speaker’s behavior.
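The measurement analogy is easy to make concrete. In the trivial sketch below (an illustration for this article, not Davidson’s own), Celsius and Fahrenheit reports differ numerically while tracking the same physical state, just as rival interpretation schemes differ while fitting the same behavioral facts:

```python
# The same physical state reported under two scales. Nothing about the
# object varies between reports; only the scheme of description does.

def to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

for c in [0.0, 25.0, 100.0]:
    print(f"{c:6.1f} °C  =  {to_fahrenheit(c):6.1f} °F")
# Choosing a scale fixes every report; what is disallowed is switching
# scales mid-report, just as an interpreter may not switch interpretation
# schemes mid-interpretation.
```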
A second sort of indeterminacy also emerges in the process of radical interpretation which, contrary to the inscrutability of reference, can affect the sentences’ truth-values. Suppose that you are interpreting someone who often applies “blue” to the objects which most people apply “green” to, such as emeralds. Davidson’s claim is that you, as her interpreter, face two options. (1) You can attribute to her true beliefs about the world by taking her to be in agreement with you regarding the things there are in the world and the properties they possess: you can treat her as believing that there are emeralds and they are green. But, since she applies “blue” to these things, you have to take her to mean something different by the term “blue”. You do so with the purpose of making her behavior intelligible. Thus, you interpret her utterance of “blue” to mean green, not blue. On this option, the speaker’s utterance of “that emerald is blue” is true because you have treated her as believing that emeralds are green and as meaning green by “blue”. Thus, what she means by her utterance is that that emerald is green. (2) The second option is to interpret her to mean the same thing as you mean by “blue”, that is, to be in agreement with you on what the correct use of “blue” is. For both of you, “blue” applies to blue things and thus means blue. Again, since she applies “blue” to the things to which you apply “green”, you take her to have some different (false) beliefs about the world in order to make her behavior intelligible. Thus, while you interpret her utterance of “blue” to mean blue, you attribute to her the false belief that emeralds are blue. On this option, however, the speaker’s utterance of “that emerald is blue” is false because what she claims is that that emerald is blue. According to Davidson, you, as the interpreter of her speech, can choose any of the above two options and, as before, it is enough that you continue interpreting her in that way. There is nothing in the world that can help you to decide which method is right (Davidson 1997). Rather, all there is for the interpreter to appeal to is the rational norms dictated by the principle of charity, which, as we can see now more clearly, may even result in attributing some false beliefs to the subject in order to make her behavior more intelligible. Therefore, Davidson believes that the inscrutability of reference and the indeterminacy with regard to the attributions of meaning and belief to the speaker arise in the process of radical interpretation too.
There is, however, a final point which is worth noting with regard to Davidson’s treatment of indeterminacy. For him, indeterminacy does not entail that there are no facts of the matter about meaning (Davidson 1999b). Rather, he treats indeterminacy as resulting in an epistemological problem only, with no ontological consequences. His reason is that if the overall pattern of the speaker’s behavior is stable, there can be alternative ways of describing it, that is, there can always be different theories of interpretation. Again, just as there were different ways of measuring temperature, there can be different ways of interpreting the speaker’s behavior. Just as we did not question the reality of temperature, we should not question the reality of meaning. Davidson thus departs from Quine on this matter: while Quine thought that the indeterminacy of translation has destructive ontological consequences, Davidson thinks that the indeterminacy shows only that there can be different ways of capturing the facts about meaning.
In what follows, the article considers two important analytic philosophers who have been inspired by Davidson’s and Quine’s projects, David Lewis and Daniel Dennett.
5. Lewis on Radical Interpretation
David Lewis, in his famous paper “Radical Interpretation” (Lewis 1974), agrees with Davidson on the general claim that the aim of the process of radical interpretation is to determine what a speaker, say, Karl, means by his utterances and what he believes and desires. Lewis also agrees with Davidson and Quine that radical interpretation starts from scratch: at the outset, the interpreter has no prior information about Karl’s beliefs, desires, and meanings. Not only this, but our knowledge of Karl is also taken by Lewis to be limited to the sort of knowledge we can have of him as a physical system. Thus, on this latter point, he leans toward Quine, rather than Davidson, and he invites us to imagine that we interpreters have access to the totality of physical facts about Karl. Lewis’s question is, what do such facts tell us about Karl’s meanings, beliefs, and desires?
Lewis characterizes the “problem of radical interpretation” as follows. Suppose P is the set of all such facts about Karl viewed as a physical system, for instance, facts about his movements, his causal interactions with the world, his behavioral responses to others, the impact of physical laws on him, and so forth. Suppose also that we have two sets of specifications of Karl’s propositional attitudes, Ao and Ak. Ao is the set of specifications of Karl’s beliefs and desires as expressed in our language (for example, when we specify what Karl believes by using the English sentence “snow is white”), and Ak is the set of specifications of Karl’s beliefs and desires as expressed in Karl’s language (for example, given that Karl’s language is German, Karl’s belief (that snow is white) is expressed by the German sentence “Der Schnee ist weiss”). Finally, suppose that M is the set of our interpretations of Karl, that is, the specifications of the meaning of Karl’s utterances (for example, statements like “Karl means snow is white by his utterance of ‘Der Schnee ist weiss’”). Lewis’s question is: How are P, Ao, Ak, and M related?
Some points about these sets are to be noted. (1) As with Davidson, Lewis wishes to determine the truth-conditions of Karl’s sentences. So, the interpreter looks for correct interpretations such as “‘Der Schnee ist weiss’ as uttered by Karl is true if and only if snow is white”. (2) Following Davidson, Lewis also demands that these truth-conditions be entailed in a systematic way from a finite set of axioms. (3) Contrary to Davidson, however, Lewis puts a heavy emphasis on beliefs and desires, and claims that our most important goal in radical interpretation is to determine them first. This shift in emphasis leads us to two further points about the relation between the views of Lewis and Davidson. (a) Lewis agrees with Davidson that beliefs and desires play a significant role both in our treatment of the speaker as a rational agent and in our explanation of his behavior as an intentional action. For Davidson, a speaker is rational if she possesses a rich set of interrelated propositional attitudes, such as beliefs, desires, hopes, fears, and the like (Davidson 1982). An agent’s action can be explained as intentional if it can be properly described as caused by a belief-desire pair (Davidson 1963). For instance, to use Davidson’s example, suppose that someone adds sage to the stew with the purpose of improving its taste. This action is intentional if we can explain it as caused by the subject’s desire to improve the taste of the stew and the belief that adding sage would do that. (b) Nonetheless, Davidson did not take beliefs and desires (or, in general, any propositional attitudes) to be superior to meaning. He thought that meanings and beliefs are so interdependent that interpreters have to determine both at the same time. Lewis treats beliefs and desires as basic and claims that meanings can be determined only when the speaker’s beliefs and desires are determined first. This view is related to his analysis of success in our linguistic practices in terms of conventions and the crucial role speakers’ beliefs play in turning a sort of regularity-in-use into a convention (Lewis 1975). (4) Putting aside his subtle view of the notion of convention, the last point to note is that Lewis agrees with Quine rather than Davidson regarding the idea that the problem interpreters seek to answer in the process of meaning-determination is more than just an epistemological problem. The concern is not how P, the set of all physical facts about Karl, determines facts about Karl’s meanings, beliefs, and desires. Rather, one wants to know what facts P is capable of determining at all, that is, whether the totality of physical facts can fix the facts about what Karl means, believes, and desires.
Let us now see how the views of Lewis and Davidson differ, particularly with regard to the constraints placed on the process of radical interpretation and the degree of indeterminacy that may survive once the process is so constrained.
a. Lewis’s Constraints on Radical Interpretation
Lewis believes that interpreters need to place more constraints on the process of interpretation than those introduced by Davidson. These extra constraints concern how meanings and propositional attitudes are related to one another, to the behavior of the speaker, and to sensory stimulations. It is meeting these constraints that makes radical interpretation possible. Lewis (1974) promotes six constraints on radical interpretation, only some of which are shared by Davidson:
(1) The Principle of Charity. The way Lewis characterizes this principle is slightly different from Davidson’s. According to this principle, in order to make Karl’s behavior most intelligible, the interpreter should interpret Karl’s behavior (as specified in P) by treating him as believing what he ought to believe and desiring what he ought to desire. Again, this does not mean that in order to make Karl’s behavior most intelligible, only true beliefs are to be attributed to him; rather, sometimes, treating him as holding some false beliefs may do much better in describing his behavior as rational, intelligible, and comprehensible. What Karl ought to believe and desire, from the interpreter’s point of view, is generally what she believes and desires (given by Ao). But, again, considering the particular circumstances under which Karl’s behavior is interpreted, as well as the available evidence, the values Karl accepts, and so forth, the interpreter should make room for attributing some errors or false beliefs to him.
(2) The Rationalization Principle. Karl should be interpreted as a rational agent. The beliefs and desires the interpreter attributes to Karl (in Ao) should be capable of providing good reasons for why Karl responds in the way he does. Nonetheless, this does not mean that there are thereby some sort of intentional (non-physical) facts about Karl. The facts about Karl are still limited to the physical ones specified in P. This principle rather implies that the interpreter attributes those desires and beliefs to Karl that not only make Karl’s behavior intelligible to us, but also provide him with reasons for acting in that way. For this reason, the rationalization principle and the principle of charity are different.
(3) The Principle of Truthfulness. Karl is to be considered a truthful speaker, that is, a speaker who is willing to assert only what he takes to be very probably true. This principle constrains the sort of desires and beliefs the interpreter is allowed to attribute to Karl (in Ao) in order to interpret his utterances (and specify their meaning in M).
(4) The Principle of Generativity. The truth-conditions which the interpreter assigns to Karl’s utterances (in M) must be finitely specifiable, uniform, and simple. The interpreter must do her best to avoid assigning overly complex, odd, or unnatural meanings to Karl’s sentences, as well as meanings that cannot be inferred systematically from a finite set of axioms.
(5) The Manifestation Principle. Karl’s attributed beliefs and desires should be capable of manifesting themselves in his behavior. Karl’s beliefs and other attitudes should be recognizable particularly in his use of his language. This means that when there is no evidence to the effect that Karl is self-deceived or lacks a proper conception of what meaning and belief are, we should be able to extract beliefs and other propositional attitudes from Karl’s behavioral dispositions to respond to the world.
(6) The Triangle Principle. Karl’s meanings, beliefs, and desires should not change when they are specified in the interpreter’s language, whatever the interpreter’s language is. This principle may appear a bit puzzling. Suppose that Karl utters “Der Schnee ist weiss” and our interpreter, who speaks English, interprets it as follows:
(M1) “Der Schnee ist weiss” as uttered by Karl is true in German if and only if snow is white.
The truth-condition of Karl’s utterance is thus that snow is white. Suppose that another interpreter, Francesco, who speaks Italian, interprets Karl’s utterance as follows:
(M2) “Der Schnee ist weiss”, proferita da Karl, è vera in Tedesco se e solo se la neve è bianca.
The truth-condition of Karl’s utterance is now given by the Italian sentence la neve è bianca. Lewis’s point is that what Karl means by his utterance and what belief he expresses by it must remain the same, no matter in what language they are specified. We can see this point by considering the way our English interpreter would interpret Francesco’s sentence “la neve è bianca”, used in (M2) to give the truth-condition of Karl’s utterance:
(M3) “La neve è bianca” as uttered by Francesco is true in Italian if and only if snow is white.
The truth-condition of Francesco’s sentence is that snow is white. Considering (M1) – (M3), one can see that what Karl expresses by his German sentence, that is, that snow is white, remains the same no matter in what language it is specified.
On the basis of these six principles, Lewis evaluates Davidson’s project of radical interpretation. Davidson’s aim was to solve the problem of determining Karl’s beliefs and meanings at the same time. For Lewis, Davidson attempts to solve this problem by appealing to the Triangle Principle, the Principle of Charity, and the Principle of Generativity only. That is to say, what Davidson is concerned with is that the truth-conditions of Karl’s sentences are correctly specified in the interpreter’s language (the Triangle Principle), that such assignments of meaning are done with the purpose of maximizing the intelligibility of Karl’s behavior via attributing proper beliefs to him (the Principle of Charity), and finally, that the truth-conditions are inferred in a systematic way from a finite set of axioms (the Principle of Generativity). We temporarily fix beliefs, extract meanings, ask the interpreter to re-check her interpretation against further behavioral evidence, revise the beliefs if necessary, and re-check the interpretation. Lewis is not satisfied with Davidson’s method because, for him, Davidson has missed the other three principles. Davidson especially fails to take into account the Principle of Truthfulness and the Rationalization Principle, which constrain Karl’s beliefs and desires and which consequently lead the interpreter to view Karl’s behavior as an intentional action in advance. Davidson has put too much emphasis on the linguistic part, rather than on the mental part.
Lewis’s method is almost the opposite. It starts by considering Karl’s behavior as what forms the evidential basis for interpretation, but it considers such behavior in the light of the Rationalization Principle and the Principle of Charity. Karl’s behavior is taken to be already rationalized. Karl’s behavior can be treated in this way if it allows for attributions to him of those beliefs and desires that we interpreters ourselves often hold. Evidence to the contrary forces us to reconsider whether his behavior is rational. Karl’s linguistic behavior, that is, his utterances, is simply part of the history of his behavior. On this basis, we are then allowed to assign truth-conditions to Karl’s utterances by employing the Principle of Truthfulness. That is, we view Karl as a rational agent who asserts a sentence only when he believes it is true. The Principle of Generativity constrains our attributions of truth-conditions to Karl’s sentences by demanding systematicity, coherence, and consistency in such attributions. Finally, if the other principles are met, the Triangle Principle assures us that Karl’s meanings, beliefs, and desires remain the same when they are specified in the interpreter’s language.
The question, however, is whether these extra constraints can avoid the emergence of the inscrutability of reference and the indeterminacy of interpretation.
b. Lewis on the Indeterminacy of Interpretation
Lewis believes that indeterminacy, at least in its strong and threatening form, can be avoided, though some degree of mild or moderate indeterminacy would inevitably survive. His position developed from that of his earlier works on the topic, especially “Radical Interpretation” (Lewis 1974). There Lewis admits that it is reasonable to think that there would probably remain some rival systems of interpretation which are compatible with the set of all behavioral evidence and which can be considered correct. He uses Quine’s gavagai example to clarify the sort of indeterminacy which he thinks may appear in the process of radical interpretation. As he puts it, “[w]e should regard with suspicion any method that purports to settle objectively whether, in some tribe, ‘gavagai’ is true of temporally continuant rabbits or time-slices thereof. You can give their language a good grammar of either kind—and that’s that” (Lewis 1975, 21).
Nonetheless, even in this earlier period, Lewis emphasized that no “radical indeterminacy” can come out in radical interpretation. If a theory of interpretation meets all of the six conditions introduced above and does so perfectly, then we should expect no radical indeterminacy to appear, that is, no rival theories of interpretation which all perfectly meet the six constraints but attribute radically different beliefs and desires to Karl. For Lewis, even if it could somehow be shown that such indeterminacy may emerge even when the six constraints are met, the conclusion to draw would not be that the interpreter should thereby accept the existence of such indeterminacy. Rather, what would be proved is that not all the needed constraints have yet been found. Lewis, however, thinks that no convincing example has yet been offered to persuade us that we should take such radical indeterminacy seriously. He also denies that the existence of radical indeterminacy can be shown by any proof (Lewis 1974).
Lewis subsequently returned to the problem of indeterminacy in response to Putnam’s argument for radical indeterminacy.
i. Putnam’s Model-Theoretic Argument and Lewis’s Reference Magnetism
Putnam (1977) reformulates Quine’s thesis of the inscrutability of reference in a way that can no longer be treated as a mild indeterminacy. His argument, the “model-theoretic argument”, is technical and is not unpacked here. But consider its general conclusion. This argument attempts to undermine metaphysical realism, according to which there are theory-independent, language-independent, or mind-independent objects in the world to which our terms are linked in a certain way, and such a linkage makes our sentences about the world true. Such mind-independent objects can have properties which may go beyond the epistemic and cognitive abilities of human beings. For such realists, there can be only one true complete description of the world because the world is as it is, and things in it are the way they are and have the properties they have independently of how we can describe it. Now suppose that we have an epistemically ideal theory of the world, that is, a theory that meets all the theoretical and similar constraints we can impose on our theory, such as consistency, full compatibility with all evidence, completeness, and so forth. According to metaphysical realism, even such an ideal theory can come out false because, after all, it is the theory that is ideal for us, that is, ideal as far as an idealization of our epistemic skills allows. It is possible that this theory still fails to be the one which correctly describes the world as it really is. Putnam’s argument, however, aims to show that we interpreters can have different interpretations of the link between our words and things in the world which make any such ideal theory come out true.
As indicated in the above discussion of Quine, our theories of the world are nothing but a collection of interconnected sentences containing a variety of expressions. For Putnam, however, even if we can fix the truth-values (and even the truth-conditions) of all sentences of our language, it can still be shown that the reference of its terms would remain unfixed: there can always be alternative reference systems that are incompatible with one another but preserve the truth-values of the sentences of our language or theory. For instance, if we change the reference of “rabbit” from rabbits to undetached rabbit-parts, the truth-values of the sentences in which “rabbit” occurs would not change. This much was familiar from Quine’s argument from below. What Putnam adds is that realists have to concede that if there is only one correct theory, or description, of the world, this theory should thereby be capable of making the reference of our terms fixed: it should uniquely determine to what objects each term refers. Putnam’s question is how realists can explain such a reference-determining process. According to Putnam, the reference of no term can be uniquely determined; there can be many theories that come out true depending on how we intend to interpret the systematic connection between words and things. Anything you may like can potentially be taken to be the reference of any term, without causing any change in the truth-values of whole sentences. Not only this, but introducing any further constraints on your theory would inevitably fail to solve the problem because any new constraint introduces (at most) some new terms into your theory, and the reference of such terms would be susceptible to the same problem of indeterminacy. Putnam’s point is that we cannot think of the world as bestowing fixed reference on our terms: “We interpret our languages, or nothing does” (Putnam 1980, 482).
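The permutation idea at the heart of Putnam’s argument can be made vivid with a small formal toy model. The following Python sketch is only an illustration: the domain, names, and predicate extensions are invented here and are not Putnam’s own formalism. It shows that systematically permuting the referents of all terms leaves the truth-value of every atomic sentence unchanged:

```python
# A toy model-theoretic illustration of the permutation idea behind
# Putnam's argument. All names and extensions below are hypothetical.

domain = {"r1", "r2", "c1"}  # two rabbits and a carrot

# The "standard" interpretation: names denote objects, predicates denote sets.
names = {"peter": "r1"}
predicates = {"Rabbit": {"r1", "r2"}, "Carrot": {"c1"}}

def is_true(pred, name, names, predicates):
    """An atomic sentence Pred(name) is true iff the name's referent
    belongs to the predicate's extension."""
    return names[name] in predicates[pred]

# A permutation of the domain: swap the rabbit r1 with the carrot c1.
perm = {"r1": "c1", "c1": "r1", "r2": "r2"}

# The permuted interpretation: push every name and extension through perm.
names_p = {n: perm[o] for n, o in names.items()}
predicates_p = {p: {perm[o] for o in ext} for p, ext in predicates.items()}

# Every atomic sentence keeps its truth-value, although "peter" now
# denotes a carrot and "Rabbit" picks out a carrot and a rabbit.
for pred in predicates:
    for name in names:
        assert (is_true(pred, name, names, predicates)
                == is_true(pred, name, names_p, predicates_p))
print("All truth-values are preserved under the permuted interpretation.")
```

Since the permuted interpretation preserves every truth-value, nothing in the sentences themselves singles out the standard assignment of reference; this is, in miniature, the predicament Putnam describes.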
This argument was taken seriously by Lewis because he was not only an advocate of realism but also a supporter of “global descriptivism”, a view which Putnam’s argument undermines. For Lewis, we should interpret terms as referring to those things which make our theory come out true. If we attribute to the speaker the belief that Aristotle was a great Greek philosopher and if we concede, as Lewis does, that the content of this belief is expressible in the sentence “Aristotle was a great Greek philosopher”, then we should interpret “Aristotle” to refer to those things which make our theory containing it turn out true. For this to happen, it seems that “Aristotle” is to be interpreted as referring to a specific person, Aristotle. The sort of “causal descriptivism” which Lewis worked with implies, first, that there is a causal relationship between terms and their referents and, second, that terms (like “Aristotle”) are so connected with their referents (such as Aristotle) by means of a certain sort of description, or a cluster of them, which we associate with the terms and which specifies how the terms and their referents are so causally linked. In this sense, “Aristotle” refers to Aristotle because it is Aristotle that satisfies, for instance, the definite description “Plato’s pupil and the teacher of Alexander the Great.” Global descriptivism states that the reference of (almost) all the terms of our language is determined in this way.
Putnam’s argument undermines this view because if the reference of our terms is indeterminate, the problem cannot be dealt with simply by introducing or associating further descriptions with those terms; no matter what such descriptions and constraints are, they would be nothing but a number of other words, whose reference is again indeterminate. Such words are to be interpreted; otherwise, they would be useless since they would be nothing but uninterpreted, meaningless symbols. But if they are to be interpreted, they can be interpreted in many different ways. Therefore, the problem, as Lewis puts it, is that there can be many different (non-standard) interpretations which can make our epistemically ideal theory turn out true: “any world can satisfy any theory…, and can do so in countless very different ways” (Lewis 1983, 370). New constraints on our theory just lead to “more theory”, which would be susceptible to the same sort of problem. After all, as Putnam stated, we interpret our language; so, under different interpretations, or models, our terms, whatever they are, can refer to different things, even to anything we intend. It would not really matter how the world is or what the theory says.
In order to solve this problem, Lewis introduces “reference magnetism” or “inegalitarianism”. According to this solution, the reference of a term is, as before, what causes the speaker to use that term and, more importantly, among the rival interpretations of that term, the eligible interpretation is the one which picks out the most natural referent for the term. Lewis’s view of natural properties and naturalness is complex. According to Lewis, nature has joints at which the world can be carved up, and the properties instantiated at these joints are “perfectly natural” properties (Lewis 1984). For Lewis, however, it is “a primitive fact that some classes of things are perfectly natural properties” (Lewis 1983, 347). It is helpful to use an example originally given by Nelson Goodman. Suppose that the application of the terms “green” and “blue” is governed by the following rules:
Rule1: “Green” always applies to green things. In this case, “green” means green.
Rule2: “Blue” always applies to blue things. In this case, “blue” means blue.
Suppose that a skeptic claims that the rules governing the application of “green” and “blue” are not those mentioned above. They are rather the following:
Rule1*: “Green” applies to green things up to a specific time t and to blue things after t. In this case, “green” does not mean green, but means grue.
Rule2*: “Blue” applies to blue things up to a specific time t and to green things after t. In this case, “blue” does not mean blue, but means bleen.
If the speaker has been following Rule1*, rather than Rule1, the application of “green” to an emerald at a time after t would be incorrect. Obviously, there can be an infinite number of such alternative rules, and as t can be any given time, no behavioral evidence can help to decide whether by “green” the speaker really meant green or grue.
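A minimal sketch can make the predicament concrete. In the following illustration, the encoding of colours and times, and the particular value of the critical time t, are assumptions of the sketch rather than part of Goodman’s or Lewis’s own presentation. The two rules agree on every observation made before t and diverge only afterwards:

```python
# Before the critical time t, the "green" rule and the "grue" rule
# classify every observation identically, so no evidence gathered
# before t can decide between them.

T = 100  # the critical time t (an arbitrary choice for this sketch)

def green_rule(color, time):
    """Rule1: "green" applies to green things at all times."""
    return color == "green"

def grue_rule(color, time):
    """Rule1*: "green" applies to green things up to t, blue things after."""
    return color == "green" if time <= T else color == "blue"

# Both rules agree on every observation made before t...
observations = [(c, time) for time in range(T) for c in ("green", "blue")]
assert all(green_rule(c, t) == grue_rule(c, t) for c, t in observations)

# ...and diverge only afterwards, for example on a green emerald after t.
print(green_rule("green", T + 1))  # True
print(grue_rule("green", T + 1))   # False
```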
Lewis’s reference magnetism implies that the correct interpretation of the speaker’s utterance of “green” chooses the most natural property as the eligible referent of the term, that is, the property of being green rather than being grue. As he puts it, “grue things (or worse) are not all of a kind in the same way that … bits of gold…are all of a kind” (Lewis 1984, 228-229). Similarly, the most eligible referent for “rabbit” is that of being a rabbit, rather than an undetached rabbit-part. In this way, the sort of radical indeterminacy which Putnam argued for would be blocked. Lewis, therefore, thinks that we can eventually avoid the threatening radical indeterminacy.
6. Dennett’s Intentional Stance
Dennett’s position, unlike Lewis’s, does not claim that indeterminacy can be so controlled. Dennett follows Quine and Davidson in viewing the third-person point of view as our best and only viable view for investigating the behavior of objects, systems, or organisms. His concern is to find an answer to the question whether we can attribute propositional attitudes to certain objects or systems, that is, whether we can interpret their behavior as intentional and hence treat them as “true believers”.
Dennett famously distinguishes among three sorts of such a third-person standpoint: the “physical stance”, the “design stance” and the “intentional stance” (Dennett 2009). Which stance works best depends on how functional it is for our purposes, that is, whether it offers a useful interpretation of the system’s behavior. Such an interpretation must enable us to explain and predict the system’s future behavior in the most practical way. The physical stance or “strategy” is the one which we usually work with in our study of the behavior of objects like planets or a pot on the burner. This stance is the method which is often employed in natural sciences. Scientists use their knowledge of the laws of nature and the physical constitutions of objects to make predictions about the objects’ behavior. This stance seems to be our only option to scrutinize the behavior of things which are neither alive nor artifacts.
Sometimes, however, the physical stance may not be the best strategy for interpreting the object’s behavior. Rather, adopting the design stance would be far more helpful. By choosing the design stance, we add the assumption that the object or the system is designed in a certain way to accomplish a specific goal. This is the stance which we use in our explanation and prediction of the behavior of, for example, a heat-seeking missile or an alarm clock. Although we may not have enough detailed information about the physical constituents of such objects and how they work, the design stance enables us to successfully predict their behavior. What about the case of the objects which manifest far more complex behavior, such as humans or an advanced chess-playing computer?
In such cases, the design stance would not be as effective as it was in the simpler cases mentioned above. At this point, the intentional stance is available. By adopting the intentional stance, we presume that the object or the system is a rational agent, that is, an agent assumed to possess propositional attitudes such as beliefs and desires. Having granted that, we decide what sort of attitudes we ought to attribute to the object, on the basis of which we can interpret its behavior as an intentional action. Considering the system’s complex interactions with the environment, we attribute specific beliefs and desires to it. Attributing the right sort of beliefs and desires to the system, in turn, enables us to predict how, on the basis of having those attitudes, it will act and, more importantly, how it ought to act: we offer an “intentional interpretation” of the system’s behavior.
If we wanted, we could use the intentional stance in our study of the behavior of the missile or the alarm clock; but we do not need it: the design strategy works just as well in predicting their future behavior. In order to understand the behavior of such objects, adopting the intentional stance is simply unnecessary. Moreover, we do not usually want to count these things as believers, that is, as possessors of a complex set of interrelated propositional attitudes. Therefore, we can define an “intentional system” as any system whose behavior can be usefully predicted by adopting the intentional stance. We treat such things as if they were rational agents which ought to possess certain sorts of beliefs, desires, intentions, goals, and purposes, in the light of their needs and complex capacities to perceive the world (Dennett 1987; Dennett 2009).
a. Indeterminacy and the Intentional Stance
The intentional interpretation of a system naturally allows for the emergence of the indeterminacy of interpretation. Recall that Dennett’s concern was to find out how practical and useful the offered interpretation is for the purpose of predicting the system’s behavior. In this case, we cannot expect to come up with one unique intentional interpretation of the system’s behavior which works so perfectly that it leaves no room for the existence of any other useful intentional interpretations. There can always be alternative interpretations which work just as well in predicting the system’s behavior. Two equally predictive interpretations may attribute different sets of attitudes to an intentional system. For Dennett, whenever there are two competing intentional interpretations of a system which work well in predicting the system’s behavior, neither can be said to have any clear advantage over the other, because in order to make such a choice we have no further, especially no objective, criterion to rely on. As he clarifies, “we shouldn’t make the mistake of insisting that there has to be a fact of the matter about which interpretation is ‘the truth about’ the topic. Sometimes, in the end, one interpretation is revealed to be substantially better, all things considered. But don’t count on it” (Dennett 2018, 59).
To be a believer is to exhibit the sort of behavior which is predictable by adopting the intentional stance. Nothing rules out rival intentional-stance interpretations of the same pattern of behavior. It is important to note that, for Dennett, the fact that such rival interpretations exist does not imply that these patterns are unreal. They are real patterns of observable behavior. The point is that our interpretation of them, and the beliefs and desires that we attribute to the system, depend on the sort of stance we choose to employ (Dennett 1991). There is no deeper fact than the fact that we choose to look at a system from a specific point of view and that we do so with the aim of making the best workable prediction of its behavior. In order to decide between rival interpretations, all of which are compatible with the evidence about the system’s behavior, we have no objective criterion to rely on, for the facts about the system’s behavior are all the facts there are. As a result, the indeterminacy of interpretation emerges.
Dennett states that the intentional stance with its rationality constraints is there to “explain why in principle…there is indeterminacy of radical translation/interpretation: there can always be a tie for first between two competing assignments of meaning to the behavior … of an agent, and no other evidence counts” (Dennett 2018, 58). When you intend to organize the behavior of a system, you can organize it in different ways; and whether such a system can be viewed as an intentional system would depend on whether its behavior can be usefully predicted from the point of view of the particular interpretive intentional stance which you adopt.
7. References and Further Reading
Blackburn, Simon. 1984. Spreading the Word. Oxford: Oxford University Press.
Davidson, Donald. 1963. “Actions, Reasons, and Causes.” The Journal of Philosophy 60 (23): 685-700.
Davidson, Donald. 1982. “Rational Animals.” Dialectica 36 (4): 317-327.
Davidson, Donald. 1991. “Three Varieties of Knowledge.” In A. J. Ayer: Memorial Essays, edited by A. P. Griffiths, 153–66. New York: Cambridge University Press.
Davidson, Donald. 1993. “Method and Metaphysics.” Deucalion 11: 239–48.
Davidson, Donald. 1997. “Indeterminism and Antirealism.” In Realism/Antirealism and Epistemology, edited by C. B. Kulp, 109–22. Lanham, Md.: Rowman and Littlefield.
Davidson, Donald. 1999a. “The Emergence of Thought.” Erkenntnis 51 (1): 7-17.
Davidson, Donald. 1999b. “Reply to Richard Rorty.” In The Philosophy of Donald Davidson, edited by L. E. Hahn, 595-600. Chicago and La Salle, Ill.: Open Court.
Dennett, Daniel. 1971. “Intentional Systems.” The Journal of Philosophy 68 (4): 87-106.
Dennett, Daniel. 1987. The Intentional Stance. Cambridge, Mass.: MIT Press.
Dennett, Daniel. 1991. “Real Patterns.” The Journal of Philosophy 88 (1): 27-51.
Dennett, Daniel. 2009. “Intentional Systems Theory.” In The Oxford Handbook of Philosophy of Mind, edited by Brian P. McLaughlin, Ansgar Beckermann, and Sven Walter, 340-350. Oxford: Oxford University Press.
Dennett, Daniel. 2018. “Reflections on Tadeusz Zawidzki.” In The Philosophy of Daniel Dennett, edited by Bryce Huebner, 57-61. Oxford: Oxford University Press.
Evans, Gareth. 1975. “Identity and Predication.” The Journal of Philosophy 72 (13): 343-362.
Fodor, Jerry. 1993. The Elm and the Expert: Mentalese and Its Semantics. Cambridge, Mass.: Bradford.
Glock, Hans-Johann. 2003. Quine and Davidson on Language, Thought, and Reality. Cambridge: Cambridge University Press.
Lewis, David. 1974. “Radical Interpretation.” Synthese 27 (3/4): 331-344.
Lewis, David. 1975. “Languages and Language.” In Minnesota Studies in the Philosophy of Science, Volume VII, edited by Keith Gunderson, 3–35. Minneapolis: University of Minnesota Press.
Lewis, David. 1983. “New Work for a Theory of Universals.” Australasian Journal of Philosophy 61 (4): 343-377.
Lewis, David. 1984. “Putnam’s Paradox.” Australasian Journal of Philosophy 62 (3): 221-236.
Putnam, Hilary. 1977. “Realism and Reason.” Proceedings and Addresses of the American Philosophical Association 50 (6): 483-498.
Putnam, Hilary. 1980. “Models and Reality.” The Journal of Symbolic Logic 45 (3): 464-482.
Quine, W. V. 1951. “Two Dogmas of Empiricism.” The Philosophical Review 60 (1): 20-43.
Quine, W. V. 1960. Word and Object. Cambridge, Mass.: MIT Press.
Quine, W. V. 1968. “Reply to Chomsky.” Synthese 19 (1/2): 274-283.
Quine, W. V. 1969a. Ontological Relativity and Other Essays. New York: Columbia University Press.
Quine, W. V. 1969b. “Epistemology Naturalized.” In Ontological Relativity and Other Essays, by W. V. Quine, 69–90. New York: Columbia University Press.
Quine, W. V. 1970. “On the Reasons for Indeterminacy of Translation.” The Journal of Philosophy 67 (6): 178-183.
Quine, W. V. 1973. The Roots of Reference. La Salle, Ill.: Open Court.
Quine, W. V. 1981. Theories and Things. Cambridge, Mass.: Harvard University Press.
Quine, W. V. 1987. “Indeterminacy of Translation Again.” The Journal of Philosophy 84 (1): 5-10.
Quine, W. V. 1990a. Pursuit of Truth. Cambridge, Mass.: Harvard University Press.
Quine, W. V. 1990b. “Three Indeterminacies.” In Perspectives on Quine, edited by R. B. Barrett and R. F. Gibson, 1-16. Cambridge, Mass.: Basil Blackwell.
Quine, W. V. 1990c. “Comment on Bergström.” In Perspectives on Quine, edited by R. B. Barrett and R. F. Gibson, 53-54. Cambridge, Mass.: Basil Blackwell.
Quine, W. V. 1995. From Stimulus to Science. Cambridge, Mass.: Harvard University Press.
Quine, W. V. and J. S. Ullian. 1978. The Web of Belief. New York: McGraw-Hill.
Richard, Mark. 1997. “Inscrutability.” Canadian Journal of Philosophy 27: 165-209.
Searle, John. 1987. “Indeterminacy, Empiricism, and the First Person.” The Journal of Philosophy 84 (3): 123-146.
Wilson, Neil L. 1959. “Substances without Substrata.” The Review of Metaphysics 12 (4): 521-539.
Author Information
Ali Hossein Khani
Email: hosseinkhani@ipm.ir
Institute for Research in Fundamental Sciences, and
Iranian Institute of Philosophy
Iran
Ethics of Artificial Intelligence
This article provides a comprehensive overview of the main ethical issues related to the impact of Artificial Intelligence (AI) on human society. AI is the use of machines to do things that would normally require human intelligence. In many areas of human life, AI has rapidly and significantly affected human society and the ways we interact with each other. It will continue to do so. Along the way, AI has presented substantial ethical and socio-political challenges that call for a thorough philosophical and ethical analysis. Its social impact should be studied so as to avoid any negative repercussions. AI systems are becoming more and more autonomous, apparently rational, and intelligent. This comprehensive development gives rise to numerous issues. In addition to the potential harm and impact of AI technologies on our privacy, other concerns include their moral and legal status (including moral and legal rights), their possible moral agency and patienthood, and issues related to their possible personhood and even dignity. It is common, however, to distinguish the following issues as of utmost significance with respect to AI and its relation to human society, according to three different time periods: (1) short-term (early 21st century): autonomous systems (transportation, weapons), machine bias in law, privacy and surveillance, the black box problem and AI decision-making; (2) mid-term (from the 2040s to the end of the century): AI governance, confirming the moral and legal status of intelligent machines (artificial moral agents), human-machine interaction, mass automation; (3) long-term (starting with the 2100s): technological singularity, mass unemployment, space colonisation.
1. Introduction
This section discusses why AI is of utmost importance for our systems of ethics and morality, given the increasing human-machine interaction.
a. What is AI?
AI may mean several different things and it is defined in many different ways. When Alan Turing introduced the so-called Turing test (which he called an ‘imitation game’) in his famous 1950 essay on whether machines can think, the term ‘artificial intelligence’ had not yet been coined. Turing suggested that it would be clearer to replace the question of whether machines can think with the question of whether it might be possible to build machines that could imitate humans so convincingly that people would find it difficult to tell whether, for example, a written message comes from a computer or from a human (Turing 1950).
The term ‘AI’ was coined in 1955 by a group of researchers—John McCarthy, Marvin L. Minsky, Nathaniel Rochester and Claude E. Shannon—who organised a famous two-month summer workshop at Dartmouth College on the ‘Study of Artificial Intelligence’ in 1956. This event is widely recognised as the very beginning of the study of AI. The organisers described the workshop as follows:
We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer. (Proposal 1955: 2)
Another, later scholarly definition describes AI as:
the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. The term is frequently applied to the project of developing systems endowed with the intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience. (Copeland 2020)
In the early twenty-first century, the ultimate goal of many computer specialists and engineers has been to build a robust AI system which would not differ from human intelligence in any aspect other than its machine origin. Whether this is at all possible has been a matter of lively debate for several decades. The prominent American philosopher John Searle (1980) introduced the so-called Chinese room argument to contend that strong or general AI (AGI)—that is, building AI systems which could deal with many different and complex tasks that require human-like intelligence—is in principle impossible. In doing so, he sparked a long-standing general debate on the possibility of AGI. Current AI systems are narrowly focused (that is, weak AI) and can only solve one particular task, such as playing chess or the Chinese game of Go. Searle’s general thesis was that no matter how complex and sophisticated a machine is, it will nonetheless have no ‘consciousness’ or ‘mind’, which is a prerequisite for the ability to understand, in contrast to the capability to compute (see section 2.e.).
Searle’s argument has been critically evaluated against the counterclaims of functionalism and computationalism. It is generally argued that intelligence does not require a particular substratum, such as carbon-based beings, but that it will also evolve in silicon-based environments, if the system is complex enough (for example, Chalmers 1996, chapter 9).
In the early years of the twenty-first century, many researchers working on AI development associated AI primarily with different forms of so-called machine learning—that is, technologies that identify patterns in data. Simpler forms of such systems are said to engage in ‘supervised learning’—which nonetheless still requires considerable human input and supervision—but the aim of many researchers, perhaps most prominently Yann LeCun, has been to develop so-called self-supervised learning systems. Some researchers have even begun to discuss AI in a way that seems to equate the concept with machine learning. This article, however, uses the term ‘AI’ in a wider sense that includes—but is not limited to—machine learning technologies.
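To make the notion of pattern-finding concrete, consider a deliberately minimal sketch of supervised learning. The toy data and threshold rule below are invented for illustration and stand in for far more complex real systems:

```python
# From a handful of labeled examples, the "learner" picks the decision
# threshold that best separates the two classes; it then applies the
# learned pattern to unseen input. Data and task are hypothetical.

data = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]  # (feature, label) pairs

def train_threshold(data):
    """Return the threshold with the fewest misclassifications."""
    def errors(t):
        return sum((x > t) != bool(y) for x, y in data)
    return min((x for x, _ in data), key=errors)

threshold = train_threshold(data)
print(threshold)        # 2.0: items above it are predicted to be class 1
print(3.5 > threshold)  # True: the learned pattern applies to unseen input
```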
b. Its Ethical Relevance
The major ethical challenges AI poses for human societies are presented well in the excellent introductions by Vincent Müller (2020), Mark Coeckelbergh (2020), Janina Loh (2019), Catrin Misselhorn (2018) and David Gunkel (2012). Regardless of the possibility of constructing AGI, autonomous AI systems already raise substantial ethical issues: for example, machine bias in law, hiring decisions made by means of smart algorithms, racist and sexist chatbots, or non-gender-neutral language translations (see section 2.c.). The very idea of a machine ‘imitating’ human intelligence—which is one common definition of AI—gives rise to worries about deception, especially if the AI is built into robots designed to look or act like human beings (Boden et al. 2017; Nyholm and Frank 2019). Moreover, Rosalind Picard rightly claims that ‘the greater the freedom of a machine, the more it will need moral standards’ (1997: 19). This substantiates the claim that all interactions between AI systems and human beings necessarily entail an ethical dimension, for example, in the context of autonomous transportation (see section 2.d.).
The idea of implementing ethics within a machine is one of the main research goals in the field of machine ethics (for example, Lin et al. 2012; Anderson and Anderson 2011; Wallach and Allen 2009). More and more responsibility has been shifted from human beings to autonomous AI systems which are able to work much faster than human beings without taking any breaks and with no need for constant supervision, as illustrated by the excellent performance of many systems (once they have successfully passed the debugging phase).
It has been suggested that humanity’s future existence may depend on the implementation of solid moral standards in AI systems, given the possibility that these systems may, at some point, either match or supersede human capabilities (see section 2.g.). This point in time was called ‘technological singularity’ by Vernor Vinge in 1983 (see also: Vinge 1993; Kurzweil 2005; Chalmers 2010). The famous playwright Karel Čapek (1920), the renowned astrophysicist Stephen Hawking and the influential philosopher Nick Bostrom (2016, 2018) have all warned about the possible dangers of technological singularity should intelligent machines turn against their creators, that is, human beings. Therefore, according to Nick Bostrom, it is of utmost importance to build friendly AI (see the alignment problem, discussed in section 2.g.).
In conclusion, the implementation of ethics is crucial for AI systems for multiple reasons: to provide safety guidelines that can prevent existential risks for humanity, to solve any issues related to bias, to build friendly AI systems that will adopt our ethical standards, and to help humanity flourish.
2. Main Debates
The following debates are of utmost significance in the context of AI and ethics. They are not the only important debates in the field, but they provide a good overview of topics that will likely remain of great importance for many decades (for a similar list, see Müller 2020).
a. Machine Ethics
Susan Anderson, a pioneer of machine ethics, defines the goal of machine ethics as:
to create a machine that follows an ideal ethical principle or set of principles in guiding its behaviour; in other words, it is guided by this principle, or these principles, in the decisions it makes about possible courses of action it could take. We can say, more simply, that this involves “adding an ethical dimension” to the machine. (2011: 22)
In addition, the study of machine ethics examines issues regarding the moral status of intelligent machines and asks whether they should be entitled to moral and legal rights (Gordon 2020a, 2020b; Richardson 2019; Gunkel and Bryson 2014; Gunkel 2012; Anderson and Anderson 2011; Wallach and Allen 2010). In general, machine ethics is an interdisciplinary sub-discipline of the ethics of technology, which is in turn a discipline within applied ethics. The ethics of technology also contains the sub-disciplines of robot ethics (see, for example, Lin et al. 2011, 2017; Gunkel 2018; Nyholm 2020), which is concerned with questions of how human beings design, construct and use robots; and computer ethics (for example, Johnson 1985/2009; Johnson and Nissenbaum 1995; Himma and Tavani 2008), which is concerned with commercial behaviour involving computers and information (for example, data security, privacy issues).
The first ethical code for AI systems was introduced by the famed science fiction writer Isaac Asimov, who presented his Three Laws of Robotics in Runaround (Asimov 1942). These three were later supplemented by a fourth law, called the Zeroth Law of Robotics, in Robots and Empire (Asimov 1986). The four laws are as follows:
A robot may not injure a human being or, through inaction, allow a human being to be harmed;
A robot must obey the orders given it by human beings except where such orders would conflict with the first law;
A robot must protect its own existence as long as such protection does not conflict with the first or second law;
A robot may not harm humanity or, by inaction, allow humanity to suffer harm.
Asimov’s four laws have played a major role in machine ethics for many decades and have been widely discussed by experts. The standard view regarding the four laws is that they are important but insufficient to deal with all the complexities related to moral machines. This seems to be a fair evaluation, since Asimov never claimed that his laws could cope with all issues. If that were really the case, then Asimov would perhaps not have written his fascinating stories about problems caused partly by the four laws.
The early years of the twenty-first century saw the proposal of numerous approaches to implementing ethics within machines, to provide AI systems with ethical principles that the machines could use in making moral decisions (Gordon 2020a). We can distinguish at least three types of approaches: bottom-up, top-down, and mixed. An example of each type is provided below (see also Gordon 2020a: 147).
i. Bottom-up Approaches: Casuistry
Guarini’s (2006) system is an example of a bottom-up approach. It uses a neural network which bases its ethical decisions on a learning process in which the neural network is presented with known correct answers to ethical dilemmas. After the initial learning process, the system is supposed to be able to solve new ethical dilemmas on its own. However, Guarini’s system generates problems concerning the reclassification of cases, caused by the lack of adequate reflection and exact representation of the situation. Guarini himself admits that casuistry alone is insufficient for machine ethics.
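The bottom-up idea itself can be illustrated with a deliberately simple sketch. Guarini’s actual system used a trained neural network; the stand-in below instead uses a nearest-neighbour rule, extrapolating a verdict for a new case from the most similar known case, and its encoding of cases as feature vectors (harm caused, consent given, lives saved) is purely hypothetical:

```python
# A toy sketch of bottom-up casuistry: verdicts on new cases are
# extrapolated from verdicts on known cases. The case library and its
# feature encoding are invented for illustration.

cases = [
    # ((harm, consent, lives_saved), permissible?)
    ((0.9, 0.0, 0.0), False),  # grave harm, no consent, no one saved
    ((0.1, 1.0, 0.0), True),   # minor harm, full consent
    ((0.5, 0.0, 5.0), True),   # moderate harm, five lives saved
]

def classify(case):
    """Return the verdict of the most similar known case."""
    def distance(known):
        return sum((a - b) ** 2 for a, b in zip(case, known[0]))
    return min(cases, key=distance)[1]

print(classify((0.8, 0.0, 0.1)))  # False: it most resembles the first case
```

The reclassification worry mentioned above is visible even here: everything depends on how cases are represented as features, and nothing in the learned verdicts reflects on whether the representation is adequate.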
ii. Top-down Approaches: The MoralDM Approach
The system conceived by Dehghani et al. (2011) combines two main ethical theories, utilitarianism and deontology, along with analogical reasoning. Utilitarian reasoning applies unless ‘sacred values’ are involved, at which point the system operates in a deontological mode and becomes less sensitive to the utility of actions and consequences. To align the system with human moral decisions, Dehghani et al. evaluate it against psychological studies of how the majority of human beings decide particular cases.
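The division of labour just described can be sketched schematically as follows. The actions, utilities, and sacred-value flags below are hypothetical placeholders and do not reproduce the actual MoralDM implementation:

```python
# Top-down idea in miniature: a deontological filter first removes any
# action that violates a "sacred value"; utilitarian calculation then
# selects among the remaining options. All inputs are hypothetical.

def choose(actions):
    """actions: list of (name, utility, violates_sacred_value) triples."""
    permitted = [a for a in actions if not a[2]]   # deontological mode
    if not permitted:
        return None  # refuse: every option violates a sacred value
    return max(permitted, key=lambda a: a[1])[0]   # utilitarian mode

options = [("break a promise", 5, True), ("keep the promise", 2, False)]
print(choose(options))  # "keep the promise": higher utility is overridden
```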
The MoralDM approach is particularly successful in that it pays proper respect to the two main ethical theories (deontology and utilitarianism) and combines them in a fruitful and promising way. However, their additional strategy of using empirical studies to mirror human moral decisions by considering as correct only those decisions that align with the majority view is misleading and seriously flawed. Rather, their system should be seen as a model of a descriptive study of ethical behaviour but not a model for normative ethics.
iii. Mixed Approaches: The Hybrid Approach
The hybrid model of human cognition (Wallach et al. 2010; Wallach and Allen 2010) combines a top-down component (theory-driven reasoning) and a bottom-up component (shaped by evolution and learning) that are considered the basis of both moral reasoning and decision-making. The result thus far is LIDA, an AGI software system offering a comprehensive conceptual and computational model that covers a large portion of human cognition. The hybrid model of moral reasoning attempts to re-create human decision-making by appealing to a complex combination of top-down and bottom-up approaches, leading eventually to a descriptive but not a normative model of ethics. In addition, its somewhat idiosyncratic understanding of both approaches from moral philosophy does not in fact match how moral philosophers understand and use them in normative ethics. The model presented by Wallach et al. is not necessarily inaccurate with respect to how moral decision-making works in an empirical sense, but their approach is descriptive rather than normative in nature. Therefore, their empirical model does not solve the normative problem of how moral machines should act. Descriptive ethics and normative ethics are two different things. The former tells us how human beings make moral decisions; the latter is concerned with how we should act.
b. Autonomous Systems
The proposals for a system of machine ethics discussed in section 2.a. are increasingly being discussed in relation to autonomous systems the operation of which poses a risk of harm to human life. The two most-often discussed examples—which are at times discussed together and contrasted and compared with each other—are autonomous vehicles (also known as self-driving cars) and autonomous weapons systems (sometimes dubbed ‘killer robots’) (Purves et al. 2015; Danaher 2016; Nyholm 2018a).
Some authors think that autonomous weapons might be a good replacement for human soldiers (Müller and Simpson 2014). For example, Arkin (2009, 2010) argues that having machines fight our wars for us instead of human soldiers could lead to a decrease in war crimes if the machines were equipped with an ‘ethical governor’ system that would consistently follow the rules of war and engagement. However, others worry about the widespread availability of AI-driven autonomous weapons systems, because they think the availability of such systems might tempt people to go to war more often, or because they are sceptical about the possibility of an AI system that could interpret and apply the ethical and legal principles of war (see, for example, Royakkers and van Est 2015; Strawser 2010). There are also worries that ‘killer robots’ might be hacked (Klincewicz 2015).
Similarly, while acknowledging the possible benefits of self-driving cars—such as increased traffic safety, more efficient use of fuel and better-coordinated traffic—many authors have also noted the possible accidents that could occur (Goodall 2014; Lin 2015; Gurney 2016; Nyholm 2018b, 2018c; Keeling 2020). The underlying idea is that autonomous vehicles should be equipped with ‘ethics settings’ that would help to determine how they should react to accident scenarios where people’s lives and safety are at stake (Gogoll and Müller 2017). This is considered another real-life application of machine ethics that society urgently needs to grapple with.
The concern that self-driving cars may be involved in deadly accidents for which the AI system has not been adequately prepared has already been realised, tragically, as some people have died in such accidents (Nyholm 2018b). The first instance of death while riding in an autonomous vehicle—a Tesla Model S car in ‘autopilot’ mode—occurred in May 2016. The first pedestrian was hit and killed by an experimental self-driving car, operated by the ride-hailing company Uber, in March 2018. In the latter case, part of the problem was that the AI system in the car had difficulty classifying the object that suddenly appeared in its path. It initially classified the victim as ‘unknown’, then as a ‘vehicle’, and finally as a ‘bicycle’. Just moments before the crash, the system decided to apply the brakes, but by then it was too late (Keeling 2020: 146). Whether the AI system in a car functions properly can thus be a matter of life and death.
Philosophers discussing such cases may propose that, even when it cannot brake in time, the car might swerve to one side (for example, Goodall 2014; Lin 2015). But what if five people were on the only side of the road the car could swerve onto? Or what if five people appeared on the road and one person was on the curb where the car might swerve? These scenarios are similar to the much-discussed ‘trolley problem’: the choice would involve killing one person to save five, and the question would become under what sorts of circumstances that decision would or would not be permissible. Several papers have discussed relevant similarities and differences between the ethics of crashes involving self-driving cars, on the one hand, and the philosophy of the trolley problem, on the other (Lin 2015; Nyholm and Smids 2016; Goodall 2016; Himmelreich 2018; Keeling 2020; Kamm 2020).
One question that has occupied ethicists discussing autonomous systems is what ethical principles should govern their decision-making process in situations that might involve harm to human beings. A related issue is whether it is ever acceptable for autonomous machines to kill or harm human beings, particularly if they do so in a manner governed by certain principles that have been programmed into, or otherwise made part of, the machines. Here, a distinction is made between deaths caused by self-driving cars—which are generally considered a deeply regrettable but foreseeable side effect of their use—and killing by autonomous weapons systems, which some consider always morally unacceptable (Purves et al. 2015). A campaign to ‘stop killer robots’ has even been launched, backed by many AI ethicists such as Noel Sharkey and Peter Asaro.
One reason the campaign puts forward for banning autonomous weapons systems is that what its supporters call ‘meaningful human control’ must be retained. This concept is also discussed in relation to self-driving cars (Santoni de Sio and van den Hoven 2018). Many authors have worried about the risk of creating ‘responsibility gaps’, or cases in which it is unclear who should be held responsible for harm that has occurred due to the decisions made by an autonomous AI system (Matthias 2004; Sparrow 2007; Danaher 2016). The key challenge here is to come up with a way of understanding moral responsibility in the context of autonomous systems that would allow us to secure the benefits of such systems and at the same time appropriately attribute responsibility for any undesirable consequences. If a machine causes harm, the human beings involved in the machine’s action may try to evade responsibility; indeed, in some cases it might seem unfair to blame people for what a machine has done. Conversely, if an autonomous system produces a good outcome, it might be equally unclear which human beings, if any, deserve praise for it. In general, people may be more willing to take responsibility for good outcomes produced by autonomous systems than for bad ones. But in both situations, responsibility gaps can arise. Accordingly, philosophers need to formulate a theory of how to allocate responsibility for outcomes produced by functionally autonomous AI technologies, whether good or bad (Nyholm 2018a; Dignum 2019; Danaher 2019a; Tigard 2020a).
c. Machine Bias
Many people believe that the use of smart technologies would put an end to human bias because of the supposed ‘neutrality’ of machines. However, we have come to realise that machines may preserve and even reinforce human bias towards women, different ethnicities, the elderly, people with medical impairments, or other groups (Kraemer et al. 2011; Mittelstadt et al. 2016). As a consequence, one of the most urgent questions in the context of machine learning is how to avoid machine bias (Daniels et al. 2019). The idea of using AI systems to support human decision-making is, in general, an excellent objective in view of AI’s ‘increased efficiency, accuracy, scale and speed in making decisions and finding the best answers’ (World Economic Forum 2018: 6). However, machine bias can undermine this seemingly positive situation in various ways. Some striking cases of machine bias are as follows:
Gender bias in hiring (Dastin 2018);
Racial bias, in that certain racial groups are offered only particular types of jobs (Sweeney 2013);
Racial bias in decisions on the creditworthiness of loan applicants (Ludwig 2015);
Racial bias in decisions whether to release prisoners on parole (Angwin et al. 2016);
Racial bias in predicting criminal activities in urban areas (O’Neil 2016);
Sexual bias when identifying a person’s sexual orientation (Wang and Kosinski 2018);
Racial bias in facial recognition systems that prefer lighter skin colours (Buolamwini and Gebru 2018);
Racial and social bias in using the geographic location of a person’s residence as a proxy for ethnicity or socio-economic status (Veale and Binns 2017).
We can recognise at least three sources of machine bias: (1) data bias, (2) computational/algorithmic bias and (3) outcome bias (Springer et al. 2018: 451). First, a machine learning system that is trained on data containing implicit or explicit imbalances reinforces the distortion in the data with respect to any future decision-making, thereby making the bias systematic. Second, a programme may suffer from algorithmic bias due to the developer’s implicit or explicit biases. The design of a programme relies on the developer’s understanding of the normative and non-normative values of other people, including the users and stakeholders affected by it (Dobbe et al. 2018). Third, outcome bias can arise from the use of historical records, for example, to predict criminal activities in particular urban areas; the system may allocate more police to a certain area, resulting in an increase in reported cases that would previously have gone unnoticed. This logic would then appear to vindicate the AI system’s decision to allocate the police to this area, even though other urban areas may have similar or even higher numbers of crimes, more of which go unreported due to the lack of policing (O’Neil 2016).
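The first of these sources, data bias, can be illustrated with a toy sketch: a rule learned from historically biased hiring records simply reproduces the bias in its future decisions. The records and feature encoding below are invented for illustration:

```python
# A learner that predicts the majority outcome seen in biased records
# inherits the historical bias and makes it systematic.

from collections import defaultdict

past_hires = [
    # ((qualified, group), hired) -- group "B" was historically rejected
    # even when qualified, and the learner inherits exactly that pattern.
    ((1, "A"), 1), ((1, "A"), 1), ((0, "A"), 0),
    ((1, "B"), 0), ((1, "B"), 0), ((0, "B"), 0),
]

def learn_rule(records):
    """Predict the majority outcome observed for each feature pair."""
    outcomes = defaultdict(list)
    for features, hired in records:
        outcomes[features].append(hired)
    return {f: round(sum(o) / len(o)) for f, o in outcomes.items()}

rule = learn_rule(past_hires)
print(rule[(1, "A")])  # 1: qualified applicants from group A are hired
print(rule[(1, "B")])  # 0: equally qualified group B applicants are not
```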
Most AI researchers, programmers and developers as well as scholars working in the field of technology believe that we will never be able to design a fully unbiased system. Therefore, the focus is on reducing machine bias and minimising its detrimental effects on human beings. Nevertheless, various questions remain. What type of bias cannot be filtered out and when should we be satisfied with the remaining bias? What does it mean for a person in court to be subject not only to human bias but also to machine bias, with both forms of injustice potentially helping to determine the person’s sentence? Is one type of bias not enough? Should we not rather aim to eliminate human bias instead of introducing a new one?
d. The Problem of Opacity
AI systems are used to make many sorts of decisions that significantly impact people’s lives. AI can be used to make decisions about who gets a loan, who is admitted to a university, who gets an advertised job, who is likely to reoffend, and so on. Since these decisions have major impacts on people, we must be able to understand the underlying reasons for them. In other words, AI and its decision-making need to be explainable. In fact, many authors discussing the ethics of AI propose explainability (also referred to as explicability) as a basic ethical criterion, among others, for the acceptability of AI decision-making (Floridi et al. 2018). However, many decisions made by an autonomous AI system are not readily explainable to people. This came to be called the problem of opacity.
The opacity of AI decision-making can be of different kinds, depending on relevant factors. Some AI decisions are opaque to those who are affected by them because the algorithms behind the decisions, though quite easy to understand, are protected trade secrets which the companies using them do not want to share with anyone outside the company. Another reason for AI opacity is that most people lack the technical expertise to understand how an AI-based system works, even if there is nothing intrinsically opaque about the technology in question. With some forms of AI, not even the experts can understand the decision-making processes used. This has been dubbed the ‘black box’ problem (Wachter, Mittelstadt and Russell 2018).
On the individual level, it can seem to be an affront to a person’s dignity and autonomy when decisions about important aspects of their lives are made by machines if it is unclear—or perhaps even impossible to know—why machines made these decisions. On the societal level, the increasing prominence of algorithmic decision-making could become a threat to our democratic processes. Henry Kissinger, the former U.S. Secretary of State, once stated, ‘We may have created a dominating technology in search of a guiding philosophy’ (Kissinger 2018; quoted in Müller 2020). John Danaher, commenting on this idea, worries that people might be led to act in superstitious and irrational ways, like those in earlier times who believed that they could affect natural phenomena through rain dances or similar behaviour. Danaher has called this situation ‘the threat of algocracy’—that is, of rule by algorithms that we do not understand but have to obey (Danaher 2016b, 2019b).
But is AI opacity always, and necessarily, a problem? Is it equally problematic across all contexts? Should there be an absolute requirement that AI must in all cases be explainable? Scott Robbins (2019) has provided some noteworthy arguments against this idea. Robbins argues, among other things, that a hard requirement for explicability could prevent us from reaping all the possible benefits of AI. For example, he points out that if an AI system could reliably detect or predict some form of cancer in a way that we cannot explain or understand, the value of knowing that information would outweigh any concerns about not knowing how the AI system reached its conclusion. More generally, it is possible to distinguish between contexts where the procedure behind a decision matters in itself and those where only the quality of the outcome matters (Danaher and Robbins 2020).
Another promising response to the problem of opacity is to construct alternative modes of explaining AI decisions that take their opacity into account but nevertheless offer some form of explanation that people can act on. Sandra Wachter, Brent Mittelstadt, and Chris Russell (2018) have developed the idea of a ‘counterfactual explanation’ of such decisions, designed to offer practical guidance to people wishing to respond rationally to AI decisions they do not understand. They state that ‘counterfactual explanations do not attempt to clarify how [AI] decisions are made internally. Instead, they provide insight into which external facts could be different in order to arrive at a desired outcome’ (Wachter et al. 2018: 880). Such an external, counterfactual way of explaining AI decisions might be a promising alternative in cases where AI decision-making is highly valuable but functions according to an internal logic that is opaque to most or all people.
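The flavour of a counterfactual explanation can be conveyed with a toy sketch. The scoring rule, feature names and thresholds below are invented for illustration, and the brute-force search is not the method of Wachter et al.; the point is only to show what such an explanation delivers, namely the smallest change to an applicant’s situation that would have flipped the decision.

```python
import itertools

# Hypothetical loan rule (invented): approve if income minus twice the debt,
# both in thousands, reaches 50.
def approve(income_k, debt_k):
    return income_k - 2 * debt_k >= 50

def counterfactual(income_k, debt_k, max_delta=50):
    """Smallest change to the two features (by L1 distance) that flips a rejection."""
    best = None
    for di, dd in itertools.product(range(-max_delta, max_delta + 1), repeat=2):
        if approve(income_k + di, debt_k + dd):
            cost = abs(di) + abs(dd)
            if best is None or cost < best[0]:
                best = (cost, di, dd)
    return best

print(approve(60, 10))         # False: 60 - 2*10 = 40 falls short of 50
print(counterfactual(60, 10))  # (5, 0, -5): 5k less debt would flip the decision
```

The output corresponds to an explanation of the form ‘had your debt been 5,000 lower, the loan would have been approved’, which tells the applicant which external facts would need to differ without revealing anything about the model’s internals.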
e. Machine Consciousness
Some researchers think that as machines become more and more sophisticated and intelligent, they might at some point spontaneously become conscious as well (compare Russell 2019). This would be a puzzling, and from an ethical standpoint potentially highly significant, side effect of the development of advanced AI. Other people are intentionally seeking to create machines with artificial consciousness. Kunihiro Asada, a successful engineer, set himself the goal of creating a robot that can experience pleasure and pain, on the grounds that such a robot could engage in the kind of pre-linguistic learning that a human baby is capable of before it acquires language (Marchese 2020). Another example is Sophia the robot, whose developers at Hanson Robotics say that they wish to create a ‘super-intelligent benevolent being’ that will eventually become a ‘conscious, living machine’.
Others, such as Joanna Bryson, note that depending on how we define consciousness, some machines might already have some form of consciousness. Bryson argues that if we take consciousness to mean the presence of internal states and the ability to report on these states to other agents, then some machines might fulfil these criteria even now (Bryson 2012). In addition, Aïda Elamrani-Raoult and Roman Yampolskiy (2018) have identified as many as twenty-one different possible tests of machine consciousness.
Moreover, similar claims could be made about the issue of whether machines can have minds. If mind is defined, at least in part, in a functional way, as the internal processing of inputs from the external environment that generates seemingly intelligent responses to that environment, then machines could possess minds (Nyholm 2020: 145–46). Of course, even if machines can be said to have minds or consciousness in some sense, they would still not necessarily be anything like human minds. After all, the particular consciousness and subjectivity of any being will depend on what kinds of ‘hardware’ (such as brains, sense organs, and nervous systems) the being in question has (Nagel 1974).
Whether or not we think some AI machines are already conscious or that they could (either by accident or by design) become conscious, this issue is a key source of ethical controversy. Thomas Metzinger (2013), for example, argues that society should adopt, as a basic principle of AI ethics, a rule against creating machines that are capable of suffering. His argument is simple: suffering is bad, it is immoral to cause suffering, and therefore it would be immoral to create machines that suffer. Joanna Bryson contends similarly that although it is possible to create machines that would have a significant moral status, it is best to avoid doing so; in her view, we are morally obligated not to create machines to which we would have obligations (Bryson 2010, 2019). Again, this might all depend on what we understand by consciousness. Accordingly, Eric Schwitzgebel and Mara Garza (2015: 114–15) comment, ‘If society continues on the path towards developing more sophisticated artificial intelligence, developing a good theory of consciousness is a moral imperative’.
Another interesting perspective is provided by Nicholas Agar (2020), who suggests that if there are arguments both in favour of and against the possibility that certain advanced machines have minds and consciousness, we should err on the side of caution and proceed on the assumption that machines do have minds. On this basis, we should then avoid any actions that might conceivably cause them to suffer. In contrast, John Danaher (2020) argues that we can never be sure whether a machine has conscious experience, but that this uncertainty does not matter: if a machine behaves similarly to how conscious beings with moral status behave, this is sufficient moral reason, according to Danaher’s ‘ethical behaviourism’, to treat the machine with the same moral considerations with which we would treat a conscious being. Both positions contrast with the standard approach, which first considers whether machines actually have conscious minds and then asks how the answer should bear on the question of whether to grant machines moral status (see, for example, Schwitzgebel and Garza 2015; Mosakas 2020; Nyholm 2020: 115–16).
f. The Moral Status of Artificial Intelligent Machines
Traditionally, the concept of moral status has been of utmost importance in ethics and moral philosophy because entities that have a moral status are considered part of the moral community and are entitled to moral protection. Not all members of a moral community have the same moral status, and therefore they differ with respect to their claims to moral protection. For example, dogs and cats are part of our moral community, but they do not enjoy the same moral status as a typical adult human being. If a being has a moral status, then it has certain moral (and legal) rights as well. The twentieth century saw a growth in the recognition of the rights of ethnic minorities, women, and the LGBTQ+ community, and even the rights of animals and the environment. This expanding moral circle may eventually grow further to include artificial intelligent machines once they exist (as advocated by the robot rights movement).
The notion of personhood (whatever that may mean) has become relevant in determining whether an entity has full moral status and whether, depending on its moral status, it should enjoy the full set of moral rights. One prominent definition of moral status has been provided by Frances Kamm (2007: 229):
So, we see that within the class of entities that count in their own right, there are those entities that in their own right and for their own sake could give us reason to act. I think that it is this that people have in mind when they ordinarily attribute moral status to an entity. So, henceforth, I shall distinguish between an entity’s counting morally in its own right and its having moral status. I shall say that an entity has moral status when, in its own right and for its own sake, it can give us reason to do things such as not destroy it or help it.
Things can be done for X’s own sake, according to Kamm, if X is conscious or able to feel pain. This definition usually includes human beings and most animals, whereas non-living parts of nature are excluded mainly on the basis of their lack of consciousness and inability to feel pain. However, there are good reasons why we should broaden our moral reasoning and decision-making to encompass the environment as well (Stone 1972, 2010; Atapattu 2015). For example, the Grand Canyon could be taken into moral account in human decision-making, given its unique form and great aesthetic value, even though it is neither conscious nor able to feel pain and therefore lacks moral status on Kamm’s definition. Furthermore, some experts have treated sentient animals such as great apes and elephants as persons even though they are not human (for example, Singer 1975; Cavalieri 2001; Francione 2009).
In addition, we can raise the important question of whether (a) current robots used in social situations or (b) artificial intelligent machines, once they are created, might have a moral status and be entitled to moral rights as well, comparable to the moral status and rights of human beings. The following three main approaches provide a brief overview of the discussion.
i. The Autonomy Approach
Kant and his followers place great emphasis on the notion of autonomy in the context of moral status and rights. A moral person is defined as a rational and autonomous being. Against this background, it has been suggested that one might be able to ascribe personhood to artificial intelligent machines once they have reached a certain level of autonomy in making moral decisions. Current machines are becoming increasingly autonomous, so it seems only a matter of time until they meet this moral threshold. A Kantian line of argument in support of granting moral status to machines based on autonomy could be framed as follows:
Rational agents have the capability to decide whether to act (or not act) in accordance with the demands of morality.
The ability to make decisions and to determine what is good has absolute value.
The ability to make such decisions gives rational persons absolute value.
A rational agent can act autonomously, including acting with respect to moral principles.
Rational agents have dignity insofar as they act autonomously.
Acting autonomously makes persons morally responsible.
Such a being—that is, a rational agent—has moral personhood.
It might be objected that machines—no matter how autonomous and rational—are not human beings and therefore should not be entitled to a moral status and the accompanying rights under a Kantian line of reasoning. But this objection is misleading, since Kant himself clearly states in his Groundwork (2009) that human beings should be considered moral agents not because they are human beings, but because they are autonomous agents (Altman 2011; Timmermann 2020: 94). Kant has been criticised by his opponents for his logocentrism, yet this very feature of his view helps him avoid the more severe objection of speciesism, that is, of holding that a particular species is morally superior simply because of the empirical features of the species itself (in the case of human beings, their particular DNA). Speciesism has been widely viewed as the equivalent of racism at the species level (Singer 2009).
ii. The Indirect Duties Approach
The indirect duties approach is based on Kant’s analysis of our behaviour towards animals. In general, Kant argues in his Lectures on Ethics (1980: 239–41) that even though human beings do not have direct duties towards animals (because they are not persons), they still have indirect duties towards them. The underlying reason is that human beings may start to treat their fellow humans badly if they develop bad habits by mistreating and abusing animals as they see fit. In other words, abusing animals may have a detrimental, brutalising impact on human character.
Kate Darling (2016) has applied this Kantian line of reasoning to argue that even current social robots should be entitled to moral and legal protection. She argues that one should protect lifelike beings such as robots that interact with human beings when society cares deeply enough about them, even though they do not have a right to life. Darling offers two arguments for treating social robots in this way. Her first argument is that people who witness cases of abuse and mistreatment of robots might become ‘traumatized’ and ‘desensitized’. Second, she contends that abusing robots may have a detrimental impact on the abuser’s character, causing her to start treating fellow humans poorly as well.
Indeed, current social robots may be best protected by the indirect duties approach, but the idea that exactly the same arguments should apply to future robots of greater sophistication, which either match or supersede human capabilities, is troublesome. One would expect that such future robots—unlike Darling’s social robots of today—will be not mere moral patients but proper moral agents. In addition, the view that one should protect lifelike beings ‘when society cares deeply enough’ (2016: 230) about them opens the door to social exclusion based purely on people’s unwillingness to accept them as members of the moral community. Morally speaking, this is not acceptable. The next approach attempts to deal with this situation.
iii. The Relational Approach
Mark Coeckelbergh (2014) and David Gunkel (2012), the pioneers of the relational approach to moral status, believe that robots have a moral status based on their social relation with human beings. In other words, moral status or personhood emerges through social relations between different entities, such as human beings and robots, instead of depending on criteria inherent in the being such as sentience and consciousness. The general idea behind this approach comes to the fore in the following key passage (Coeckelbergh 2014: 69–70):
We may wonder if robots will remain “machines” or if they can become companions. Will people start saying, as they tend to say of people who have “met their dog” … , that someone has “met her robot”? Would such a person, having that kind of relation with that robot, still feel shame at all in front of the robot? And is there, at that point of personal engagement, still a need to talk about the “moral standing” of the robot? Is not moral quality already implied in the very relation that has emerged here? For example, if an elderly person is already very attached to her Paro robot and regards it as a pet or baby, then what needs to be discussed is that relation, rather than the “moral standing” of the robot.
The personal experience with the Other, that is, the robot, is the key component of this relational and phenomenological approach. The relational concept of personhood can be fleshed out in the following way:
A social model of autonomy, under which autonomy is not defined individually but stands in the context of social relations;
Personhood is absolute and inherent in every entity as a social being; it does not come in degrees;
An interactionist model of personhood, according to which personhood is relational by nature (but not necessarily reciprocal) and defined in non-cognitivist terms.
The above claims are not intended as steps in a conclusive argument; rather, they portray the general line of reasoning regarding the moral importance of social relations. The relational approach does not require the robot to be rational, intelligent or autonomous as an individual entity; instead, the social encounter with the robot is morally decisive. The moral standing of the robot is based on exactly this social encounter.
The problem with the relational approach is that the moral status of robots is based entirely on human beings’ willingness to enter into social relations with them. If human beings (for whatever reasons) do not want to enter into such relations, they could deny robots a moral status to which the robots might be entitled on more objective criteria such as rationality and sentience. Thus, the relational approach does not actually provide a strong foundation for robot rights; rather, it supports a pragmatic perspective that would make it easier to welcome robots (who already have moral status) into the moral community (Gordon 2020c).
iv. The Upshot
The three approaches discussed in sections 2.f.i–iii all attempt to show how one can make sense of the idea of ascribing moral status and rights to robots. The most important observation, however, is that robots are entitled to moral status and rights independently of our opinion once they have fulfilled the relevant criteria. Whether human beings will actually recognise that status and those rights is a different matter.
g. Singularity and Value Alignment
Some of the theories of the potential moral status of artificial intelligent agents discussed in section 2.f. have struck some authors as belonging to science fiction. The same can be said about the next topic to be considered: singularity. The underlying argument regarding technological singularity was introduced by statistician I. J. Good in ‘Speculations Concerning the First Ultraintelligent Machine’ (1965):
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion”, and the intelligence of man would be left far behind. Thus, the first ultraintelligent machine is the last invention that man need ever make.
The idea of an intelligence explosion involving self-replicating, super-intelligent AI machines seems inconceivable to many; some commentators dismiss such claims as a myth about the future development of AI (for example, Floridi 2016). However, prominent voices both inside and outside academia take this idea very seriously—so seriously, in fact, that they fear its possible consequences, the so-called ‘existential risks’ such as the risk of human extinction. Among those voicing such fears are philosophers like Nick Bostrom and Toby Ord, but also prominent figures like Elon Musk and the late Stephen Hawking.
Authors discussing the idea of technological singularity differ in their views about what might lead to it. The famous futurist Ray Kurzweil is well-known for advocating the idea of singularity on the basis of exponentially increasing computing power, associated with ‘Moore’s law’, the observation that the number of transistors that can be fitted onto a chip (and, with it, available computing power) had been doubling roughly every two years since the 1970s and could reasonably be expected to continue to do so (Kurzweil 2005). This approach sees the path to superintelligence as likely to proceed through continuing improvements in hardware. Another take on what might lead to superintelligence, favoured by the well-known AI researcher Stuart Russell, focuses instead on algorithms. From Russell’s (2019) point of view, what is needed for singularity to occur are conceptual breakthroughs in such areas as the study of language and common-sense processing, as well as in learning processes.
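Put as a rough formula (an illustrative idealisation, not a calculation from Kurzweil’s text), a fixed two-year doubling time amounts to exponential growth: computing power after t years is P(t) = P0 · 2^(t/2), where P0 is the starting power. Thirty years of such doubling would multiply P0 by 2^15, roughly a factor of 33,000, which conveys why extrapolating the trend leads some to expect machine intelligence eventually to outstrip human intelligence.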
Researchers concerned with singularity approach the issue of how to guard humanity against such existential risks in several different ways, depending in part on what they take the source of those risks to be. Bostrom, for example, understands superintelligence as a maximally powerful capacity to achieve whatever aims might be built into an artificial intelligent system. In his much-discussed example (Bostrom 2014), a super-intelligent machine threatens the future of human life by becoming optimally efficient at maximising the number of paper clips in the world, a goal whose achievement might be facilitated by removing human beings so as to make more space for paper clips. From this point of view, it is crucial to equip super-intelligent AI machines with the right goals, so that when they pursue these goals in maximally efficient ways, there is no risk that they will extinguish the human race along the way. This is one way to think about how to create a beneficial super-intelligence.
Russell (2019) presents an alternative picture, formulating three rules for AI design, which might perhaps be viewed as an updated version of or suggested replacement for Asimov’s fictional laws of robotics (see section 2.a.):
The machine’s only objective is to maximise the realisation of human preferences.
The machine is initially uncertain about what those preferences are.
The ultimate source of information about human preferences is human behaviour.
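The second and third rules can be given a minimal computational gloss. The sketch below is a deliberately simple illustration rather than Russell’s actual proposal; the two candidate preferences, the prior and the observation model are all invented. It shows a machine that starts out uncertain between two hypotheses about what a human wants and updates its belief from observed human behaviour.

```python
# Hypotheses about the human's preference, with the machine's prior belief
# (rule 2: the machine is initially uncertain about the preferences).
belief = {"prefers_coffee": 0.5, "prefers_tea": 0.5}

# Assumed observation model: the human mostly chooses the favoured drink.
def likelihood(choice, hypothesis):
    favoured = "coffee" if hypothesis == "prefers_coffee" else "tea"
    return 0.9 if choice == favoured else 0.1

# Rule 3: human behaviour is the source of information about preferences.
observations = ["coffee", "coffee", "tea", "coffee"]

for choice in observations:
    # Bayesian update of the machine's belief after each observed choice.
    unnormalised = {h: p * likelihood(choice, h) for h, p in belief.items()}
    total = sum(unnormalised.values())
    belief = {h: v / total for h, v in unnormalised.items()}

print(belief)  # belief has shifted strongly towards "prefers_coffee"
```

Rule 1 would then direct the machine to choose actions that maximise expected preference satisfaction under this updated belief, rather than under any fixed objective of its own.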
The theories discussed in this section represent different ideas about what is sometimes called ‘value alignment’—that is, the idea that the goals and functioning of AI systems, especially super-intelligent future AI systems, should be properly aligned with human values. According to this ideal, AI should track human interests and values, and its functioning should benefit us rather than expose us to existential risks. As noted at the beginning of this section, to some commentators the idea that AI could become super-intelligent and pose existential threats is simply a myth that needs to be busted. But according to others, such as Toby Ord, AI is among the main reasons why humanity is in a critical period in which its very future is at stake. On such assessments, AI should be treated on a par with nuclear weapons and other potentially highly destructive technologies that put us all at great risk unless proper value alignment happens (Ord 2020).
A key problem concerning value alignment—especially if understood along the lines of Russell’s three principles—is whose values or preferences AI should be aligned with. As Iason Gabriel (2020) notes, reasonable people may disagree on what values and interests are the right ones with which to align the functioning of AI (whether super-intelligent or not). Gabriel’s suggestion for solving this problem is inspired by John Rawls’ (1999, 2001) work on ‘reasonable pluralism’. Rawls proposes that society should seek to identify ‘fair principles’ that could generate an overlapping consensus or widespread agreement despite the existence of more specific, reasonable disagreements about values among members of society. But how likely is it that this kind of convergence in general principles would find widespread support? (See section 3.)
h. Other Debates
In addition to the topics highlighted above, other issues that have not received as much attention are beginning to be discussed within AI ethics. Five such issues are discussed briefly below.
i. AI as a Form of Moral Enhancement or a Moral Advisor
AI systems are widely used as ‘recommender systems’ in online shopping, online entertainment (for example, music and movie streaming), and other realms. Some ethicists have discussed the advantages and disadvantages of AI systems whose recommendations could help us to make better choices that are more consistent with our basic values. Perhaps AI systems could even, at some point, help us improve our values. Works on these and related questions include Borenstein and Arkin (2016), Savulescu and Maslen (2015), Giubilini and Savulescu (2018), Klincewicz (2016), and O’Neill et al. (2021).
ii. AI and the Future of Work
Much discussion about AI and the future of work concerns the vital issue of whether AI and other forms of automation will cause widespread ‘technological unemployment’ by eliminating large numbers of human jobs that would be taken over by automated machines (Danaher 2019a). This is often presented as a negative prospect, the question being whether a world without work would offer people any prospects for fulfilling and meaningful activities, since certain goods achieved through work (other than income) are hard to achieve in other contexts (Gheaus and Herzog 2016). However, some authors have argued that work in the modern world exposes many people to various kinds of harm (Anderson 2017), and Danaher (2019a) examines the important question of whether a world with less work might actually be preferable. Some argue that existential boredom would proliferate if human beings could no longer find a meaningful purpose in their work (or even their lives) because machines had replaced them (Bloch 1954). In contrast, Jonas (1984) criticises Bloch, arguing that boredom would not be a substantial issue at all. Another related issue—perhaps more relevant in the short and medium term—is how increasingly technologised work can remain meaningful (Smids et al. 2020).
iii. AI and the Future of Personal Relationships
Various AI-driven technologies affect the nature of friendships, romances and other interpersonal relationships and could impact them even more in future. Online ‘friendships’ arranged through social media have been investigated by philosophers, who disagree as to whether relationships that are partly curated by AI algorithms could be true friendships (Cocking et al. 2012; McFall 2012; Kaliarnta 2016; Elder 2017). Some philosophers have sharply criticised AI-driven dating apps, which they think might reinforce negative stereotypes and negative gender expectations (Frank and Klincewicz 2018). In more science-fiction-like philosophising, which might nevertheless become increasingly relevant to real life, there has also been discussion about whether human beings could have true friendships or romantic relationships with robots and other artificial agents equipped with advanced AI (Levy 2008; Sullins 2012; Elder 2017; Hauskeller 2017; Nyholm and Frank 2017; Danaher 2019c; Nyholm 2020).
iv. AI and the Concern About Human ‘Enfeeblement’
If more and more aspects of our lives are guided by the recommendations of AI systems whose functioning we do not understand and whose propriety we are in no position to question, the results could include ‘a crisis in moral agency’ (Danaher 2019d), human ‘enfeeblement’ (Russell 2019), or ‘de-skilling’ in different areas of human life (Vallor 2015, 2016). This scenario becomes even more likely should technological singularity be attained, because at that point all work, including all research and engineering, could be done by intelligent machines. After some generations, human beings might indeed be completely dependent on machines in all areas of life and unable to turn the clock back. This situation is very dangerous; hence it is of utmost importance that human beings remain skilful and knowledgeable while developing AI capacities.
v. Anthropomorphism
The very idea of artificial intelligent machines that imitate human thinking and behaviour might itself incorporate, according to some, a form of anthropomorphising that ought to be avoided. In other words, attributing humanlike qualities to machines that are not human might pose a problem. A common worry about many forms of AI technologies (or about how they are presented to the general public) is that they are deceptive (for example, Boden et al. 2017). Many have objected that companies tend to exaggerate the extent to which their products are based on AI technology. For example, several prominent AI researchers and ethicists have criticised the makers of Sophia the robot for falsely presenting her as much more humanlike than she really is, and for designing her to prompt anthropomorphising responses in human beings that are somehow problematic or unfitting (for example, Sharkey 2018; Bryson 2010, 2019). The related question of whether anthropomorphising responses to AI technologies are always problematic requires further consideration, which it is increasingly receiving (for example, Coeckelbergh 2010; Darling 2016, 2017; Gunkel 2018; Danaher 2020; Nyholm 2020; Smids 2020).
This list of emerging topics within AI ethics is not exhaustive, as the field is very fertile, with new issues arising constantly. This is perhaps the fastest-growing field within the study of ethics and moral philosophy.
3. Ethical Guidelines for AI
As a result of widespread awareness of and interest in the ethical issues related to AI, several influential institutions (including governments, the European Union, large companies and other associations) have already tasked expert panels with drafting policy documents and ethical guidelines for AI. Such documents have proliferated to the point at which it is very difficult to keep track of all the latest AI ethical guidelines being released. Additionally, AI ethics is receiving substantial funding from various public and private sources, and multiple research centres for AI ethics have been established. These developments have mostly received positive responses, but there have also been worries about so-called ‘ethics washing’—that is, giving an ethical stamp of approval to something that might be, from a more critical point of view, ethically problematic (compare Tigard 2020b)—along with concerns that some efforts may be relatively toothless or too centred on the West, ignoring non-Western perspectives on AI ethics. This section, before discussing such criticisms, reviews examples of already published ethical guidelines and considers whether any consensus may emerge among these differing guidelines.
An excellent resource in this context is the overview by Jobin et al. (2019), who conducted a substantial comparative review of 84 sets of ethical guidelines issued by national or international organisations from various countries. Among the many principles featured in these documents, Jobin et al. found strong convergence around five: transparency, justice and fairness, non-maleficence, responsibility, and privacy; the other principles they identified appeared far less consistently.
The review conducted by Jobin et al. (2019) thus reveals, at least with respect to these five principles, a significant degree of overlap among attempts to create ethical guidelines for AI (see Gabriel 2020). On the other hand, six further principles they identified (beginning with beneficence) appeared as key principles in fewer than half of the documents studied. Relatedly, researchers working on the ‘moral machine’ research project, which examined people’s attitudes as to what self-driving cars should be programmed to do in various crash dilemma scenarios, also found great variation, including cross-cultural variation (Awad et al. 2018).
These ethical guidelines have received a fair amount of criticism, both in terms of their content and with respect to how they were created (for example, Metzinger 2019). For Metzinger, the very idea of ‘trustworthy AI’ is ‘nonsense’, since only human beings, and not machines, can be, or fail to be, trustworthy. Furthermore, the EU high-level expert group on AI had very few experts from the field of ethics but numerous industry representatives, who had an interest in toning down any ethical worries about the AI industry. In addition, the EU document ‘Ethical Guidelines for Trustworthy AI’ uses vague and non-confrontational language. It is, to use the term favoured by Resseguier and Rodrigues (2020), a mostly ‘toothless’ document. EU guidelines supposedly defanged in this way by industry representatives illustrate the concern about ‘ethics washing’ raised above.
Another point of criticism regarding these kinds of ethical guidelines is that many of the expert panels drafting them are non-inclusive and fail to take non-Western (for example, African and Asian) perspectives on AI and ethics into account. Therefore, it would be important for future versions of such guidelines—or new ethical guidelines—to include non-Western contributions. Notably, in academic journals that focus on the ethics of technology, there has been modest progress towards publishing more non-Western perspectives on AI ethics—for example, applying Dao (Wong 2012), Confucian virtue-ethics perspectives (Jing and Doorn 2020), and southern African relational and communitarian ethics perspectives including the ‘ubuntu’ philosophy of personhood and interpersonal relationships (see Wareham 2020).
4. Conclusion
The ethics of AI has become one of the liveliest topics in philosophy of technology. AI has the potential to redefine our traditional moral concepts, ethical approaches and moral theories. The advent of artificial intelligent machines that may either match or supersede human capabilities poses a big challenge to humanity’s traditional self-understanding as the only beings with the highest moral status in the world. Accordingly, the future of AI ethics is unpredictable but likely to offer considerable excitement and surprise.
5. References and Further Reading
Agar, N. (2020). How to Treat Machines That Might Have Minds. Philosophy & Technology, 33(2): 269–82.
Altman, M. C. (2011). Kant and Applied Ethics: The Uses and Limits of Kant’s Practical Philosophy. Malden, NJ: Wiley-Blackwell.
Anderson, E. (2017). Private Government: How Employers Rule Our Lives (and Why We Don’t Talk about It). Princeton, NJ: Princeton University Press.
Anderson, M., and Anderson, S. (2011). Machine Ethics. Cambridge: Cambridge University Press.
Anderson, S. L. (2011). Machine Metaethics. In M. Anderson and S. L. Anderson (Eds.), Machine Ethics, 21–27. Cambridge: Cambridge University Press.
Angwin, J., Larson, J., Mattu, S., and Kirchner, L. (2016). Machine Bias. ProPublica, May 23. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
Arkin, R. (2009). Governing Lethal Behavior in Autonomous Robots. Boca Raton, FL: CRC Press.
Arkin, R. (2010). The Case for Ethical Autonomy in Unmanned Systems. Journal of Military Ethics, 9(4), 332–41.
Asimov, I. (1942). Runaround: A Short Story. New York: Street and Smith.
Asimov, I. (1986). Robots and Empire: The Classic Robot Novel. New York: HarperCollins.
Atapattu, S. (2015). Human Rights Approaches to Climate Change: Challenges and Opportunities. New York: Routledge.
Awad, E., Dsouza, S., Kim, R., Schulz, J., Henrich, J., Shariff, A., Bonnefon, J. F., and Rahwan, I. (2018). The Moral Machine Experiment. Nature, 563, 59–64.
Bloch, E. (1985/1954). Das Prinzip Hoffnung, 3 vols. Frankfurt am Main: Suhrkamp.
Boden, M., Bryson, J., Caldwell, D., Dautenhahn, K., Edwards, L., Kember, S., Newman, P., Parry, V., Pegman, G., Rodden, T., Sorell, T., Wallis, M., Whitby, B., and Winfield, A. (2017). Principles of Robotics: Regulating Robots in the Real World. Connection Science, 29(2), 124–29.
Borenstein, J. and Arkin, R. (2016). Robotic Nudges: The Ethics of Engineering a More Socially Just Human Being. Science and Engineering Ethics, 22, 31–46.
Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford: Oxford University Press.
Bryson, J. (2010). Robots Should Be Slaves. In Y. Wilks (Ed.), Close Engagements with Artificial Companions, 63–74. Amsterdam: John Benjamins.
Bryson, J. (2012). A Role for Consciousness in Action Selection. International Journal of Machine Consciousness, 4(2), 471–82.
Bryson, J. (2019). Patiency Is Not a Virtue: The Design of Intelligent Systems and Systems of Ethics. Ethics and Information Technology, 20(1), 15–26.
Buolamwini, J., and Gebru, T. (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability, and Transparency. PMLR, 81, 77–91.
Čapek, K. (1920). Rossum’s Universal Robots. Adelaide: The University of Adelaide.
Cavalieri, P. (2001). The Animal Question: Why Non-Human Animals Deserve Human Rights. Oxford: Oxford University Press.
Chalmers, D. (1996). The Conscious Mind: In Search of a Fundamental Theory. New York/Oxford: Oxford University Press.
Chalmers, D. (2010). The Singularity: A Philosophical Analysis. Journal of Consciousness Studies, 17, 7–65.
Cocking, D., Van Den Hoven, J., and Timmermans, J. (2012). Introduction: One Thousand Friends. Ethics and Information Technology, 14, 179–84.
Coeckelbergh, M. (2010). Robot Rights? Towards a Social-Relational Justification of Moral Consideration. Ethics and Information Technology, 12(3), 209–21.
Coeckelbergh, M. (2014). The Moral Standing of Machines: Towards a Relational and Non-Cartesian Moral Hermeneutics. Philosophy & Technology, 27(1), 61–77.
Coeckelbergh, M. (2020). AI Ethics. Cambridge, MA and London: MIT Press.
Danaher, J. (2016a). Robots, Law, and the Retribution Gap. Ethics and Information Technology, 18(4), 299–309.
Danaher, J. (2016b). The Threat of Algocracy: Reality, Resistance and Accommodation. Philosophy & Technology, 29(3), 245–68.
Danaher, J. (2019a). Automation and Utopia. Cambridge, MA: Harvard University Press.
Danaher, J. (2019b). Escaping Skinner’s Box: AI and the New Era of Techno-Superstition. Philosophical Disquisitions blog: https://philosophicaldisquisitions.blogspot.com/2019/10/escaping-skinners-box-ai-and-new-era-of.html.
Danaher, J. (2019c). The Philosophical Case for Robot Friendship. Journal of Posthuman Studies, 3(1), 5–24.
Danaher, J. (2019d). The Rise of the Robots and the Crises of Moral Patiency. AI & Society, 34(1), 129–36.
Danaher, J. (2020). Welcoming Robots into the Moral Circle? A Defence of Ethical Behaviourism. Science and Engineering Ethics, 26(4), 2023–49.
Danaher, J., and Robbins, S. (2020). Should AI Be Explainable? Episode 77 of the Philosophical Disquisitions Podcast: https://philosophicaldisquisitions.blogspot.com/2020/07/77-should-ai-be-explainable.html.
Daniels, J., Nkonde, M. and Mir, D. (2019). Advancing Racial Literacy in Tech. https://datasociety.net/output/advancing-racial-literacy-in-tech/.
Darling, K. (2016). Extending Legal Protection to Social Robots: The Effects of Anthropomorphism, Empathy, and Violent Behavior towards Robotic Objects. In R. Calo, A. M. Froomkin and I. Kerr (Eds.), Robot Law, 213–34. Cheltenham: Edward Elgar.
Darling, K. (2017). “Who’s Johnny?” Anthropological Framing in Human-Robot Interaction, Integration, and Policy. In P. Lin, K. Abney and R. Jenkins (Eds.), Robot Ethics 2.0: From Autonomous Cars to Artificial Intelligence, 173–92. Oxford: Oxford University Press.
Dastin, J. (2018). Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women. Reuters, October 10. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G.
Dehghani, M., Forbus, K., Tomai, E., and Klenk, M. (2011). An Integrated Reasoning Approach to Moral Decision Making. In M. Anderson and S. L. Anderson (Eds.), Machine Ethics, 422–41. Cambridge: Cambridge University Press.
Dignum, V. (2019). Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way. Berlin: Springer.
Dobbe, R., Dean, S., Gilbert, T., and Kohli, N. (2018). A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics. In 2018 Workshop on Fairness, Accountability and Transparency in Machine Learning during ICMI, Stockholm, Sweden (July 18 version). https://arxiv.org/abs/1807.00553.
Elamrani, A., and Yampolskiy, R. (2018). Reviewing Tests for Machine Consciousness. Journal of Consciousness Studies, 26(5–6), 35–64.
Elder, A. (2017). Friendship, Robots, and Social Media: False Friends and Second Selves. London: Routledge.
Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Vayena, E. (2018). AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds and Machines, 28(4), 689–707. https://doi.org/10.1007/s11023-018-9482-5
Francione, G. L. (2009). Animals as Persons. Essay on the Abolition of Animal Exploitation. New York: Columbia University Press.
Frank, L., and Klincewicz, M. (2018). Swiping Left on the Quantified Relationship: Exploring the Potential Soft Impacts. American Journal of Bioethics, 18(2), 27–28.
Gabriel, I. (2020). Artificial Intelligence, Values, and Alignment. Minds and Machines, available online at https://link.springer.com/article/10.1007/s11023-020-09539-2.
Gheaus, A., and Herzog, L. (2016). Goods of Work (Other than Money!). Journal of Social Philosophy, 47(1), 70–89.
Giubilini, A., and Savulescu, J. (2018). The Artificial Moral Advisor: The “Ideal Observer” Meets Artificial Intelligence. Philosophy & Technology, 1–20. https://doi.org/10.1007/s13347-017-0285-z.
Gogoll, J., and Müller, J. F. (2017). Autonomous Cars: In Favor of a Mandatory Ethics Setting. Science and Engineering Ethics, 23(3), 681–700.
Good, I. J. (1965). Speculations Concerning the First Ultraintelligent Machine. In F. Alt and M. Rubinoff (Eds.), Advances in Computers, vol. 6, 31–88. Cambridge, MA: Academic Press.
Goodall, N. J. (2014). Ethical Decision Making during Automated Vehicle Crashes. Transportation Research Record: Journal of the Transportation Research Board, 2424, 58–65.
Goodall, N. J. (2016). Away from Trolley Problems and Toward Risk Management. Applied Artificial Intelligence, 30(8), 810–21.
Gordon, J.-S. (2020a). Building Moral Machines: Ethical Pitfalls and Challenges. Science and Engineering Ethics, 26, 141–57.
Gordon, J.-S. (2020b). What Do We Owe to Intelligent Robots? AI & Society, 35, 209–23.
Gordon, J.-S. (2020c). Artificial Moral and Legal Personhood. AI & Society, online first at https://link.springer.com/article/10.1007%2Fs00146-020-01063-2.
Guarini, M. (2006). Particularism and the Classification and Reclassification of Moral Cases. IEEE Intelligent Systems, 21(4), 22–28.
Gunkel, D. J., and Bryson, J. (2014). The Machine as Moral Agent and Patient. Philosophy & Technology, 27(1), 5–142.
Gunkel, D. (2012). The Machine Question. Critical Perspectives on AI, Robots, and Ethics. Cambridge, MA: MIT Press.
Gunkel, D. (2018). Robot Rights. Cambridge, MA: MIT Press.
Gurney, J. K. (2016). Crashing into the Unknown: An Examination of Crash-Optimization Algorithms through the Two Lanes of Ethics and Law. Alabama Law Review, 79(1), 183–267.
Himmelreich, J. (2018). Never Mind the Trolley: The Ethics of Autonomous Vehicles in Mundane Situations. Ethical Theory and Moral Practice, 21(3), 669–84.
Himma, K., and Tavani, H. (2008). The Handbook of Information and Computer Ethics. Hoboken, NJ: Wiley.
Jobin, A., Ienca, M., and Vayena, E. (2019). The Global Landscape of AI Ethics Guidelines. Nature Machine Intelligence, 1(9), 389–399.
Johnson, D. (1985/2009). Computer Ethics, 4th ed. New York: Pearson.
Johnson, D., and Nissenbaum, H. (1995). Computing, Ethics, and Social Values. Englewood Cliffs, NJ: Prentice Hall.
Jonas, H. (2003/1984). Das Prinzip Verantwortung. Frankfurt am Main: Suhrkamp.
Kaliarnta, S. (2016). Using Aristotle’s Theory of Friendship to Classify Online Friendships: A Critical Counterpoint. Ethics and Information Technology, 18(2), 65–79.
Kamm, F. (2007). Intricate Ethics: Rights, Responsibilities, and Permissible Harm. Oxford: Oxford University Press.
Kamm, F. (2020). The Use and Abuse of the Trolley Problem: Self-Driving Cars, Medical Treatments, and the Distribution of Harm. In S. M. Liao (Ed.) The Ethics of Artificial Intelligence, 79–108. New York: Oxford University Press.
Kant, I. (1980). Lectures on Ethics, trans. Louis Infield, Indianapolis, IN: Hackett Publishing Company.
Kant, I. (2009). Groundwork of the Metaphysic of Morals. New York: Harper Perennial Modern Classics.
Keeling, G. (2020). The Ethics of Automated Vehicles. PhD Dissertation, University of Bristol. https://research-information.bris.ac.uk/files/243368588/Pure_Thesis.pdf.
Kissinger, H. A. (2018). How the Enlightenment Ends: Philosophically, Intellectually—in Every Way—Human Society Is Unprepared for the Rise of Artificial Intelligence. The Atlantic, June. https://www.theatlantic.com/magazine/archive/2018/06/henry-kissinger-ai-could-mean-the-end-of-human-history/559124/.
Klincewicz, M. (2016). Artificial Intelligence as a Means to Moral Enhancement. Studies in Logic, Grammar and Rhetoric. https://doi.org/10.1515/slgr-2016-0061.
Klincewicz, M. (2015). Autonomous Weapons Systems, the Frame Problem and Computer Security. Journal of Military Ethics, 14(2), 162–76.
Kraemer, F., Van Overveld, K., and Peterson, M. (2011). Is There an Ethics of Algorithms? Ethics and Information Technology, 13, 251–60.
Kurzweil, R. (2005). The Singularity Is Near. London: Penguin Books.
Levy, D. (2008). Love and Sex with Robots. London: Harper Perennial.
Lin, P. (2015). Why Ethics Matters for Autonomous Cars. In M. Maurer, J. C. Gerdes, B. Lenz and H. Winner (Eds.), Autonomes Fahren: Technische, rechtliche und gesellschaftliche Aspekte, 69–85. Berlin: Springer.
Lin, P., Abney, K. and Bekey, G. A. (Eds). (2014). Robot Ethics: The Ethical and Social Implications of Robotics. Intelligent Robotics and Autonomous Agents. Cambridge, MA and London: MIT Press.
Lin, P., Abney, K. and Jenkins, R. (Eds.) (2017). Robot Ethics 2.0: From Autonomous Cars to Artificial Intelligence. New York: Oxford University Press.
Loh, J. (2019). Roboterethik. Eine Einführung. Frankfurt am Main: Suhrkamp.
Ludwig, S. (2015). Credit Scores in America Perpetuate Racial Injustice: Here’s How. The Guardian, October 13. https://www.theguardian.com/commentisfree/2015/oct/13/your-credit-score-is-racist-heres-why.
Marchese, K. (2020). Japanese Scientists Develop “Blade Runner” Robot That Can Feel Pain. Design Boom, February 24. https://www.designboom.com/technology/japanese-scientists-develop-hyper-realistic-robot-that-can-feel-pain-02-24-2020/.
Matthias, A. (2004). The Responsibility Gap: Ascribing Responsibility for the Actions of Learning Automata. Ethics and Information Technology, 6(3), 175–83.
McCarthy, J., Minsky, M. L., Rochester, N. and Shannon, C. E. (1955). A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence. http://raysolomonoff.com/dartmouth/boxa/dart564props.pdf.
McFall, M. T. (2012). Real Character-Friends: Aristotelian Friendship, Living Together, And Technology. Ethics and Information Technology, 14, 221–30.
Metzinger, T. (2013). Two Principles for Robot Ethics. In E. Hilgendorf and J.-P. Günther (Eds.), Robotik und Gesetzgebung, 263–302. Baden-Baden: Nomos.
Metzinger, T. (2019). Ethics Washing Made in Europe. Der Tagesspiegel. https://www.tagesspiegel.de/politik/eu-guidelines-ethics-washing-made-in-europe/24195496.html.
Misselhorn, C. (2018). Grundfragen der Maschinenethik. Stuttgart: Reclam.
Mittelstadt, B., Allo, P., Taddeo, M., Wachter, S. and Floridi, L. (2016). The Ethics of Algorithms: Mapping the Debate. Big Data & Society, 3(2). https://journals.sagepub.com/doi/full/10.1177/2053951716679679.
Mosakas, K. (2020). On the Moral Status of Social Robots: Considering the Consciousness Criterion. AI & Society, online first at https://link.springer.com/article/10.1007/s00146-020-01002-1.
Müller, V. C., and Simpson, T. W. (2014). Autonomous Killer Robots Are Probably Good News. Frontiers in Artificial Intelligence and Applications, 273, 297–305.
Müller, V. C. (2020). Ethics of Artificial Intelligence and Robotics. Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/ethics-ai/.
Nyholm, S. (2018a). Attributing Agency to Automated Systems: Reflections on Human-Robot Collaborations and Responsibility-Loci. Science and Engineering Ethics, 24(4), 1201–19.
Nyholm, S. (2018b). The Ethics of Crashes with Self-Driving Cars: A Roadmap, I. Philosophy Compass, 13(7), e12507.
Nyholm, S. (2018c). The Ethics of Crashes with Self-Driving Cars, A Roadmap, II. Philosophy Compass, 13(7), e12506.
Nyholm, S. (2020). Humans and Robots: Ethics, Agency, and Anthropomorphism. London: Rowman and Littlefield.
Nyholm, S., and Frank, L. (2017). From Sex Robots to Love Robots: Is Mutual Love with a Robot Possible? In J. Danaher and N. McArthur (Eds.), Robot Sex: Social and Ethical Implications. Cambridge, MA: MIT Press.
Nyholm, S., and Frank, L. (2019). It Loves Me, It Loves Me Not: Is It Morally Problematic to Design Sex Robots That Appear to Love Their Owners? Techne: Research in Philosophy and Technology, 23(3), 402–24.
Nyholm, S., and Smids, J. (2016). The Ethics of Accident-Algorithms for Self-Driving Cars: An Applied Trolley Problem? Ethical Theory and Moral Practice, 19(5), 1275–89.
Okyere-Manu, B. (Ed.) (2021). African Values, Ethics, and Technology: Questions, Issues, and Approaches. London: Palgrave MacMillan.
O’Neil, C. (2016). Weapons of Math Destruction. London: Allen Lane.
O’Neill, E., Klincewicz, M. and Kemmer, M. (2021). Ethical Issues with Artificial Ethics Assistants. In C. Veliz (Ed.), Oxford Handbook of Digital Ethics. Oxford: Oxford University Press.
Ord, T. (2020). The Precipice: Existential Risk and the Future of Humanity. London: Hachette Books.
Picard, R. (1997). Affective Computing. Cambridge, MA and London: MIT Press.
Purves, D., Jenkins, R. and Strawser, B. J. (2015). Autonomous Machines, Moral Judgment, and Acting for the Right Reasons. Ethical Theory and Moral Practice, 18(4), 851–72.
Rawls, J. (1999). The Law of Peoples, with The Idea of Public Reason Revisited. Cambridge, MA: Harvard University Press.
Rawls, J. (2001). Justice as Fairness: A Restatement. Cambridge, MA: Harvard University Press.
Resseguier, A., and Rodrigues, R. (2020). AI Ethics Should Not Remain Toothless! A Call to Bring Back the Teeth of Ethics. Big Data & Society, online first at https://journals.sagepub.com/doi/full/10.1177/2053951720942541.
Richardson, K. (2019). Special Issue: Ethics of AI and Robotics. AI & Society, 34(1).
Robbins, S. (2019). A Misdirected Principle with a Catch: Explicability for AI. Minds and Machines, 29(4), 495–514.
Royakkers, L., and van Est, R. (2015). Just Ordinary Robots: Automation from Love to War. Boca Raton, FL: CRC Press.
Russell, S. (2019). Human Compatible. New York: Viking Press.
Ryan, M., and Stahl, B. (2020). Artificial Intelligence Guidelines for Developers and Users: Clarifying Their Content and Normative Implications. Journal of Information, Communication and Ethics in Society, online first at https://www.emerald.com/insight/content/doi/10.1108/JICES-12-2019-0138/full/html
Santoni de Sio, F., and Van den Hoven, J. (2018). Meaningful Human Control over Autonomous Systems: A Philosophical Account. Frontiers in Robotics and AI. https://www.frontiersin.org/articles/10.3389/frobt.2018.00015/full.
Savulescu, J., and Maslen, H. (2015). Moral Enhancement and Artificial Intelligence: Moral AI? In Beyond Artificial Intelligence, 79–95. Springer.
Schwitzgebel, E., and Garza, M. (2015). A Defense of the Rights of Artificial Intelligences. Midwest Studies in Philosophy, 39(1), 98–119.
Searle, J. R. (1980). Minds, Brains, and Programs. Behavioral and Brain Sciences, 3(3), 417–57.
Sharkey, N. (2018). Mama Mia, It’s Sophia: A Show Robot or Dangerous Platform to Mislead? Forbes, November 17. https://www.forbes.com/sites/noelsharkey/2018/11/17/mama-mia-its-sophia-a-show-robot-or-dangerous-platform-to-mislead/#407e37877ac9.
Singer, P. (1975). Animal Liberation. London: Avon Books.
Singer, P. (2009). Speciesism and Moral Status. Metaphilosophy, 40(3–4), 567–81.
Smids, J. (2020). Danaher’s Ethical Behaviourism: An Adequate Guide to Assessing the Moral Status of a Robot? Science and Engineering Ethics, 26(5), 2849–66.
Smids, J., Nyholm, S. and Berkers, H. (2020). Robots in the Workplace: A Threat to—or Opportunity for—Meaningful Work? Philosophy & Technology, 33(3), 503–22.
Sparrow, R. (2007). Killer Robots. Journal of Applied Philosophy, 24(1), 62–77.
Springer, A., Garcia-Gathright, J. and Cramer, H. (2018). Assessing and Addressing Algorithmic Bias – But Before We Get There. In 2018 AAAI Spring Symposium Series, 450–54. https://www.aaai.org/ocs/index.php/SSS/SSS18/paper/viewPaper/17542.
Stone, C. D. (1972). Should Trees Have Standing? Toward Legal Rights for Natural Objects. Southern California Law Review, 45, 450–501.
Stone, C. D. (2010). Should Trees Have Standing? Law, Morality and the Environment. Oxford: Oxford University Press.
Strawser, B. J. (2010). Moral Predators: The Duty to Employ Uninhabited Aerial Vehicles. Journal of Military Ethics, 9(4), 342–68.
Sullins, J. (2012). Robots, Love, and Sex: The Ethics of Building a Love Machine. IEEE Transactions on Affective Computing, 3(4), 398–409.
Sweeney, L. (2013). Discrimination in Online Ad Delivery. Acmqueue, 11(3), 1–19.
Tigard, D. (2020a). There is No Techno-Responsibility Gap. Philosophy & Technology, online first at https://link.springer.com/article/10.1007/s13347-020-00414-7.
Tigard, D. (2020b). Responsible AI and Moral Responsibility: A Common Appreciation. AI and Ethics, online first at https://link.springer.com/article/10.1007/s43681-020-00009-0.
Timmermann, J. (2020). Kant’s “Groundwork of the Metaphysics of Morals”: A Commentary. Cambridge: Cambridge University Press.
Turing, A. (1950). Computing Machinery and Intelligence. Mind, 59(236), 433–60.
Vallor, S. (2015). Moral Deskilling and Upskilling in a New Machine Age: Reflections on the Ambiguous Future of Character. Philosophy & Technology, 28(1), 107–24.
Vallor, S. (2016). Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. New York: Oxford University Press.
Veale, M., and Binns, R. (2017). Fairer Machine Learning in the Real World: Mitigating Discrimination without Collecting Sensitive Data. Big Data & Society, 4(2).
Vinge, V. (1983). First Word. Omni, January, 10.
Vinge, V. (1993). The Coming Technological Singularity. How to Survive in the Post-Human Era. Whole Earth Review, Winter.
Wachter, S., Mittelstadt, B. and Russell, C. (2018). Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology, 31(2), 841–87.
Wallach, W., and Allen, C. (2010). Moral Machines. Teaching Robots Right from Wrong. Oxford: Oxford University Press.
Wallach, W., Franklin, S. and Allen, C. (2010). A Conceptual and Computational Model of Moral Decision Making in Human and Artificial Agents. Topics in Cognitive Science, 2(3), 454–85.
Wang, Y., and Kosinski, M. (2018). Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation from Facial Images. Journal of Personality and Social Psychology, 114(2), 246–57.
Wareham, C. S. (2020). Artificial Intelligence and African Conceptions of Personhood. Ethics and Information Technology, online first at https://link.springer.com/article/10.1007/s10676-020-09541-3.
Wong, P. H. (2012). Dao, Harmony, and Personhood: Towards a Confucian Ethics of Technology. Philosophy & Technology, 25(1), 67–86.
World Economic Forum and Global Future Council on Human Rights (2018). How to Prevent Discriminatory Outcomes in Machine Learning (white paper). http://www3.weforum.org/docs/WEF_40065_White_Paper_How_to_Prevent_Discriminatory_Outcomes_in_Machine_Learning.pdf.
Author Information
John-Stewart Gordon
Email: johnstgordon@pm.me
Vytautas Magnus University
Lithuania
and
Sven Nyholm
Email: s.r.nyholm@uu.nl
The University of Utrecht
The Netherlands
Abstractionism in Mathematics
Abstractionism is a philosophical account of the ontology of mathematics according to which abstract objects are grounded in a process of abstraction (although not every view that places abstraction front and center is a version of abstractionism, as we shall see). Abstraction involves arranging a domain of underlying objects into classes and then identifying an object corresponding to each class—the abstract of that class. While the idea that the ontology of mathematics is obtained, in some sense, via abstraction has its origin in ancient Greek thought, the idea found new life, and a new technical foundation, in the late 19th century due to pioneering work by Gottlob Frege. Although Frege’s project ultimately failed, his central ideas were reborn in the late 20th century as a view known as neo-logicism.
This article surveys abstractionism in five stages. §1 looks at the pre-19th century history of abstraction and its role in the philosophy of mathematics. §2 takes some time to carefully articulate what, exactly, abstractionism is, and to provide a detailed description of the way that abstraction is formalized, within abstractionist philosophy of mathematics, using logical formulas known as abstraction principles. §3 looks at the first fully worked out version of abstractionism—Frege’s logicist reconstruction of mathematics—and explores the various challenges that such a view faces. The section also examines the fatal flaw in Frege’s development of this view: Russell’s paradox. §4 presents a survey of the 20th century neo-logicist revival of Frege’s abstractionist program, due to Crispin Wright and Bob Hale, and carefully explicates the way in which this new version of an old idea deals with various puzzles and problems. Finally, §5 takes a brief tour of a re-development of Frege’s central ideas: Øystein Linnebo’s dynamic abstractionist account.
Abstractionism, very broadly put, is a philosophical account of the epistemology and metaphysics of mathematics (or of abstract objects more generally) according to which the nature of, and our knowledge of, the subject matter of mathematics is grounded in abstraction. More is said about the sort of abstraction that is at issue in abstractionist accounts of the foundations of mathematics below (and, in particular, more about why not every view that involves abstraction is an instance of abstractionism in the sense of the term used here), but first, something needs to be said about what, exactly, abstraction is.
Before doing so, a bit of mathematical machinery is required. Given a domain $D$ of entities (these could be objects, or properties, or some other sort of “thing”), one says that a relation $E$ is an equivalence relation on $D$ if and only if the following three conditions are met:

$E$ is reflexive (on $D$):

For any $a$ in $D$, $E(a, a)$.

$E$ is symmetric (on $D$):

For any $a, b$ in $D$, if $E(a, b)$ then $E(b, a)$.

$E$ is transitive (on $D$):

For any $a, b, c$ in $D$, if $E(a, b)$ and $E(b, c)$, then $E(a, c)$.
Intuitively, an equivalence relation $E$ partitions a collection $D$ of entities into sub-collections $C_1, C_2, C_3, \ldots$, where each $C_i$ is a subset of $D$; the $C_i$s are exclusive (no entity in $D$ is a member of more than one of the classes $C_i$); the $C_i$s are exhaustive (every entity in $D$ is in one of the classes $C_i$); and an object in one of the sub-collections is related by $E$ to every other object in that same sub-collection, and is related by $E$ to no other objects in $D$. The classes $C_i$ are known as the equivalence classes generated by $E$ on $D$.
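To make the definition concrete, here is a minimal illustrative sketch (our own toy example, not part of the standard presentation) that partitions a small domain under the assumed equivalence relation “has the same remainder modulo 3”:

```python
# Minimal sketch: partitioning a finite domain into equivalence classes.
# The relation E here ("same remainder mod 3") is an arbitrary toy choice.
def equivalence_classes(domain, E):
    classes = []
    for a in domain:
        for c in classes:
            if E(a, c[0]):        # a is related to this class's members
                c.append(a)
                break
        else:                     # a starts a new equivalence class
            classes.append([a])
    return classes

D = range(9)
same_mod_3 = lambda a, b: a % 3 == b % 3
print(equivalence_classes(D, same_mod_3))
# [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

The resulting classes are exclusive and exhaustive in exactly the sense just described.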
Abstraction is a process that begins via the identification of an equivalence relation on a class of entities—that is, a class of objects (or properties, or other sorts of “thing”) is partitioned into equivalence classes based on some shared trait. To make things concrete, let us assume that the class with which we begin is a collection of medium-sized physical objects, and let us divide this class into sub-classes of objects based on whether they are the same size (that is, the equivalence relation in question is sameness of size). We then (in some sense) abstract away the particular features of each object that distinguish it from the other objects in the same equivalence class, identifying (or creating?) an object (the abstract) corresponding to each equivalence class (and hence corresponding to or codifying the trait had in common by all and only the members of that equivalence class). Thus, in our example, we abstract away all properties, such as color, weight, or surface texture, that vary amongst objects in the same equivalence class. The novel objects arrived at by abstraction—sizes—capture what members of each equivalence class have in common, and thus we obtain a distinct size corresponding to each equivalence class of same-sized physical objects.
Discussions of abstraction, and of the nature of the abstracts so obtained, can be found throughout the history of Western philosophy, going back to Aristotle’s Posterior Analytics (Aristotle 1975). Another well-discussed ancient example is provided by (one way of interpreting) Definition 5 of Book V of Euclid’s Elements. In that definition, Euclid introduces the notion of ratio as follows:
Magnitudes are said to be in the same ratio, the first to the second and the third to the fourth, when, if any equimultiples whatever be taken of the first and third, and any equimultiples whatever of the second and fourth, the former equimultiples alike exceed, are alike equal to, or alike fall short of, the latter equimultiples respectively taken in corresponding order. (Euclid 2012, V.5)
Simply put, Euclid is introducing a complicated equivalence relation:
being in the same ratio
that holds (or not) between pairs of magnitudes. Two pairs of magnitudes $\langle a, b \rangle$ and $\langle c, d \rangle$ stand in the being in the same ratio relation if and only if, for any (positive whole) numbers $n$ and $m$ we have:

$n \cdot a > m \cdot b$ if and only if $n \cdot c > m \cdot d$;

$n \cdot a = m \cdot b$ if and only if $n \cdot c = m \cdot d$;

$n \cdot a < m \cdot b$ if and only if $n \cdot c < m \cdot d$.
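As a rough illustration of how the equimultiple condition works in practice, the following sketch checks the three biconditionals for all multipliers up to an arbitrary finite bound (a genuine verification would have to consider all positive multiples, so this is only an approximation of Euclid’s definition):

```python
# Illustrative sketch: Euclid's equimultiple test for sameness of ratio,
# checked only for multipliers below a finite bound.
def same_ratio(a, b, c, d, bound=50):
    for n in range(1, bound):
        for m in range(1, bound):
            if (n * a > m * b) != (n * c > m * d):
                return False
            if (n * a == m * b) != (n * c == m * d):
                return False
            if (n * a < m * b) != (n * c < m * d):
                return False
    return True

print(same_ratio(1, 2, 3, 6))   # True:  1 : 2 and 3 : 6 are in the same ratio
print(same_ratio(1, 2, 2, 3))   # False: 1 : 2 and 2 : 3 are not
```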
Taken literally, it is not clear that Euclid’s Definition 5 is a genuine instance of the process of abstraction, since Euclid does not seem to explicitly take the final step: introducing individual objects—that is, ratios—to “stand for” the relationship that holds between pairs of magnitudes that instantiate the being in the same ratio relation. But, to take that final step, we need merely introduce the following (somewhat more modern) notation:

$$a : b = c : d$$

where $a : b = c : d$ if and only if $a$ and $b$ stand in the same ratio to one another as $c$ and $d$. If we take the logical form of this equation at face value—that is, as asserting the identity of the ratio $a : b$ and the ratio $c : d$—then we now have our new objects, ratios, and the process of abstraction is complete.
We can (somewhat anachronistically, but nevertheless helpfully) reformulate this reconstruction of the abstraction involved in the introduction of ratios as the following Ratio Principle:

$$\forall a \forall b \forall c \forall d\, [a : b = c : d \leftrightarrow \mathrm{SameRatio}(\langle a, b \rangle, \langle c, d \rangle)]$$

where $\mathrm{SameRatio}$ is the equivalence relation defined above.
The new objects, ratios, are introduced in the identity statement on the left-hand side of the biconditional, and their behavior (in particular, identity conditions for ratios) is governed by the equivalence relation occurring on the right-hand side of the biconditional.
As this discussion of Euclid illustrates, it is often unclear (especially prior to the late 19th century, see below) whether a particular definition or discussion is meant to be an application of abstraction, since it is unclear which of the following is intended:
The definition or discussion merely introduces a new relation that holds between various sorts of object (for example, it introduces the relation being in the same ratio), but does nothing more.
The definition or discussion is meant to explicate the relationships that hold between previously identified and understood objects (for example, it precisely explains when two ratios are identical, where it is assumed that we already know, in some sense, what ratios are).
The definition or discussion is meant to introduce a new sort of object defined in terms of a relation that holds between objects of a distinct, and previously understood, sort (for example, it introduces ratios as novel objects obtained via application of the process of abstraction to the relation being in the same ratio).
Only the last of these counts as abstraction, properly understood (at least, in terms of the understanding of abstraction mobilized in the family of views known as abstractionism).
With regard to those cases that are explicit applications of abstraction—that is, cases where an equivalence relation on a previously understood class of entities is used to introduce new objects (abstracts) corresponding to the resulting equivalence classes—there are three distinct ways that the objects so introduced can be understood:
The abstract corresponding to each equivalence class is identified with a canonical representative member of that equivalence class (for example, we identify the ratio 1 : 2 with the particular pair of magnitudes ⟨1 meter, 2 meters⟩).
The abstract corresponding to each equivalence class is identified with that equivalence class (for example, we identify the ratio 1 : 2 with the equivalence class of pairs of magnitudes that are in the same ratio as ⟨1 meter, 2 meters⟩).
The abstract corresponding to each equivalence class is taken to be a novel abstract.
Historically, uses of abstraction within number theory have taken the first route, since the abstract corresponding to an equivalence class of natural numbers (or of any sub-collection of a collection of mathematical objects with a distinguished well-ordering) can always be taken to be the least number in that equivalence class. Somewhat surprisingly, perhaps, the second option—the identification of abstracts with the corresponding equivalence classes themselves—was somewhat unusual before Frege’s work. The fact that it remains unusual after Frege’s work, however, is less surprising, since the dangers inherent in this method were made clear by the set-theoretic paradoxes that plagued his work. The third option—taking the abstracts to be novel abstract objects—was relatively common within geometry by the 19th century, and it is this method that is central to the philosophical view called neo-logicism, discussed in §4 below.
This brief summary of the role of abstraction in the history of mathematics barely scratches the surface, of course, and the reader interested in a more detailed presentation of the history of abstraction prior to Frege’s work is encouraged to consult the early chapters of the excellent (Mancosu 2016). But it is enough for our purposes, since our primary target is not abstraction in general, but its use in abstractionist approaches to the philosophy of mathematics (and, as noted earlier, of abstract objects more generally).
2. Defining Abstractionism
Abstractionism, as we will understand the term here, is an account of the foundations of mathematics that involves the use of abstraction principles (or of principles equivalent to, or derived from, abstraction principles, see the discussion of dynamic abstraction in §5 below). An abstraction principle is a formula of the form:

$$\forall \alpha \forall \beta\, [@(\alpha) = @(\beta) \leftrightarrow E(\alpha, \beta)]$$

where $\alpha$ and $\beta$ range over the same type (typically objects, concepts, n-ary relations, or sequences of such), $E$ is an equivalence relation on entities of that type, and $@$ is a function from that type to objects. “$@$” is the abstraction operator, and terms of the form “$@(\alpha)$” are abstraction terms. The central idea underlying all forms of abstractionism is that abstraction principles serve to introduce mathematical concepts by providing identity conditions for the abstract objects falling under those concepts (that is, objects in the range of $@$) in terms of the equivalence relation $E$.
Since this all might seem a bit esoteric at first glance, a few examples will be useful. One of the most well-discussed abstraction principles—one that we will return to when discussing the Caesar problem in §3 below—is the Directions Principle:

$$\forall l_1 \forall l_2\, [d(l_1) = d(l_2) \leftrightarrow l_1 \parallel l_2]$$

where $l_1$ and $l_2$ are variables ranging over (straight) lines, $\parallel$ is the parallelism relation, and $d$ is an abstraction operator mapping lines to their directions. Thus, a bit more informally, this principle says something like:

For any two lines $l_1$ and $l_2$, the direction of $l_1$ is identical to the direction of $l_2$ if and only if $l_1$ is parallel to $l_2$.
On an abstractionist reading, the Directions Principle introduces the concept direction, and it provides access to new objects falling under this concept—that is, directions—via abstraction. We partition the class of straight lines into equivalence classes, where each equivalence class is a collection of parallel lines (and any line parallel to a line in one of these classes is itself in that class), and then we obtain new objects—directions—by applying the abstraction operator $d$ to a line, resulting in the direction of that line (which will be the same object as the direction of any other line in the same equivalence class of parallel lines).
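A toy model may be helpful here. In the following sketch (an illustration of our own, not part of the abstractionist proposal itself), lines are represented by hypothetical (slope, intercept) pairs, and the abstraction operator $d$ is modeled by sending each line to a canonical label for its class of parallel lines, namely its slope; this is, in effect, the “canonical representative” reading of abstraction from §1:

```python
# Illustrative sketch: directions as abstracts of lines under parallelism.
# Lines are modeled as (slope, intercept) pairs; slope None marks a
# vertical line. Parallel lines are exactly those sharing a slope.
lines = [(1.0, 0.0), (1.0, 5.0), (2.0, 1.0), (None, 3.0), (None, -2.0)]

parallel = lambda l1, l2: l1[0] == l2[0]

# The abstraction operator d, modeled by picking a canonical label
# (the slope) for each equivalence class of parallel lines.
d = lambda line: line[0]

# The Directions Principle: d(l1) = d(l2) iff l1 is parallel to l2.
print(all((d(l1) == d(l2)) == parallel(l1, l2)
          for l1 in lines for l2 in lines))   # True
```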
It should now be apparent that the Directions Principle is not the first abstraction principle that we have seen in this essay: the Ratio Principle is also an abstraction principle which serves, on an abstractionist reading, to introduce the concept ratio and whose abstraction operator provides us with new objects falling under this concept.
The Directions Principle involves a unary objectual abstraction operator $d$: that is, the abstraction operator in the Directions Principle maps individual objects (that is, individual lines) to their abstracts (that is, their directions). The Ratio Principle is a bit more complicated. It involves a binary objectual abstraction operator: the abstraction operator maps pairs of objects (that is, pairs of magnitudes) to their abstracts (that is, the ratio of that pair). But the Directions Principle and the Ratio Principle have this much in common: the argument or arguments of the abstraction operator are objectual—they are objects.
It turns out, however, that much of the philosophical discussion of abstraction principles has focused on a different, and much more powerful, kind of abstraction principle—conceptual abstraction principles. In a conceptual abstraction principle, the abstraction operator takes, not an object or sequence of objects, but a concept (or relation, or a sequence of concepts and relations, and so forth) as its argument. Here, we will be using the term “concept” in the Fregean sense, where concepts are akin to properties and are whatever it is that second-order unary variables range over, since, amongst other reasons, this is the terminology used by most of the philosophical literature on abstractionism. The reader uncomfortable with this usage can uniformly substitute “property” for “concept” throughout the remainder of this article.
Thus, a conceptual abstraction principle requires higher-order logic for its formation—for a comprehensive treatment of second- and higher-order logic, see (Shapiro 1991). The simplest kind of conceptual abstraction principle, and the kind to which we will restrict our attention in the remainder of this article, is the unary conceptual abstraction principle, which has the form:

$$\forall X \forall Y\, [@(X) = @(Y) \leftrightarrow E(X, Y)]$$

where $X$ and $Y$ are second-order variables ranging over unary concepts, and $E$ is an equivalence relation on concepts.
The two most well-known and well-studied conceptual abstraction principles are Hume’s Principle and Basic Law V. Hume’s Principle is:

$$\forall X \forall Y\, [\#(X) = \#(Y) \leftrightarrow X \approx Y]$$

where $X \approx Y$ abbreviates the purely logical second-order claim that there is a one-to-one onto mapping from the $X$s to the $Y$s, that is:

$$\exists R\, [\forall x\, (X(x) \rightarrow \exists! y\, (Y(y) \wedge R(x, y))) \wedge \forall y\, (Y(y) \rightarrow \exists! x\, (X(x) \wedge R(x, y)))]$$

Hume’s Principle introduces the concept cardinal number and the cardinal numbers that fall under that concept. Basic Law V is:

$$\forall X \forall Y\, [\S(X) = \S(Y) \leftrightarrow \forall x\, (X(x) \leftrightarrow Y(x))]$$

which (purports to) introduce the concept set or extension. As we shall see in the next section (and as is hinted in the parenthetical comment in the previous sentence), one of these abstraction principles does a decidedly better job than the other.
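The contrast between the two principles can be previewed with a small finite model (our own illustration). The sketch below interprets concepts as subsets of a three-element domain and the operator $\#$ as cardinality; Hume’s Principle comes out true, while a simple count shows why Basic Law V is already under pressure: there are more concepts than objects available to serve as their extensions.

```python
# Illustrative sketch: Hume's Principle over a three-element domain,
# with concepts modeled as subsets and #(X) modeled as len(X).
from itertools import combinations

domain = {'a', 'b', 'c'}
concepts = [set(c) for r in range(len(domain) + 1)
            for c in combinations(domain, r)]

equinumerous = lambda X, Y: len(X) == len(Y)   # X is equinumerous with Y
number_of = lambda X: len(X)                   # a toy model of #

# Hume's Principle: #(X) = #(Y) iff X is equinumerous with Y.
print(all((number_of(X) == number_of(Y)) == equinumerous(X, Y)
          for X in concepts for Y in concepts))              # True

# Basic Law V would demand a distinct abstract for each of the 2^3 = 8
# distinct concepts, yet there are only 3 objects to go around.
print(len(concepts), 'concepts vs', len(domain), 'objects')  # 8 vs 3
```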
As already noted, although the process of abstraction has been a central philosophical concern since philosophers began thinking about mathematics, abstractionism only arose once abstraction principles were introduced. And, although he was not the first to use them—again, see (Mancosu 2016)—it was the work of Gottlob Frege in the late 19th century that made abstraction principles a central concern in the philosophy of mathematics, and Frege’s logicism is the first defense of a full-blown version of abstractionism. Thus, we now turn to Frege.
3. Frege’s Logicism
Frege’s version of abstractionism is (appropriately enough, as we shall see) known as logicism. The primary motivation behind the project was to defend arithmetic and real and complex analysis (but interestingly, not geometry) from Kant’s charge that these areas of mathematics were a priori yet synthetic (Kant 1787/1999). The bulk of Frege’s defense of logicism occurs in his three great books, which can be summarized as follows:
Begriffsschrift, or Concept Script (Frege 1879/1972): Frege invents modern higher-order logic.
Die Grundlagen der Arithmetik, or The Foundations of Arithmetic (Frege 1884/1980): Frege criticizes popular accounts of the nature of mathematics, and provides an informal exposition of his logicism.
Grundgesetze der Arithmetik, or Basic Laws of Arithmetic (Frege 1893/1903/2013): Frege further develops the philosophical details of his logicism, and carries out the formal derivations of the laws of arithmetic in an extension of the logic of Begriffsschrift.
Here we will examine a reconstruction of Frege’s logicism based on both the Grundlagen and Grundgesetze. It should be noted, however, that there are subtle differences between the project informally described in the Grundlagen and the project carried out formally in Grundgesetze, differences we will for the most part ignore here. For discussion of some of these differences, see (Heck 2013) and (Cook & Ebert 2016). We will also carry out this reconstruction in contemporary logical formalism, but it should also be noted that Frege’s logical system differs from contemporary higher-order logic in a number of crucial respects. For discussion of some of these differences, see (Heck 2013) and (Cook 2013).
As noted, Frege’s main goal was to argue that arithmetic was analytic. Frege’s understanding of the analytic/synthetic distinction, much like his account of the a priori/a posteriori distinction, has a decidedly epistemic flavor:
Now these distinctions between a priori and a posteriori, synthetic and analytic, concern not the content of the judgement but the justification for making the judgement. Where there is no such justification, the possibility of drawing the distinctions vanishes. An a priori error is thus as complete a nonsense as, say, a blue concept. When a proposition is called a posteriori or analytic in my sense, this is not a judgement about the conditions, psychological, physiological and physical, which have made it possible to form the content of the proposition in our consciousness; nor is it a judgement about the way in which some other man has come, perhaps erroneously, to believe it true; rather, it is a judgement about the ultimate ground upon which rests the justification for holding it to be true. (Frege 1884/1980, §3)
In short, on Frege’s view, whether or not a claim is analytic or synthetic, a priori or a posteriori, depends on the kind of justification that it would be appropriate to give for that judgment (or judgments of that kind). Frege fills in the details regarding exactly what sorts of justification are required for analyticity and aprioricity later in the same section:
The problem becomes, in fact, that of finding the proof of the proposition, and of following it up right back to the primitive truths. If, in carrying out this process, we come only on general logical laws and on definitions, then the truth is an analytic one, bearing in mind that we must take account also of all propositions upon which the admissibility of any of the definitions depends. If, however, it is impossible to give the proof without making use of truths which are not of a general logical nature, but belong to some special science, then the proposition is a synthetic one. For a truth to be a posteriori, it must be impossible to construct a proof of it without including an appeal to facts, that is, to truths which cannot be proved and are not general, since they contain assertions about particular objects. But if, on the contrary, its proof can be derived exclusively from general laws, which themselves neither need nor admit of proof, then the truth is a priori. (Frege 1884/1980, §3)
Thus, for Frege, a judgment is analytic if and only if it has a proof that depends solely upon logical laws and definitions, and a judgment is a priori if and only if it has a proof that depends only upon self-evident, general truths. All logical laws and definitions are self-evident general truths, but not vice versa. This explains the fact mentioned earlier, that Frege did not think his logicism applicable to geometry. For Frege, geometry relied on self-evident general truths about the nature of space, but these truths were neither logical truths nor definitions—hence geometry was a priori, but not analytic.
Thus, Frege’s strategy for refuting Kant’s claim that arithmetic was synthetic was simple: logic (and anything derivable from logic plus definitions) is analytic, hence, if we reduce arithmetic to logic, then we will have shown that arithmetic is analytic after all (and similarly for real and complex analysis, and so forth).
Before digging into the details of Frege’s attempt to achieve this reduction of arithmetic to logic, however, a few points of clarification are worth making. First, as we shall see below, not all versions of abstractionism are versions of logicism, since not all versions of abstractionism will take abstraction principles to be truths of logic. The converse fails as well: Not all versions of logicism are versions of abstractionism: (Tennant 1987) contains a fascinating constructivist, proof-theoretically oriented attempt to reduce arithmetic to logic that, although it involves operators that are typeset similarly to our abstraction operator $@$, nevertheless involves no abstraction principles. Second, Frege’s actual primary target was neither to show that arithmetic was logical nor to show that it could be provided a foundation via abstraction generally or via abstraction principles in particular. His primary goal was to show that arithmetic was, contra Kant, analytic, and both the use of abstraction principles and the defense of these principles as logical truths were merely parts of this project. These distinctions are important to note, not only because they are, after all, important, but also because the terminology for the various views falling under the umbrella of abstractionism is not always straightforwardly accurate (for example, neo-logicism is not a “new” version of logicism).
The first half of Grundlagen is devoted to Frege’s unsparing refutation of a number of then-current views regarding the nature of mathematical entities and the means by which we obtain mathematical knowledge, including the views put forth by Leibniz, Mill, and Kant. While these criticisms are both entertaining and, for the most part, compelling, it is Frege’s brief comments on Hume that are most relevant for our purposes. In his discussion of Hume, Frege misattributes a principle to him that becomes central both to his own project and to the later neo-logicist programs discussed below—the abstraction principle known (rather misleadingly) as Hume’s Principle.
a. Hume’s Principle and Frege’s Theorem
Frege begins by noting that Hume’s Principle looks rather promising, in many ways, as a potential definition of the concept cardinal number. First, despite the fact that this abstraction principle is likely not what Hume had in mind when he wrote that:
When two numbers are so combined as that the one has always an unit answering to every unit of the other we pronounce them equal; and it is for want of such a standard of equality in extension that geometry can scarce be esteemed a perfect and infallible science. (Hume 1888, I.iii.1)
Hume’s Principle nevertheless seems to codify a plausible idea regarding the nature of cardinal number: two numbers $n$ and $m$ are the same if and only if, for any two concepts $X$ and $Y$ where the number of $X$s is $n$ and the number of $Y$s is $m$, there is a one-one onto mapping from the $X$s to the $Y$s. Second, and much more importantly for our purposes, Hume’s Principle, plus some explicit definitions formulated in terms of higher-order logic plus the abstraction operator $\#$, allows us to prove all of the second-order axioms of Peano Arithmetic:
Dedekind-Peano Axioms:

$$\mathbb{N}(0)$$

$$\neg \exists x\, (\mathbb{N}(x) \wedge P(x, 0))$$

$$\forall x\, (\mathbb{N}(x) \rightarrow \exists y\, (\mathbb{N}(y) \wedge P(x, y)))$$

$$\forall x \forall y \forall z\, ((\mathbb{N}(x) \wedge \mathbb{N}(y) \wedge P(x, z) \wedge P(y, z)) \rightarrow x = y)$$

$$\forall X\, ((X(0) \wedge \forall x \forall y\, ((\mathbb{N}(x) \wedge X(x) \wedge P(x, y)) \rightarrow X(y))) \rightarrow \forall x\, (\mathbb{N}(x) \rightarrow X(x)))$$
We can express the Peano Axioms a bit more informally as:
Zero is a natural number.
No natural number is the predecessor of zero.
Every natural number is the predecessor of some natural number.
If two natural numbers are the predecessor of the same natural number, then they are identical.
Any property that holds of zero, and holds of a natural number if it holds of the predecessor of that natural number, holds of all natural numbers.
The definitions of zero, the predecessor relation, and the natural number predicate are of critical importance to Frege’s reconstruction of arithmetic. The definitions of zero and of the predecessor relation are relatively simple. Zero is just the cardinal number of the empty concept:

$$0 =_{df} \#([x : x \neq x])$$

The predecessor relation is defined as:

$$P(m, n) =_{df} \exists X \exists y\, (X(y) \wedge n = \#(X) \wedge m = \#([z : X(z) \wedge z \neq y]))$$

Thus, $P$ holds between two objects $m$ and $n$ (that is, $m$ is the predecessor of $n$) just in case there is some concept $X$ and object $y$ falling under $X$ such that $n$ is the cardinal number of $X$ (that is, it is the number of $X$s) and $m$ is the cardinal number of the concept that holds of exactly the objects that $X$ holds of, except for $y$ (that is, it is the number of the $X$s that are not $y$).
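For illustration, the definition can be checked in a toy model where finite concepts are represented as Python sets and $\#(X)$ is modeled as len(X) (an assumption that is adequate only for finite concepts):

```python
# Illustrative sketch of Frege's definition of predecessor in a toy
# finite model: concepts are sets, #(X) is len(X).
from itertools import combinations

def pred(m, n, domain):
    # P(m, n): there is a concept X and an object y falling under X such
    # that n is the number of Xs and m is the number of Xs other than y.
    for r in range(1, len(domain) + 1):
        for X in map(set, combinations(domain, r)):
            for y in X:
                if len(X) == n and len(X - {y}) == m:
                    return True
    return False

D = {'a', 'b', 'c', 'd'}
print(pred(2, 3, D))   # True: 2 is the predecessor of 3
print(pred(1, 3, D))   # False: 1 is not the predecessor of 3
```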
Constructing the definition of the natural number concept is somewhat more complicated, however. First, we need to define the notion of a concept $X$ being hereditary on a relation $R$:

$$\mathrm{Her}(X, R) =_{df} \forall x \forall y\, ((X(x) \wedge R(x, y)) \rightarrow X(y))$$

Intuitively, $X$ is hereditary on $R$ if and only if, whenever we have two objects $a$ and $b$, if $a$ falls under the concept $X$, and $a$ is related by $R$ to $b$, then $b$ must fall under $X$ as well.
Next, Frege uses hereditariness to define the strong ancestral of a relation $R$:

$$R^*(a, b) =_{df} \forall X\, ((\forall z\, (R(a, z) \rightarrow X(z)) \wedge \mathrm{Her}(X, R)) \rightarrow X(b))$$

The definition of the ancestral is imposing, but the idea is straightforward: given a relation $R$, the strong ancestral of $R$ is a second relation $R^*$ such that $R^*$ holds between two objects $a$ and $b$ if and only if there is a sequence of objects:

$$a = x_1, x_2, x_3, \ldots, x_{n-1}, x_n = b$$

such that:

$$R(x_1, x_2), R(x_2, x_3), \ldots, R(x_{n-1}, x_n)$$

This operation is called the ancestral for a reason: the relation that holds between oneself and one’s ancestors is the ancestral of the parenthood relation.
For Frege’s purposes, a slightly weaker notion—the weak ancestral—turns out to be a bit more convenient:

$$R^{*=}(a, b) =_{df} R^*(a, b) \vee a = b$$

The weak ancestral of a relation $R$ holds between two objects $a$ and $b$ just in case either the strong ancestral does, or $a$ and $b$ are identical. Returning to our intuitive genealogical example, the difference between the weak ancestral and the strong ancestral of the parenthood relation is that the weak ancestral holds between any person and themselves. Thus, it is the strong ancestral that most closely corresponds to the everyday notion of ancestor, since we do not usually say that someone is their own ancestor.
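Computationally, the strong ancestral of a (finite) relation is just its transitive closure, and the weak ancestral results from adding the identity pairs. Here is an illustrative sketch using a hypothetical parenthood relation:

```python
# Illustrative sketch: strong ancestral as transitive closure, weak
# ancestral as its reflexive variant.
parent = {('Ann', 'Ben'), ('Ben', 'Cam'), ('Cam', 'Dee')}

def strong_ancestral(R):
    closure = set(R)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

ancestor = strong_ancestral(parent)
print(('Ann', 'Dee') in ancestor)   # True: Ann is an ancestor of Dee
print(('Ann', 'Ann') in ancestor)   # False: no one is their own ancestor

people = {x for pair in parent for x in pair}
weak_ancestor = ancestor | {(x, x) for x in people}
print(('Ann', 'Ann') in weak_ancestor)   # True under the weak ancestral
```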
Finally, we can define the natural numbers as those objects $x$ such that the weak ancestral of the predecessor relation holds between zero and $x$:

$$\mathbb{N}(x) =_{df} P^{*=}(0, x)$$

In other words, an object is a natural number if and only if either it is 0, or 0 is its predecessor (that is, it is 1), or 0 is the predecessor of its predecessor (that is, it is 2), or 0 is the predecessor of the predecessor of its predecessor (that is, it is 3), and so forth.
It is worth noting that all of this work defining the concept of natural number is, in fact, necessary. One might think at first glance that we could just take the following notion of cardinal number:

$$C(x) =_{df} \exists Y\, (x = \#(Y))$$

and use that instead of the much more complicated $\mathbb{N}(x)$. This, however, won’t work: Since Hume’s Principle entails all of the Peano Axioms for arithmetic, it thereby entails that there are infinitely many objects (since there are infinitely many natural numbers). Hence there is a cardinal number—that is, an object falling under $C(x)$—that is not a finite natural number, namely anti-zero, the number of the universal concept (the term “anti-zero” is due to (Boolos 1997)):

$$\text{anti-zero} =_{df} \#([x : x = x])$$

Infinite cardinal numbers like anti-zero do not satisfy the Peano Axioms (anti-zero is its own predecessor, for example); thus, if we are to do arithmetic based on Hume’s Principle, we need to restrict our attention to those numbers falling under $\mathbb{N}(x)$.
In the Grundlagen Frege sketches a proof that, given these definitions, we can prove the Peano Axioms, and he carries it out in full formal detail in Grundgesetze. This result, which is a significant mathematical result independently of its importance to abstractionist accounts of the foundations of mathematics, has come to be known as Frege’s Theorem. The derivation of the Peano Axioms from Hume’s Principle plus these definitions is long and complicated, and we will not present it here. The reader interested in reconstructions of, and discussions of, the proof of Frege’s Theorem should consult (Wright 1983), (Boolos 1990a), (Heck 1993), and (Boolos & Heck 1998).
b. Hume’s Principle and the Caesar Problem
This all looks quite promising so far. We have an abstraction principle that introduces the concept cardinal number (and, as our definitions above demonstrate, the sub-concept natural number), and this abstraction principle entails a quite strong (second-order) version of the standard axioms for arithmetic. In addition, although Frege did not prove this, Hume’s Principle is consistent. We can build a simple model as follows. Let the domain be the natural numbers $\{0, 1, 2, 3, \ldots\}$, and then interpret the abstraction operator as follows:

$\#(X) = n + 1$ if $X$ holds of exactly $n$ objects, and $\#(X) = 0$ if $X$ holds of infinitely many objects.

This simple argument can be extended to show that Hume’s Principle has models whose domains are of size $\kappa$ for any infinite cardinal $\kappa$ (Boolos 1987). Thus, Hume’s Principle seems like a good candidate for an abstractionist definition of the concept cardinal number.
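The model just described can be made concrete for concepts that are either finite or cofinite (that is, concepts that exclude only finitely many numbers). In the sketch below (illustrative only; the representation scheme is our assumption, not part of Boolos’s presentation), such a concept is given either by its finite membership or by its finite set of exceptions:

```python
# Illustrative sketch of the model of Hume's Principle on the natural
# numbers: #(X) = n + 1 if X has exactly n members, and #(X) = 0 if X
# is infinite (here: cofinite, given by its finite set of exceptions).
def number_of(concept):
    kind, members = concept
    return len(members) + 1 if kind == 'finite' else 0

def equinumerous(c1, c2):
    # Finite concepts are equinumerous iff they have the same size; any
    # two cofinite subsets of the naturals are countably infinite.
    if c1[0] == c2[0] == 'finite':
        return len(c1[1]) == len(c2[1])
    return c1[0] == c2[0]

samples = [('finite', frozenset()), ('finite', frozenset({2, 5})),
           ('finite', frozenset({1, 3})), ('cofinite', frozenset({0}))]
print(all((number_of(a) == number_of(b)) == equinumerous(a, b)
          for a in samples for b in samples))   # True: HP holds here
```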
Frege, however, rejected the idea that Hume’s Principle could serve as a definition of cardinal number. This was not because he was worried that Hume’s Principle failed to be true, or even that it failed to be analytic. On the contrary, as we shall see below, Frege eventually proves a version of Hume’s Principle from other principles that he takes to be logical truths, and hence analytic. Thus, the proved version of Hume’s Principle (were Frege’s project successful) would inherit the analyticity of the principles used to prove it.
Frege instead rejects Hume’s Principle as a definition of the concept cardinal number because it does not settle questions regarding which particular objects the numbers are—questions that, on Frege’s view, an adequate definition should settle. In particular, although abstraction principles provide us with a criterion for determining whether or not two abstracts of the same kind—that is, two abstracts introduced by the same abstraction principle—are identical, they are silent with regard to whether, or when, an abstract introduced by an abstraction principle might be identical to an object introduced by some other means. Frege raises this problem with respect to Hume’s Principle as follows:
. . . but we can never—to take a crude example—decide by means of our definitions whether any concept has the number Julius Caesar belonging to it, or whether that conqueror of Gaul is a number or is not. (Frege 1884/1980, §55)
and he returns to the problem again, pointing out that the Directions Principle fares no better:
It will not, for instance, decide for us whether England is the same as the direction of the Earth’s axis—if I may be forgiven an example which looks nonsensical. Naturally no one is going to confuse England with the direction of the Earth’s axis; but that is no thanks to our definition of direction. (Frege 1884/1980, §66)
The former passage has led to this problem being known as the Caesar Problem.
The root of the Caesar Problem is this. Although abstraction principles provide criteria for settling identities between pairs of abstraction terms of the same type—hence Hume’s Principle provides a criterion for settling identities of the form:

$$\#(X) = \#(Y)$$

for any concepts $X$ and $Y$—abstraction principles do not provide any guidance for settling identities where one of the terms is not an abstraction term. In short, and using our favorite example, Hume’s Principle provides no guidance for settling any identities of the form:

$$\#(X) = t$$

where $t$ is not an abstraction term (hence might be an everyday name like “England” or “Julius Caesar”). Both:

$$\#(X) = t$$

and:

$$\#(X) \neq t$$

can be consistently added to Hume’s Principle (although obviously not both at once).
Frege’s worry here is not that, as a result of this, we are left wondering whether the number seven really is identical to Julius Caesar. As he notes, we know that it is not. The problem is that an adequate definition of the concept natural number should tell us this, and Hume’s Principle fails to weigh in on the matter.
That being said, Frege’s worry does not stem from thinking that a definition of a mathematical concept should answer all questions about that concept (after all, the definition of cardinal number should not be expected to tell us what Frege’s favorite cardinal number was). Rather, Frege is concerned here with the idea that a proper definition of a concept should, amongst other things, draw a sharp line between those things that fall under the concept and those that do not—that is, a definition of a mathematical concept should determine the kinds of objects that fall under that concept. Hume’s Principle does not accomplish this, and thus it cannot serve as a proper definition of the concept in question. We will return to the Caesar Problem briefly in our discussion of neo-logicism below. But first, we need to look at Frege’s response.
c. Hume’s Principle and Basic Law V
Since Frege rejected the idea that Hume’s Principle could serve as a definition of cardinal number, but appreciated the power and simplicity that the reconstruction of Peano Arithmetic based on Hume’s Principle provided, he devised a clever strategy: to provide an explicit definition of cardinal number that depended on previously accepted and understood principles, and then derive Hume’s Principle using those principles and the explicit definition in question.
As a result, there are two main ingredients in Frege’s final account of the concept cardinal number. The first is the following explicit definition of the concept in question (noting that “equal” here indicates equinumerosity, not identity):
My definition is therefore as follows:
The number which belongs to the concept $F$ is the extension of the concept “equal to the concept $F$”. (Frege 1884/1980, §68)
Thus, Frege’s definition of cardinal numbers specifies that the cardinal numbers are a particular type of extension. But of course, this isn’t very helpful until we know something about extensions. Thus, the second central ingredient in the account is a principle that governs extensions of concepts generally—a principle we have already seen: Basic Law V.
We should pause here to note that the version of Basic Law V that Frege utilized in Grundgesetze did not assign extensions to concepts, but instead assigned value-ranges to functions. Thus, a better (but still slightly anachronistic) way to represent Frege’s version of this principle would be something like:

$$\forall f \forall g\, [\S(f) = \S(g) \leftrightarrow \forall x\, (f(x) = g(x))]$$

where $f$ and $g$ range over unary functions from objects to objects. Since Frege thought that concepts were a special case of functions (in particular, a concept is a function that maps each object to either the True or the False), the conceptual version of Basic Law V given in §2 above is a special case of Frege’s basic law. Hence, we will work with the conceptual version here and below, since (i) this allows our discussion of Frege to align more neatly with our discussion of neo-logicism in the next section, and (ii) any derivation of a contradiction from a special case of a general principle is likewise a derivation of a contradiction from the general principle itself.
Given Basic Law V, we can formalize Frege’s definition of cardinal number as follows:

$$\#(F) =_{df} \S([x : \exists G\, (x = \S(G) \wedge G \approx F)])$$

where $\S$ is the abstraction operator found in Basic Law V, which maps each concept to its extension. In other words, on Frege’s account the cardinal number corresponding to a concept $F$ is the extension (or “value-range”, in Frege’s terminology) of the concept which holds of an object just in case it is the extension of a concept that is equinumerous to $F$.
Frege informally sketches a proof that Hume’s Principle follows from Basic Law V plus this definition in Grundlagen, and he provides complete formal proofs in Grundgesetze. For a careful discussion of this result, see (Heck 2013). Thus, Basic Law V plus this definition of cardinal number entails Hume’s Principle, which then (with a few more explicit definitions) entails full second-order Peano Arithmetic. So what went wrong? Why aren’t we all Fregean logicists?
d. Basic Law V and Russell’s Paradox
Before looking at what actually did go wrong, it is worth heading off a potential worry that one might have at this point. As already noted, Frege rejected Hume’s Principle as a definition of cardinal number because of the Caesar Problem. But Basic Law V, like Hume’s Principle, is an abstraction principle. And, given any abstraction principle:

$$\forall \alpha \forall \beta\, [@(\alpha) = @(\beta) \leftrightarrow E(\alpha, \beta)]$$

if it is consistent, then it will entail neither:

$$@(\alpha) = t$$

nor:

$$@(\alpha) \neq t$$

(where $t$ is not an abstraction term). Since Frege obviously believed that Basic Law V was consistent, he should have also realized that it fails to settle the very sorts of identity claims that led to his rejection of Hume’s Principle. Thus, shouldn’t Frege have rejected Basic Law V for the same reasons?
The answer is “no”, and the reason is simple: Frege did not take Basic Law V to be a definition of the concept extension. As just noted, he couldn’t, due to the Caesar Problem. Instead, Frege merely claims that Basic Law V is exactly that—a basic law, or a basic axiom of the logic that he develops in Grundgesetze. Frege never provides a definition of extension, and he seems to think that a definition of this concept is not required. For example, at the end of a footnote in Grundlagen suggesting that, in the definition of cardinal number given above, we could replace “extension of a concept” with just “concept”, he says that:
I assume that it is known what the extension of a concept is. (Frege 1884/1980, §69)
Thus, this is not the reason that Frege’s project failed.
The reason that Frege’s logicism did ultimately fail, however, is already hinted at in our discussion of Basic Law V and the Caesar Problem. Note that we took a slight detour through an arbitrary (consistent) abstraction principle in order to state that (non-)worry. The reason for this complication is simple: Basic Law V does prove one or the other of:

$$\S(X) = t$$

and:

$$\S(X) \neq t$$
In fact, it proves both (and any other formula, for that matter), because it is inconsistent.
In 1902, just as the second volume of Grundgesetze was going to press, Frege received a letter from a young British logician by the name of Bertrand Russell. In the letter Russell sketched a derivation of a contradiction within the logical system of Grundgesetze—one which showed the inconsistency of Basic Law V in particular. We can reconstruct the reasoning as follows. First, consider the (“Russell”) concept expressed by the following predicate:

$$R(x) =_{df} \exists Y\, (x = \S(Y) \wedge \neg Y(x))$$

Simply put, the Russell concept holds of an object $x$ just in case that object is the extension of a concept that does not hold of $x$. Now, clearly, if extensions are coherent at all, then the extension of this concept should be self-identical—that is:

$$\S(R) = \S(R)$$

which, by the definition of $R$, gives us:

$$R(\S(R)) \leftrightarrow \exists Y\, (\S(R) = \S(Y) \wedge \neg Y(\S(R)))$$

We then apply Basic Law V to obtain:

$$R(\S(R)) \leftrightarrow \exists Y\, (\forall x\, (R(x) \leftrightarrow Y(x)) \wedge \neg Y(\S(R)))$$

An application of universal instantiation, replacing the variable $x$ with $\S(R)$, provides (after simplification):

$$R(\S(R)) \rightarrow \neg R(\S(R))$$

The following is a truth of higher-order logic:

$$\neg R(\S(R)) \rightarrow \exists Y\, (\forall x\, (R(x) \leftrightarrow Y(x)) \wedge \neg Y(\S(R)))$$

Given Basic Law V, however, the preceding claim is equivalent to:

$$\neg R(\S(R)) \rightarrow \exists Y\, (\S(R) = \S(Y) \wedge \neg Y(\S(R)))$$

and the consequent here is, by the definition of $R$, just $R(\S(R))$, so we have:

$$\neg R(\S(R)) \rightarrow R(\S(R))$$

But now we can combine this with the conditional obtained earlier, to get:

$$R(\S(R)) \leftrightarrow \neg R(\S(R))$$

an obvious contradiction.
This paradox is known as Russell’s Paradox, and is often presented in a somewhat different context—naïve set theory—where it involves, not Frege’s abstraction-principle based extension operator, but consideration of the set of all sets that are not members of themselves.
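The cardinality mismatch driving the paradox can also be exhibited by brute force on a small domain (our own illustration). The sketch below checks every possible assignment of extensions to concepts over a two-element domain and confirms that none satisfies Basic Law V’s demand that non-coextensional concepts receive distinct extensions:

```python
# Illustrative sketch: Basic Law V fails on a two-element domain, since
# it requires an injective map from 2^2 = 4 concepts into 2 objects.
from itertools import combinations, product

domain = ['a', 'b']
concepts = [frozenset(c) for r in range(len(domain) + 1)
            for c in combinations(domain, r)]

def satisfies_blv(assignment):
    # Concepts modeled as sets are coextensional iff they are equal, so
    # BLV demands that all four concepts receive distinct extensions.
    return len(set(assignment)) == len(concepts)

print(any(satisfies_blv(assignment)
          for assignment in product(domain, repeat=len(concepts))))  # False
```

The same pigeonhole pressure (Cantor’s theorem) is at work on every domain, finite or infinite.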
After receiving Russell’s letter, Frege added an Afterword to the second volume of Grundgesetze, where he proposed an amended version of Basic Law V that stated, roughly put, that two concepts receive the same extension if and only if they hold of exactly the same objects except possibly disagreeing on their (shared) extension. This version turned out to have similar problems. For a good discussion, see (Cook 2019).
Eventually, however, Frege abandoned logicism. Other efforts to reduce all of mathematics to logic were attempted, the most notable of which was Bertrand Russell and Alfred North Whitehead’s attempted reduction of arithmetic to a complicated logical theory known as ramified type theory in their three-volume Principia Mathematica (Russell & Whitehead 1910/1912/1913). But while the system of Principia Mathematica adopted Frege’s original idea of reducing mathematics to logic, it did not do so via the mobilization of abstraction principles, and hence is somewhat orthogonal to our concerns. The next major chapter in abstractionist approaches to mathematics would not occur for almost a century.
4. Neo-Logicism
The revival of abstractionism in the second half of the 20th century is due in no small part to the publication of Crispin Wright’s Frege’s Conception of Numbers as Objects (Wright 1983), although other publications from around this time, such as (Hodes 1984), explored some of the same ideas. In this work Wright notes that Hume’s Principle, unlike Basic Law V, is consistent. Thus, given Frege’s Theorem, which ensures that full second-order Peano Arithmetic follows from Hume’s Principle plus the definitions covered in the last section, we can arrive at something like Frege’s original logicist project if we can defend Hume’s Principle as (or as something much like) an implicit definition of the concept cardinal number. In a later essay Wright makes the point as follows:
Frege’s Theorem will ensure . . . that the fundamental laws of arithmetic can be derived within a system of second order logic augmented by a principle whose role is to explain, if not exactly to define, the general notion of identity of cardinal number. . . If such an explanatory principle . . . can be regarded as analytic, then that should suffice . . . to demonstrate the analyticity of arithmetic. Even if that term is found troubling, as for instance by George Boolos, it will remain that Hume’s Principle—like any principle serving to implicitly define a certain concept—will be available without significant epistemological presupposition . . . Such an epistemological route would be an outcome still worth describing as logicism. (Wright 1997, 210–211)
Subsequent work on neo-logicism has focused on a number of challenges.
The first, and perhaps most obvious, is to fully develop the story whereby abstraction principles are implicit definitions of mathematical concepts that not only provide us with terminology for talking about the abstract objects in question, but somehow guarantee that those objects exist. The account in question has been developed for the most part in individual and joint essays by Crispin Wright and Bob Hale—many of these essays are contained in the excellent collection (Hale & Wright 2001a). The central idea underlying the approach is a principle called the syntactic priority thesis, which, although it has its roots in Frege’s work, finds perhaps its earliest explicit statement in Wright’s Frege’s Conception of Numbers as Objects (but see also (Dummett 1956)):
When it has been established . . . that a given class of terms are functioning as singular terms, and when it has been verified that certain appropriate sentences containing them are, by ordinary criteria, true, then it follows that those terms do genuinely refer. (Wright 1983, 14)
This principle turns the intuitive account of the connection between singular terms and the objects to which they purport to refer on its head. Instead of explaining when a singular term refers, and to what it refers, in terms of (in some sense) prior facts regarding the existence of certain objects (in particular, the objects to which the terms in question purport to refer), the syntactic priority thesis instead explains what it is for certain sorts of object to exist in terms of (in some sense) prior facts regarding whether or not appropriate singular terms appear in true (atomic) sentences.
Wright and Hale then argue that, first, the apparent singular terms (that is, abstraction terms) appearing on the left-hand side of abstraction principles such as Hume’s Principle are genuine singular terms, and, second, that Hume’s Principle serves as a genuine definition of these terms, guaranteeing that there are true atomic sentences that contain those terms. In particular, since for any concept $X$:

$$X \approx X$$

is a logical truth, Hume’s Principle entails that any identity claim of the form:

$$\#(X) = \#(X)$$

is true. As a result, terms of the form $\#(X)$ refer (and refer to the abstract objects known as cardinal numbers). Hence, both the existence of the abstract objects that serve as the subject matter of arithmetic, and our ability to obtain knowledge of such objects, are guaranteed.
a. Neo-Logicism and Comprehension
Another problem that the neo-logicist faces involves responding to Russell’s Paradox. Neo-logicism involves the claim that abstraction principles are implicit definitions of mathematical concepts. But, as Russell’s Paradox makes clear, it would appear that not every abstraction principle can play this role. Thus, the neo-logicist owes us an account of the line that divides the acceptable abstraction principles—that is, the ones that serve as genuine definitions of mathematical concepts—from those that are unacceptable.
Before looking at ways we might draw such a line between acceptable and unacceptable abstraction principles, it is worth noting that proceeding in this fashion is not forced upon the neo-logicist. In our presentation of Russell’s Paradox in the previous section, a crucial ingredient of the argument was left implicit. The second-order quantifiers in an abstraction principle such as Basic Law V range over concepts, and hence Basic Law V tells us, in effect, that each distinct concept receives a distinct extension. But, in order to get the Russell’s Paradox argument going, we need to know that there is a concept corresponding to the Russell predicate $R(x)$.

Standard accounts of second-order logic ensure that there is a concept corresponding to each predicate by including the comprehension scheme:

Comprehension: For any formula $\Phi(x)$ where $X$ does not occur free in $\Phi(x)$:

$$\exists X\, \forall x\, (X(x) \leftrightarrow \Phi(x))$$

Frege did not have an explicit comprehension principle in his logic, but instead had inference rules that amounted to the same thing. If we substitute $R(x)$ in for $\Phi(x)$ in the comprehension scheme, it follows that there is a concept corresponding to $R(x)$, and hence we can run the Russell’s Paradox reasoning.
But now that we have made the role of comprehension explicit, another response to Russell’s Paradox becomes apparent. Why not reject comprehension, rather than rejecting Basic Law V? In other words, maybe it is the comprehension scheme that is the problem, and Basic Law V (and in fact any abstraction principle) is acceptable.
Of course, we don’t want to just drop comprehension altogether, since then we have no guarantee that any concepts exist, and as a result there is little point to the second-order portion of our second-order logic. Instead, the move being suggested is to replace comprehension with some restricted version that entails the existence of enough concepts that abstraction principles such as Hume’s Principle and Basic Law V can do significant mathematical work for us, but does not entail the existence of concepts, like the one corresponding to the Russell predicate, that lead to contradictions. A good bit of work has been done exploring such approaches. For example, we might consider reformulating the comprehension scheme so that it only applies to predicates that are predicative (that is, contain no bound second-order variables) or are $\Delta^1_1$ (that is, are equivalent both to a formula all of whose second-order quantifiers are universal and appear at the beginning of the formula, and to a formula all of whose second-order quantifiers are existential and appear at the beginning of the formula). (Heck 1996) shows that Basic Law V is consistent with the former version of comprehension, and (Wehmeier 1999) and (Ferreira & Wehmeier 2002) show that Basic Law V is consistent with the latter (considerably stronger) version.
One problem with this approach is that if we restrict the comprehension principles used in our neo-logicist reconstruction of mathematical theories, then the quantifiers that occur in the theories so reconstructed are weakened as well. Thus, if we adopt comprehension restricted to some particular class of predicates, then even if we can prove the induction axiom for arithmetic:

$$\forall X\, ((X(0) \wedge \forall x \forall y\, ((\mathbb{N}(x) \wedge X(x) \wedge P(x, y)) \rightarrow X(y))) \rightarrow \forall x\, (\mathbb{N}(x) \rightarrow X(x)))$$
it is not clear that we have what we want. The problem is that, in this situation, we have no guarantee that induction will hold of all predicates that can be formulated in our (second-order) language, but instead are merely guaranteed that induction will hold for those predicates that are in the restricted class to which our favored version of comprehension applies. It is not clear that this should count as a genuine reconstruction of arithmetic, since induction is clearly meant to hold for any meaningful condition whatsoever (and presumably any condition that can be formulated within second-order logic is meaningful). As a result, most work on neo-logicism has favored the other approach: retain full comprehension, accept that Basic Law V is inconsistent, and then search for philosophically well-motivated criteria that separate the good abstraction principles from the bad.
b. Neo-Logicism and the Bad Company Problem
At first glance, one might think that the solution to this problem is obvious: Can’t we just restrict our attention to the consistent abstraction principles? After all, isn’t that the difference between Hume’s Principle and Basic Law V—the former is consistent, while the latter is not? Why not just rule out the inconsistent abstraction principles, and be done with it?
Unfortunately, things are not so simple. First off, it turns out that there is no decision procedure for determining which abstraction principles are consistent and which are not. In other words, there is no procedure or algorithm that will tell us, of an arbitrary abstraction principle, whether that abstraction principle implies a contradiction (like Basic Law V) or not (like Hume’s Principle). See (Heck 1992) for a simple proof.
Second, and even more worrisome, is the fact that the class of individually consistent abstraction principles is not itself consistent. In other words, there are pairs of abstraction principles such that each of them is consistent, but they are incompatible with each other. A simple example is provided by the Nuisance Principle:

$$\forall X \forall Y\, [\nu(X) = \nu(Y) \leftrightarrow \mathrm{Fin}([z : (X(z) \wedge \neg Y(z)) \vee (Y(z) \wedge \neg X(z))])]$$

where $\mathrm{Fin}(Z)$ abbreviates the purely logical second-order claim that there are only finitely many $Z$s. This abstraction principle, first discussed in (Wright 1997), is a simplification of a similar example given in (Boolos 1990a). Informally, this principle says that the nuisance of $X$ is identical to the nuisance of $Y$ if and only if the collection of things that either fall under $X$ but not $Y$, or fall under $Y$ but not $X$, is finite. Even more simply, the nuisance of $X$ is identical to the nuisance of $Y$ if and only if $X$ and $Y$ differ on at most finitely many objects.
Now, the Nuisance Principle is consistent—in fact, it has models of size $n$ for any finite number $n$. The problem, however, is that it has no infinite models. Since, as we saw in our discussion of Frege’s Theorem, Hume’s Principle entails the existence of infinitely many cardinal numbers, and thus all of its models have infinite domains, there is no model that makes both the Nuisance Principle and Hume’s Principle true. Thus, restricting our attention to consistent abstraction principles won’t do the job.
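The finite half of this situation is easy to exhibit (again, our own illustration): over a finite domain the symmetric difference of any two concepts is automatically finite, so the Nuisance Principle collapses all concepts into a single equivalence class, and one nuisance suffices:

```python
# Illustrative sketch: over a finite domain every pair of concepts
# differs on only finitely many objects, so all concepts are
# NP-equivalent and a single abstract serves for all of them.
from itertools import combinations

domain = {'a', 'b', 'c'}
concepts = [frozenset(c) for r in range(len(domain) + 1)
            for c in combinations(domain, r)]

# The largest symmetric difference between any two concepts is 3,
# which is finite; hence all 2^3 = 8 concepts share one nuisance.
print(max(len(X ^ Y) for X in concepts for Y in concepts))  # 3
```

The incompatibility with Hume’s Principle, by contrast, only emerges on infinite domains and so cannot be exhibited by a finite computation.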
Unsurprisingly, Wright did not leave things there, and in the same essay in which he presents the Nuisance Principle he proposes a solution to the problem:
A legitimate abstraction, in short, ought to do no more than introduce a concept by fixing truth conditions for statements concerning instances of that concept . . . How many sometime, someplace zebras there are is a matter between that concept and the world. No principle which merely assigns truth conditions to statements concerning objects of a quite unrelated, abstract kind—and no legitimate second-order abstraction can do any more than that—can possibly have any bearing on the matter. What is at stake . . . is, in effect, conservativeness in (something close to) the sense of that notion deployed in Hartry Field’s exposition of his nominalism. (Wright 1997, 296)
The reason that Wright invokes the version of conservativeness mobilized in (Field 2016) is that the standard notion of conservativeness found in textbooks on model theory won’t do the job. That notion is formulated as follows:
A formula $\Phi$ in a language $L_2$ is conservative over a theory $T$ in a language $L_1$ (where $L_1 \subseteq L_2$) if and only if, for any formula $\Psi$ in $L_1$, if:

$$T \cup \{\Phi\} \vdash \Psi$$

then:

$$T \vdash \Psi$$

In other words, given a theory $T$, a formula $\Phi$ (usually involving new vocabulary not included in $T$) is conservative over $T$ if and only if any formula in the language of $T$ that follows from the conjunction of $T$ and $\Phi$ follows from $T$ alone. Put another way, if $\Phi$ is conservative over $T$, then although $\Phi$ may entail new things not entailed by $T$, it entails no new things that are expressible in the language of $T$.
Now, while this notion of conservativeness is extremely important in model theory, it is, as Wright realized, too strong to be of use here, since even Hume’s Principle is not conservative in this sense. Take any theory $T$ that is compatible with the existence of only finitely many things (that is, has finite models), and let $\mathrm{Inf}$ abbreviate the purely logical second-order claim expressing that the universe contains infinitely many objects. Then:

$$T \cup \{\mathrm{HP}\} \vdash \mathrm{Inf}$$

but:

$$T \nvdash \mathrm{Inf}$$
This example makes the problem easy to spot: Acceptable abstraction principles, when combined with our favorite theories, may well entail new claims not entailed by those theories. For example, Hume’s Principle entails that there are infinitely many objects. What we want to exclude are abstraction principles that entail new claims about the subject matter of our favorite (non-abstractionist) theories. Thus, Hume’s Principle should not entail that the subject matter of $T$ involves infinitely many objects unless $T$ already entails that claim. Hence, what we want is something like the following: An abstraction principle $\mathrm{AP}$ is conservative in the relevant sense if and only if, given any theory $T$ and formula $\Psi$ about some domain of objects, if $\mathrm{AP}$ combined with $T$ restricted to its intended, non-abstract domain entails $\Psi$ restricted to its intended, non-abstract domain, then $T$ (unrestricted) should entail $\Psi$ (unrestricted). This will block the example above, since, if $T$ is our theory of zebras (to stick with Wright’s example), then although Hume’s Principle plus $T$ entails the existence of infinitely many objects, it does not entail the existence of infinitely many zebras (unless our zebra theory does).
We can capture this idea more precisely via the following straightforward adaptation of Field’s criterion to the present context:
$\mathrm{AP}$ is Field-conservative if and only if, for any theory $T$ and formula $\Psi$ not containing the abstraction operator $@$, if:

$$T^{\neg @} \cup \{\mathrm{AP}\} \vdash \Psi^{\neg @}$$

then:

$$T \vdash \Psi$$

The superscripts indicate that we are restricting each quantifier in the formula (or set of formulas) in question to the superscripted predicate (here, the predicate $\neg \exists X\, (x = @(X))$, which holds of exactly those objects that are not abstracts). Thus, given a formula $\Phi$ and a predicate $P$, we obtain $\Phi^P$ by replacing each quantifier in $\Phi$ with a new quantifier whose range is restricted to $P$ along the following pattern:

$\forall x\, \Phi$ becomes $\forall x\, (P(x) \rightarrow \Phi)$

$\exists x\, \Phi$ becomes $\exists x\, (P(x) \wedge \Phi)$

$\forall X\, \Phi$ becomes $\forall X\, (\forall y\, (X(y) \rightarrow P(y)) \rightarrow \Phi)$

$\exists X\, \Phi$ becomes $\exists X\, (\forall y\, (X(y) \rightarrow P(y)) \wedge \Phi)$

Thus, according to this variant of conservativeness, an abstraction principle $\mathrm{AP}$ is conservative if, whenever that abstraction principle plus a theory $T$ whose quantifiers have been restricted to those objects that are not abstracts governed by $\mathrm{AP}$ entails a formula $\Psi$ whose quantifiers have been restricted to those objects that are not abstracts governed by $\mathrm{AP}$, then the theory $T$ (without such restriction) entails $\Psi$ (also without such restriction).
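The restriction operation is purely syntactic, as the following minimal sketch illustrates. It walks a hypothetical tuple-based representation of formulas and relabels first-order quantifiers along the pattern above (second-order quantifiers and the clauses for connectives are omitted for brevity):

```python
# Illustrative sketch: restricting first-order quantifiers to a
# predicate P. Formulas are nested tuples such as ('forall', 'x', body);
# anything that is not a quantified formula is returned unchanged.
def restrict(formula, P):
    if isinstance(formula, tuple) and formula[0] == 'forall':
        _, var, body = formula
        return ('forall', var, ('implies', (P, var), restrict(body, P)))
    if isinstance(formula, tuple) and formula[0] == 'exists':
        _, var, body = formula
        return ('exists', var, ('and', (P, var), restrict(body, P)))
    return formula

phi = ('forall', 'x', ('exists', 'y', ('R', 'x', 'y')))
print(restrict(phi, 'P'))
# ('forall', 'x', ('implies', ('P', 'x'),
#     ('exists', 'y', ('and', ('P', 'y'), ('R', 'x', 'y')))))
```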
Hume’s Principle is conservative in this sense, as are many other abstraction principles. Further, the idea that Field-conservativeness is a necessary condition on acceptable abstraction principles has been widely accepted in the neo-logicist literature. But Field-conservativeness, even combined with consistency, cannot be sufficient for acceptability, for a very simple (and now familiar-seeming) reason: It turns out that there are pairs of abstraction principles that are each both consistent and Field-conservative, but which are incompatible with each other.
The first such pair of abstraction principles is presented in (Weir 2003). Here is a slight variation on his construction. First, we define a new equivalence relation:
In other words, ⇔ holds between two concepts and if and only if either they both hold of no more than one object, and they hold of the same objects, or they both hold of more than one object. Next, let abbreviate the purely logical second-order formula expressing the claim that the size of the universe is a limit cardinal, and abbreviate the purely logical second-order formula expressing the claim that the size of the universe is a successor cardinal. (Limit cardinals and successor cardinals are types of infinite cardinal numbers. The following facts are all that one needs for the example to work: Every cardinal number is either a limit cardinal or a successor cardinal (but not both); given any limit cardinal, there is a larger successor cardinal; and given any successor cardinal, there is a larger limit cardinal. For proofs of these result, and much more information on infinite cardinal numbers, the reader is encouraged to consult (Kunen 1980)). Now consider:
Both A_LIMIT and A_SUCC are consistent: A_LIMIT is satisfiable on domains whose cardinality is an infinite limit cardinal and is not satisfiable on domains whose cardinality is finite or an infinite successor cardinal (on the latter sort of domains it behaves analogously to Basic Law V). Things stand similarly for A_SUCC, except with the roles of limit cardinals and successor cardinals reversed. Further, both principles are Field-conservative. The proof of this fact is complex, but depends essentially on the fact that ⇔ generates equivalence classes in such a way that, on any infinite domain, the number of equivalence classes of concepts is identical to the number of objects in the domain. See (Cook & Linnebo 2018) for more discussion. But, since no cardinal number is both a limit cardinal and a successor cardinal, there is no domain that makes both principles true simultaneously. Thus, Field-conservativeness is not enough to guarantee that an abstraction principle is an acceptable neo-logicist definition of a mathematical concept.
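To see why the equivalence classes come out that way, here is the count spelled out (a quick verification of my own, not from the original):

\[
\underbrace{\kappa}_{\text{singleton concepts}} + \underbrace{1}_{\text{empty concept}} + \underbrace{1}_{\text{concepts of size} > 1} = \kappa \qquad \text{for infinite } \kappa,
\]

so an operator governed by ⇔ needs only as many abstracts as there are objects, which any infinite domain can supply, whereas Basic-Law-V-style behavior demands 2^κ abstracts, which no domain of cardinality κ can supply.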
The literature on Bad Company has focused on developing more nuanced criteria that we might impose on acceptable abstraction principles, and most of these have focused on three kinds of consideration:
Satisfiability: On what sizes of domain is the principle satisfiable?
Fullness: On what sizes of domain does the principle in question generate as many abstracts as there are objects in the domain?
Monotonicity: Is it the case that, if we move from one domain to a larger domain, the principle generates at least as many abstracts on the latter as it did on the former?
The reader is encouraged to consult (Cook & Linnebo 2018) for a good overview of the current state of the art with regard to proposals for dealing with the Bad Company problems that fall under one of (or a combination of) these three types of approach.
c. Extending Neo-Logicism Beyond Arithmetic
The next issue facing neo-logicism is extending the account to other branches of mathematics. The reconstruction of arithmetic from Hume’s Principle is (at least in a technical sense) the big success story of neo-logicism, but if this is as far as it goes, then the view is merely an account of the nature of arithmetic, not an account of the nature of mathematics more generally. Thus, if the neo-logicist is to be successful, then they need to show that the approach can be extended to all (or at least much of) mathematics.
The majority of work done in this regard has focused on the two areas of mathematics that tend, in addition to arithmetic, to receive the most attention in the foundations of mathematics: set theory and real analysis. Although this might seem at first glance to be somewhat limited, it is well-motivated. The neo-logicist has already reconstructed arithmetic using Hume’s Principle, which shows that neo-logicism can handle (countably) infinite structures. If the neo-logicist can reconstruct real analysis, then this would show that the account can deal with continuous mathematical structures. And if the neo-logicist can reconstruct set theory as well, then this would show that the account can handle arbitrarily large transfinite structures. These three claims combined would make a convincing case for the claim that most if not all of modern mathematics could be so reconstructed. Neo-logicist reconstructions of real analysis have followed the pattern of Dedekind-cut-style set-theoretic treatments of the real numbers. They begin with the natural numbers as given to us by Hume’s Principle. We then use the (ordered) Pairing Principle:

(∀x)(∀y)(∀z)(∀w)(⟨x, y⟩ = ⟨z, w⟩ ↔ (x = z ∧ y = w))
to obtain pairs of natural numbers, and then apply another principle that provides us with a copy of the integers as equivalence classes of pairs of natural numbers. We then use the Pairing Principle again, to obtain ordered pairs of these integers, and then apply another principle to obtain a copy of the rational numbers as equivalence classes of ordered pairs of our copy of the integers. Finally, we use another principle to obtain an abstract corresponding to each “cut” on the natural ordering of the rational numbers, obtaining a collection of abstracts isomorphic to the standard real numbers.
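The equivalence relations driving the intermediate principles are not spelled out above, but the standard choices are familiar from the set-theoretic construction (a sketch of one natural way to fill in the steps; the exact principles in (Hale 2000) and (Shapiro 2000) differ in their details):

\[
\begin{aligned}
\mathrm{Int}(\langle a, b\rangle) = \mathrm{Int}(\langle c, d\rangle) &\leftrightarrow a + d = b + c\\
\mathrm{Rat}(\langle i, j\rangle) = \mathrm{Rat}(\langle k, l\rangle) &\leftrightarrow i \cdot l = j \cdot k \qquad (j, l \neq 0)\\
\mathrm{Cut}(P) = \mathrm{Cut}(Q) &\leftrightarrow (\forall r)(P(r) \leftrightarrow Q(r))
\end{aligned}
\]

Here the integer abstract of ⟨a, b⟩ plays the role of a − b, the rational abstract of ⟨i, j⟩ plays the role of i/j, and P and Q range over non-empty, bounded, downward-closed collections of rationals, so that each Cut-abstract corresponds to a Dedekind cut.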
Examples of this sort of reconstruction of the real numbers can be found in (Hale 2000) and (Shapiro 2000). There is, however, a significant difference between the two approaches found in these two papers. Shapiro’s construction halts when he has applied the abstraction principle that provides an abstract for each “cut” on the copy of the rationals, since at this point we have obtained a collection of abstracts whose structure is isomorphic to the standard real numbers. Hale’s construction, however, involves one more step: he applies a version of the Ratio Principle discussed earlier to this initial copy of the reals, and claims that the structure that results consists of the genuine real numbers (and the abstracts from the prior step, while having the same structure, were merely a copy—not the genuine article).
The difference between the two approaches stems from a deeper disagreement with regard to what, exactly, is required for a reconstruction of a mathematical theory to be successful. The disagreement traces back directly to Frege, who writes in Grundgesetze that:
So the path to be taken here steers between the old approach, still preferred by H. Hankel, of a geometrical foundation for the theory of irrational numbers and the approaches pursued in recent times. From the former we retain the conception of a real number as a magnitude-ratio, or measuring number, but separate it from geometry and indeed from all specific kinds of magnitudes, thereby coming closer to the more recent efforts. At the same time, however, we avoid the emerging problems of the latter approaches, that either measurement does not feature at all, or that it features without any internal connection grounded in the nature of the number itself, but is merely tacked on externally, from which it follows that we would, strictly speaking, have to state specifically for each kind of magnitude how it should be measured, and how a number is thereby obtained. Any general criteria for where the numbers can be used as measuring numbers and what shape their application will then take, are here entirely lacking.
So we can hope, on the one hand, not to let slip away from us the ways in which arithmetic is applied in specific areas of knowledge, without, on the other hand, contaminating arithmetic with the objects, concepts, relations borrowed from these sciences and endangering its special nature and autonomy. One may surely expect arithmetic to present the ways in which arithmetic is applied, even though the application itself is not its subject matter. (Frege 1893/1903/2013, §159)
Wright sums up Frege’s idea here nicely:
This is one of the clearest passages in which Frege gives expression to something that I propose we call Frege’s Constraint: that a satisfactory foundation for a mathematical theory must somehow build its applications, actual and potential, into its core—into the content it ascribes to the statements of the theory—rather than merely “patch them on from the outside.” (Wright 2000, 324)
The reason for Hale’s extra step should now be apparent. Hale accepts Frege’s Constraint, and further, he agrees with Frege that a central part of the explanation for the wide-ranging applicability of the real numbers within science is the fact that they are ratios of magnitudes. At the penultimate step of his construction (the one corresponding to Shapiro’s final step) we have obtained a manifold of magnitudes, but the final step is required in order to move from the magnitudes themselves to the required ratios. Shapiro, on the other hand, is not committed to Frege’s Constraint, and as a result is satisfied with merely obtaining a collection of abstract objects whose structure is isomorphic to the structure of the real numbers. As a result, he halts a step earlier than Hale does. This disagreement with regard to the role that Frege’s Constraint should play within neo-logicism remains an important point of contention amongst various theorists working on the view.
The other mathematical theory that has been a central concern for neo-logicism is set theory. The initially most obvious approach to obtaining a powerful neo-logicist theory of sets—Basic Law V—is of course inconsistent, but the approach is nevertheless attractive, and as a result the bulk of work on neo-logicist set theory has focused on various ways that we might restrict Basic Law V so that the resulting principle is both powerful enough to reconstruct much or all of contemporary work in set theory yet also, of course, consistent. The principle along these lines that has received by far the most attention is the following principle, NewV, proposed in (Boolos 1989):

(NewV): (∀F)(∀G)(Ext(F) = Ext(G) ↔ ((Big(F) ∧ Big(G)) ∨ (∀x)(F(x) ↔ G(x))))

where Big(F) is an abbreviation for the purely logical second-order claim that there is a mapping from the Fs onto the universe—that is:

Big(F) ≡ (∃R)((∀x)(∀y)(∀z)((R(x, y) ∧ R(x, z)) → y = z) ∧ (∀y)(∃x)(F(x) ∧ R(x, y)))
NewV behaves like Basic Law V on concepts that hold of fewer objects than are contained in the domain as a whole, providing each such concept with its own unique extension, but it maps all concepts that hold of as many objects as there are in the domain as a whole to a single, “dummy” object. This principle is meant to capture the spirit of Georg Cantor’s analysis of the set-theoretic paradoxes. According to Cantor, those concepts that do not correspond to a set (for example, the concept corresponding to the Russell predicate) fail to do so because they are in some sense “too big” (Hallett 1986).
NewV is consistent, and, given the following definitions:

Set(x) ≡ (∃F)(¬Big(F) ∧ x = Ext(F))

x ∈ y ≡ (∃F)(F(x) ∧ y = Ext(F))
it entails the extensionality, empty set, pairing, separation, and replacement axioms familiar from Zermelo-Fraenkel set theory (ZFC), and it also entails a slightly reformulated version of the union axiom. It does not entail the axioms of infinity, powerset, or foundation.
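To give the flavor of these derivations (a sketch from the definitions just given, not Boolos’s own presentation), here is extensionality. Suppose Set(a) and Set(b), witnessed by a = Ext(F) with ¬Big(F) and b = Ext(G) with ¬Big(G), and suppose a and b have the same members. Then:

\[
\begin{aligned}
(\forall x)(x \in a \leftrightarrow F(x)) &\qquad \text{(any concept whose extension is } a \text{ is coextensive with } F\text{, by NewV)}\\
(\forall x)(x \in b \leftrightarrow G(x)) &\qquad \text{(likewise for } b \text{ and } G\text{)}\\
(\forall x)(F(x) \leftrightarrow G(x)) &\qquad \text{(from the shared membership)}\\
\mathrm{Ext}(F) = \mathrm{Ext}(G) &\qquad \text{(by NewV, since neither concept is Big)}
\end{aligned}
\]

and so a = b.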
NewV is not Field-conservative, however, since it implies that there is a well-ordering on the entire domain—see (Shapiro & Weir 1999) for details. Since, as we saw earlier, there is wide agreement that acceptable abstraction principles ought to be conservative in exactly this sense, neo-logicists will likely need to look elsewhere for their reconstruction of set theory.
Thus, while current debates regarding the reconstruction of the real numbers concern primarily philosophical issues (that is, which of various technical reconstructions is to be preferred, based on philosophical considerations such as Frege’s Constraint), there remains a very real question regarding whether anything like contemporary set theory can be given a mathematically adequate reconstruction on the neo-logicist approach.
d. Neo-Logicism and the Caesar Problem
The final problem that the neo-logicist is faced with is one that is already familiar: the Caesar Problem. Frege, of course, side-stepped the Caesar Problem by denying, in the end, that abstraction principles such as Hume’s Principle or Basic Law V were definitions. But the neo-logicist accepts that these abstraction principles are (implicit) definitions of the mathematical concepts in question. An adequate definition of a mathematical concept should meet the following two desiderata:
Identity Conditions: An adequate definition should explicate the conditions under which two entities falling under that definition are identical or distinct.
Demarcation Conditions: An adequate definition should explicate the conditions under which an entity falls under that definition or not.
In short, if Hume’s Principle is to serve as a definition of the concept cardinal number, then it should tell us when two cardinal numbers are the same, and when they are different, and it should tell us when an object is a cardinal number, and when it is not. As we have already seen, Hume’s Principle (like other abstraction principles) does a good job on the first task, but falls decidedly short on the second.
Neo-logicist solutions to the Caesar Problem typically take one of three forms. The first approach is to deny the problem, arguing that it does not matter if the object picked out by the relevant abstraction term of the form §(F) really is the number two, so long as that object plays the role of two in the domain of objects that makes Hume’s Principle true (that is, so long as it is appropriately related to the other objects referred to by other abstraction terms of the form §(G)). Although this is not the target of the essay, the discussion of the connections between logicism and structuralism about mathematics in (Wright 2000) touches on something like this idea. The second approach is to argue that, although abstraction principles as we have understood them here do not settle identity claims of the form §(F) = t (where t is not an abstraction term), we merely need to reformulate them appropriately. Again, although the Caesar Problem is not the main target of the essay, this sort of approach is pursued in (Cook 2016), where versions of abstraction principles involving modal operators are explored. Finally, the third approach involves admitting that abstraction principles alone are susceptible to the Caesar Problem, but arguing that abstraction principles alone need not solve it. Instead, identities of the form §(F) = t (where t is not an abstraction term) are settled via a combination of the relevant abstraction principle plus additional metaphysical or semantic principles. This is the approach taken in (Hale & Wright 2001b), where the Caesar Problem is solved by mobilizing additional theoretical constraints regarding categories—that is, maximal sortal concepts with uniform identity conditions—arguing that objects from different categories cannot be identical.
Before moving on to other versions of abstractionism, it is worth mentioning a special case of the Caesar Problem. Traditionally, the Caesar Problem is cast as a puzzle about determining the truth conditions of claims of the form:

§(F) = t

where t is not an abstraction term. But there is a second sort of worry that arises along these lines, one that involves identities where each term is an abstraction term, but they are abstraction terms governed by distinct abstraction principles. For concreteness, consider two distinct (consistent) conceptual abstraction principles:

(A_1): (∀F)(∀G)(§_1(F) = §_1(G) ↔ E_1(F, G))

(A_2): (∀F)(∀G)(§_2(F) = §_2(G) ↔ E_2(F, G))
For reasons similar to those that underlie the original Caesar Problem, the conjunction of these two principles fails to settle any identities of the form:

§_1(F) = §_2(G)
This problem, which has come to be called the C–R Problem (since one particular case would be when §_1 introduces the complex numbers, and §_2 introduces the real numbers), is discussed in (Fine 2002) and (Cook & Ebert 2005). The former suggests (more for reasons of technical convenience than for reasons of philosophical principle) that we settle such identities by requiring that identical abstracts correspond to identical equivalence classes. Thus, given the two abstraction principles above, we would adopt the following additional Identity Principle:

(∀F)(∀G)(§_1(F) = §_2(G) ↔ (∀H)(E_1(H, F) ↔ E_2(H, G)))
If, for example, we apply the Identity Principle to the abstracts governed by NewV and those governed by Hume’s Principle, then we can conclude that:

Ext([x : x ≠ x]) = #([x : x ≠ x])

That is, ∅ = 0. After all, the equivalence class of concepts containing the empty concept according to the equivalence relation mobilized in NewV is identical to the equivalence class of concepts containing the empty concept according to the equivalence relation mobilized in Hume’s Principle (both contain the empty concept and no other concepts). But the following claim would turn out to be false (where t is any term):

Ext([x : x = t]) = #([x : x = t])

That is, for any object a, {a} ≠ 1. Again, the reason is simple. The equivalence class given by the equivalence relation from NewV, applied to the concept that holds of a, and a alone, gives us an equivalence class that contains only that concept, while the equivalence class given by the equivalence relation from Hume’s Principle, applied to the concept that holds of a, and a alone, gives us an equivalence class that contains any concept that holds of exactly one object.
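The two computations above can even be checked mechanically on a toy domain. The following Python sketch is my own illustration (concepts are modeled extensionally as frozensets over a two-element domain, and “Big” is modeled as holding of the whole finite domain); the Identity Principle then identifies abstracts exactly when the printed classes match:

from itertools import combinations

DOMAIN = (0, 1)
CONCEPTS = [frozenset(c) for r in range(len(DOMAIN) + 1)
            for c in combinations(DOMAIN, r)]

def is_big(f):
    # Toy stand-in for Big: the concept holds of the entire (finite) domain.
    return f == frozenset(DOMAIN)

def hp_class(f):
    # Hume's Principle: concepts are equivalent iff they are equinumerous.
    return frozenset(g for g in CONCEPTS if len(g) == len(f))

def newv_class(f):
    # NewV: concepts are equivalent iff coextensive, or both Big.
    return frozenset(g for g in CONCEPTS if g == f or (is_big(f) and is_big(g)))

empty, singleton = frozenset(), frozenset({0})
print(hp_class(empty) == newv_class(empty))          # True: zero = the empty set
print(hp_class(singleton) == newv_class(singleton))  # False: one is not {a}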
While this solution is technically simple and elegant, (Cook & Ebert 2005) raises some objections, the most striking of which is a generalization of the examples above: Cook and Ebert suggest that any account that makes some numbers (in particular, zero) identical to some sets (in particular, the empty set), but does not entail either that all numbers are sets or that all sets are numbers, is metaphysically suspect at best.
5. Dynamic Abstraction
Now that we’ve looked closely at both Frege’s logicist version of abstractionism and contemporary neo-logicism, we’ll finish up this essay by taking a brief look at another variation on the abstractionist theme.
Øystein Linnebo has formulated a version of abstractionism—dynamic abstraction—that involves modal notions, but in a way very different from the way in which these notions are mobilized in more traditional work on neo-logicism (Linnebo 2018). Before summarizing this view, however, we need to note that this account presupposes a rather different reading of the second-order variables involved in conceptual abstraction principles—the plural reading. Thus, a formula of the form:

(∃F)(Φ(F))

should not be read as:

There is a concept F such that Φ holds of F.

but rather as:

There are some objects—the Fs—such that those objects are Φ.
We will continue to use the same notation as before, but the reader should keep this difference in mind.
Linnebo begins the development of his novel version of abstractionism by pointing out that Basic Law V can be recast as a pair of principles. The first:

(∀F)(∃x)(x = Ext(F))

says that every plurality of objects has an extension, and the second:

(∀F)(∀G)(Ext(F) = Ext(G) ↔ (∀x)(F(x) ↔ G(x)))

says that given any two pluralities and their extensions, the latter are identical just in case the former are co-extensive.
Linnebo then reformulates these principles, replacing identities of the form x = Ext(F) with a relational claim of the form Set(F, x) (this is mostly for technical reasons, involving the desire to avoid the need to mobilize free logic within the framework). Set(F, x) should be read as:

x is the set of Fs.
We then obtain what he calls the principle of Collapse:

(∀F)(∃x)(Set(F, x))

and the principle of Extensionality:

(∀F)(∀G)(∀x)(∀y)((Set(F, x) ∧ Set(G, y)) → (x = y ↔ (∀z)(F(z) ↔ G(z))))
which says that given any two pluralities and the corresponding sets, the latter are identical just in case the former are co-extensive. Now, these principles are jointly just as inconsistent as the original formulation of Basic Law V was. But Linnebo suggests a new way of conceiving of the process of abstraction: We understand the universal quantifiers in these principles to range over a given class of entities, and the existential quantifiers then give us new entities that are abstracted off of this prior ontology. As a result, one gets a dynamic picture of abstraction: instead of an abstraction principle describing the abstracts that arise as a result of consideration of all objects—including all abstracts—in a static, unchanging universal domain, we instead conceive of our ontology in terms of an ever-expanding series of domains, obtained via application of the extension-forming abstraction operation on each domain to obtain a new, more encompassing domain.
Linnebo suggests that we can formalize these ideas precisely via adopting a somewhat non-standard application of the modal operators □ and ◇. Loosely put, we read □Φ as saying “on any domain, Φ”, and ◇Φ as saying “the domain can be expanded such that Φ”. Using these operators, we can formulate new, dynamic versions of Collapse and Extensionality. The modalized version of Collapse:

□(∀F)◇(∃x)(Set(F, x))

says that, given any domain and any plurality of objects from that domain, there is a (possibly expanded) domain where the set containing the members of that plurality exists, and the modalized version of Extensionality:

□(∀F)(∀G)(∀x)(∀y)((Set(F, x) ∧ Set(G, y)) → (x = y ↔ □(∀z)(F(z) ↔ G(z))))
says that, given any pluralities and the sets corresponding to them, the latter are identical if and only if the former are necessarily coextensive (note that a plurality, unlike a concept, has the same instances in every world). This version of Basic Law V, which entails many of the standard set-theoretic axioms, is consistent. In fact, it is consistent with a very strong, modal version of comprehension for pluralities (Linnebo 2018, 68). Thus, the dynamic abstraction approach, unlike the neo-logicism of Wright and Hale, allows for a particularly elegant abstractionist reconstruction of set theory.
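The expanding-domain picture lends itself to a small computational illustration. The sketch below is my own toy model, not Linnebo’s formal system: pluralities are modeled as subsets of a finite domain, sets as frozensets, and each collapse step adds a set for every plurality available at the previous stage:

from itertools import chain, combinations

def pluralities(domain):
    # Every plurality (modeled here as every subset) of the current domain.
    items = list(domain)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))

def collapse(domain):
    # One expansion step: the new domain also contains a set for each plurality.
    return domain | {frozenset(p) for p in pluralities(domain)}

domain = {"a", "b"}  # two urelements
for stage in range(2):
    print(f"stage {stage}: {len(domain)} objects")
    domain = collapse(domain)
print(f"stage 2: {len(domain)} objects")  # 2, then 6, then 66 objects

No single stage satisfies “every plurality has a set” outright, since each collapse step creates new pluralities in turn; but every instance of Collapse becomes true at some later stage, which is precisely the modalized reading given above.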
Of course, if the dynamic version of Basic Law V is consistent on this approach, then the dynamic version of any abstraction principle is. As a result, given any neo-logicist abstraction principle:

(A_E): (∀F)(∀G)(§_E(F) = §_E(G) ↔ E(F, G))

there will be a corresponding pair of dynamic principles:

□(∀F)◇(∃x)(Abs_E(F, x))

and:

□(∀F)(∀G)(∀x)(∀y)((Abs_E(F, x) ∧ Abs_E(G, y)) → (x = y ↔ □(E(F, G))))

where Abs_E(F, x) says something like:

x is the E-abstract of the Fs.

And these two dynamic principles, unlike A_E itself, are guaranteed to be (jointly) consistent.
Thus, although Linnebo must still grapple with the Caesar Problem and many of the other issues that plague neo-logicism—and the reader is encouraged to consult the relevant chapters of (Linnebo 2018) to see what he says in this regard—his dynamic abstraction account does not suffer from the Bad Company Problem: all forms of abstraction, once they are re-construed dynamically, are in Good Company.
6. References and Further Reading
Aristotle (1975), Posterior Analytics, J. Barnes (trans.), Oxford: Oxford University Press.
Bueno, O. & Ø. Linnebo (eds.) (2009), New Waves in Philosophy of Mathematics, Basingstoke UK: Palgrave.
Boolos, G. (1987), “The Consistency of Frege’s Foundations of Arithmetic”, in (Thompson 1987): 211—233.
Boolos, G. (1989), “Iteration Again”, Philosophical Topics 17(2): 5—21.
Boolos, G. (1990a), “The Standard of Equality of Numbers”, in (Boolos 1990b): 3—20.
Boolos, G. (ed.) (1990b), Meaning and Method: Essays in Honor of Hilary Putnam, Cambridge: Cambridge University Press.
Boolos, G. (1997), “Is Hume’s Principle Analytic?”, in (Heck 1997): 245—261.
Boolos, G. (1998), Logic, Logic, and Logic, Cambridge MA: Harvard University Press.
Boolos, G. & R. Heck (1998), “Die Grundlagen der Arithmetik §82—83”, in (Boolos 1998): 315—338.
Cook, R. (ed.) (2007), The Arché Papers on the Mathematics of Abstraction, Dordrecht: Springer.
Cook, R. (2009), “New Waves on an Old Beach: Fregean Philosophy of Mathematics Today”, in (Bueno & Linnebo 2009): 13—34.
Cook, R. (2013), “How to Read Frege’s Grundgesetze”, Appendix to (Frege 1893/1903/2013): A1—A41.
Cook, R. (2016), “Necessity, Necessitism, and Numbers”, Philosophical Forum 47: 385—414.
Cook, R. (2019), “Frege’s Little Theorem and Frege’s Way Out”, in (Ebert & Rossberg 2019): 384—410.
Cook, R. & P. Ebert (2005), “Abstraction and Identity”, Dialectica 59(2): 121—139.
Cook, R. & P. Ebert (2016), “Frege’s Recipe”, The Journal of Philosophy 113(7): 309—345.
Cook, R. & Ø. Linnebo (2018), “Cardinality and Acceptable Abstraction”, Notre Dame Journal of Formal Logic 59(1): 61—74.
Dummett, M. (1956), “Nominalism”, Philosophical Review 65(4): 491—505.
Dummett, M. (1991), Frege: Philosophy of Mathematics. Cambridge MA: Harvard University Press.
Ebert, P. & M. Rossberg (eds.) (2016), Abstractionism: Essays in Philosophy of Mathematics, Oxford: Oxford University Press.
Ebert, P. & M. Rossberg (eds.) (2019), Essays on Frege’s Basic Laws of Arithmetic, Oxford: Oxford University Press.
Euclid (2012), The Elements, T. Heath (trans.), Mineola, New York: Dover.
Ferreira, F. & K. Wehmeier (2002), “On the Consistency of the Δ¹₁-CA Fragment of Frege’s Grundgesetze”, Journal of Philosophical Logic 31: 301—311.
Field, H. (2016), Science Without Numbers, Oxford: Oxford University Press.
Fine, K. (2002), The Limits of Abstraction, Oxford: Oxford University Press.
Frege, G. (1879/1972), Conceptual Notation and Related Articles, T. Bynum (trans.), Oxford: Oxford University Press.
Frege, G. (1884/1980), Die Grundlagen der Arithmetik (The Foundations of Arithmetic), 2nd Ed., J. Austin (trans.), Evanston: Northwestern University Press.
Frege, G. (1893/1903/2013), Grundgesetze der Arithmetik Band I & II (The Basic Laws of Arithmetic Vols. I & II), P. Ebert & M. Rossberg (trans.), Oxford: Oxford University Press.
Hale, B. (2000), “Reals by Abstraction”, Philosophia Mathematica 8(3): 100—123.
Hale, B. & C. Wright (2001a), The Reason’s Proper Study, Oxford: Oxford University Press.
Hale, B. & C. Wright (2001b), “To Bury Caesar. . . ”, in (Hale & Wright 2001a): 335—396.
Hallett, M. (1986), Cantorian Set Theory and Limitation of Size, Oxford: Oxford University Press.
Heck, R. (1992), “On the Consistency of Second-Order Contextual Definitions”, Nous 26: 491—494.
Heck, R. (1993), “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik”, Journal of Symbolic Logic 10: 153—174.
Heck, R. (1996), “The Consistency of Predicative Fragments of Frege’s Grundgesetze der Arithmetik”, History and Philosophy of Logic 17: 209—220.
Heck, R. (ed.) (1997), Language, Thought, and Logic: Essays in Honour of Michael Dummett, Oxford: Oxford University Press.
Heck, R. (2013), Reading Frege’s Grundgesetze, Oxford: Oxford University Press.
Hodes, H. (1984), “Logicism and the Ontological Commitments of Arithmetic”, The Journal of Philosophy 81: 123—149.
Hume, D. (1888), A Treatise of Human Nature, Oxford: Clarendon Press.
Kant, I. (1787/1999), Critique of Pure Reason, P. Guyer & A. Wood (trans.), Cambridge: Cambridge University Press.
Kunen, K. (1980), Set Theory: An Introduction to Independence Proofs, Amsterdam: North Holland.
Linnebo, Ø. (2018), Thin Objects: An Abstractionist Account, Oxford: Oxford University Press.
Mancosu, P. (2016), Abstraction and Infinity, Oxford: Oxford University Press.
Russell, B. & A. Whitehead (1910/1912/1913), Principia Mathematica, Volumes 1—3, Cambridge: Cambridge University Press.
Shapiro, S. (1991), Foundations without Foundationalism: The Case for Second-Order Logic, Oxford: Oxford University Press.
Shapiro, S. (2000), “Frege Meets Dedekind: A Neo-Logicist Treatment of Real Analysis”, Notre Dame Journal of Formal Logic 41(4): 335—364.
Shapiro, S. & A. Weir (1999), “New V, ZF, and Abstraction”, Philosophia Mathematica 7(3): 293—321.
Tennant, N. (1987), Anti-Realism and Logic: Truth as Eternal, Oxford: Oxford University Press.
Thompson, J. (ed.) (1987), On Being and Saying: Essays in Honor of Richard Cartwright, Cambridge MA: MIT Press.
Wehmeier, K. (1999), “Consistent Fragments of Grundgesetze and the Existence of Non-Logical Objects”, Synthese 121: 309—328.
Weir, A. (2003), “Neo-Fregeanism: An Embarrassment of Riches?”, Notre Dame Journal of Formal Logic 44(1): 13—48.
Wright, C. (1983), Frege’s Conception of Numbers as Objects, Aberdeen: Aberdeen University Press.
Wright, C. (1997), “On the Philosophical Significance of Frege’s Theorem”, in (Heck 1997): 201—244.
Wright, C. (2000), “Neo-Fregean Foundations for Real Analysis: Some Reflections on Frege’s Constraint”, Notre Dame Journal of Formal Logic 41(4): 317—344.
Author Information
Roy T. Cook
Email: cookx432@umn.edu
University of Minnesota
U. S. A.
Fallacies
A fallacy is a kind of error in reasoning. The list of fallacies below contains 231 names of the most common fallacies, and it provides brief explanations and examples of each of them. Fallacious reasoning should not be persuasive, but it too often is.
The vast majority of the commonly identified fallacies involve arguments, although some involve only explanations, or definitions, or questions, or other products of reasoning. Some researchers, although not most, use the term “fallacy” very broadly to indicate any false belief or cause of a false belief. The long list below includes some fallacies of these sorts if they have commonly known names, but most are fallacies that involve kinds of errors made while arguing informally in natural language, that is, in everyday discourse.
A charge of fallacious reasoning always needs to be justified. The burden of proof is on your shoulders when you claim that someone’s reasoning is fallacious. Even if you do not explicitly give your reasons, it is your responsibility to be able to give them if challenged.
A piece of reasoning can have more than one fault and thereby commit more than one fallacy. If it is fallacious, this can be because of its form or its content or both. The formal fallacies are fallacious only because of their logical form, their structure. The Slippery Slope Fallacy is an informal fallacy that has the following form: Step 1 often leads to step 2. Step 2 often leads to step 3. Step 3 often leads to…until we reach an obviously unacceptable step, so step 1 is not acceptable. That form occurs in both good arguments and faulty arguments. The quality of an argument of this form depends crucially on the strength of the probabilities in going from one step to the next. The probabilities involve the argument’s content, not merely its logical form.
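A toy calculation makes the point vivid (the numbers are illustrative, not from the source). Treating each “often leads to” as a conditional probability and, simplifying, letting the four transitions chain independently:

\[
0.9^{4} \approx 0.66 \qquad \text{versus} \qquad 0.6^{4} \approx 0.13,
\]

so four individually strong links still lend the conclusion reasonable support, while four mediocre links leave the final step quite unlikely: the same form, but a very different quality of argument.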
The discussion below that precedes the long alphabetical list of fallacies begins with an account of the ways in which the term “fallacy” is imprecise. Attention then turns to some of the competing and overlapping ways to classify fallacies of argumentation. Researchers in the field of fallacies disagree about which name of a fallacy is more helpful to use, whether some fallacies should be de-emphasized in favor of others, and which is the best taxonomy of the fallacies. Researchers in the field are also deeply divided about how to define the term “fallacy” itself and how to define certain fallacies. There is no agreement on whether there are necessary and sufficient conditions for distinguishing between fallacious and non-fallacious reasoning generally. Analogously, there is doubt in the field of ethics regarding whether researchers should pursue the goal of providing necessary and sufficient conditions for distinguishing moral actions from immoral ones.
The first known systematic study of fallacies was due to Aristotle in his De Sophisticis Elenchis (Sophistical Refutations), an appendix to his Topics, which is one of his six works on logic. These six are collectively known as the Organon. He listed thirteen types of fallacies. Very few advances were made for many centuries after this. After the Dark Ages, fallacies again were studied systematically in Medieval Europe. This is why so many fallacies have Latin names. The third major period of study of the fallacies began in the later twentieth century due to renewed interest from the disciplines of philosophy, logic, communication studies, rhetoric, psychology, and artificial intelligence.
The more frequent the error within public discussion and debate, the more likely it is to have a name. Nevertheless, there is no specific name for the fallacy of subtracting five from thirteen and concluding that the answer is seven, even though the error is common.
The term “fallacy” is not a precise term. One reason is that it is ambiguous. Depending on the particular theory of fallacies, it might refer either to (a) a kind of error in an argument, (b) a kind of error in reasoning (including arguments, definitions, explanations, questions, and so forth), (c) a false belief, or (d) the cause of any of the previous errors including what are normally referred to as “rhetorical techniques.”
Regarding (d), being ill, being hungry, being stupid, being hypercritical, and being careless are all sources of potential error in reasoning, so they could qualify as fallacies of kind (d), but they are not included in the list below, and most researchers on fallacies normally do not call them fallacies. These sources of errors are more about why people commit a fallacy than about what the fallacy is. On the other hand, wishful thinking, stereotyping, being superstitious, rationalizing, and having a poor sense of proportion also are sources of potential error and are included in the list below, though they would not be included in the lists of some researchers. Thus there is a certain arbitrariness to what appears in lists such as this. What have been left off the list below are the following persuasive techniques commonly used to influence others and to cause errors in reasoning: apple polishing, ridiculing, applying financial pressure, being sarcastic, selecting terms with strong negative or positive associations, using innuendo, weaseling, and using other propaganda techniques. Basing any reasoning primarily on the effectiveness of one or more of these techniques is fallacious.
The fallacy literature has given some attention to the epistemic role of reasoning. Normally, the goal in reasoning is to take the audience from not knowing to knowing, or from not being justified in believing something to being justified in believing it. If a fallacy is defined as reasoning that fails to achieve this epistemic goal, then begging the question, which repeats the conclusion among the premises, counts as a fallacy despite being deductively valid, so reasoning validly is no guarantee of avoiding a fallacy.
In describing the fallacies below, the custom is followed of not distinguishing between a reasoner using a fallacy and the reasoning itself containing the fallacy.
Real arguments are often embedded within a very long discussion. Richard Whately, one of the greatest of the 19th century researchers into informal logic, wisely said “A very long discussion is one of the most effective veils of Fallacy; …a Fallacy, which when stated barely…would not deceive a child, may deceive half the world if diluted in a quarto volume.”
2. Taxonomy of Fallacies
The importance of understanding the common fallacy labels is that they provide an efficient way to communicate criticisms of someone’s reasoning. However, there are a number of competing and overlapping ways to classify the labels. The taxonomy of the fallacies is in dispute.
The fallacies of argumentation can be classified as either formal or informal. A formal fallacy can be detected by examining the logical form of the reasoning, whereas an informal fallacy usually cannot be detected this way because it depends upon the content of the reasoning and possibly the purpose of the reasoning. So, informal fallacies are errors of reasoning that cannot easily be expressed in our standard system of formal logic, the first-order predicate logic. The long list below contains very few formal fallacies. Fallacious arguments (as well as perfectly correct arguments) can be classified as deductive or inductive, depending upon whether the fallacious argument is most properly assessed by deductive standards or instead by inductive standards. Deductive standards demand deductive validity, but inductive standards require inductive strength such as making the conclusion more likely.
Fallacies of argumentation can be divided into other categories. Some classifications depend upon the psychological factors that lead people to use them. Those fallacies also can be divided into categories according to the epistemological factors that cause the error. For example, arguments depend upon their premises, even if a person has ignored or suppressed one or more of them, and a premise can be justified at one time, given all the available evidence at that time, even if we later learn that the premise was false. Also, even though appealing to a false premise is often fallacious, it is not when we are reasoning counterfactually, that is, about what would have happened if something that did not happen had happened.
3. Pedagogy
It is commonly claimed that giving a fallacy a name and studying it will help the student identify the fallacy in the future and will steer them away from using the fallacy in their own reasoning. As Steven Pinker says in The Stuff of Thought (p. 129),
If a language provides a label for a complex concept, that could make it easier to think about the concept, because the mind can handle it as a single package when juggling a set of ideas, rather than having to keep each of its components in the air separately. It can also give a concept an additional label in long-term memory, making it more easily retrievable than ineffable concepts or those with more roundabout verbal descriptions.
For pedagogical purposes, researchers in the field of fallacies disagree about the following topics: which name of a fallacy is more helpful to students’ understanding; whether some fallacies should be de-emphasized in favor of others; and which is the best taxonomy of the fallacies.
It has been suggested that, from a pedagogical perspective, having a representative set of fallacies pointed out to you in others’ reasoning is much more effective than your taking the trouble to learn the rules of avoiding all fallacies in the first place. But fallacy theory is criticized by some teachers of informal reasoning for its over-emphasis on poor reasoning rather than good reasoning. Do colleges teach calculus, these critics ask, by emphasizing all the ways one can make mathematical mistakes? Besides, they add, studying fallacies may make students overly critical. These critics want more emphasis on the forms of good arguments and on the implicit rules that govern proper discussion designed to resolve a difference of opinion.
4. What is a Fallacy?
Researchers disagree about how to define the very term “fallacy.” For example, most researchers say fallacies may be created unintentionally or intentionally, but some researchers say that a supposed fallacy created unintentionally should be called a blunder and not a fallacy.
A fallacy is a mistake, but not every mistake is a fallacy. Could there be a computer program, for instance, that could always successfully distinguish a fallacy from a non-fallacy? So long as researchers disagree about the definition, apparently not.
Focusing just on fallacies of argumentation, some researchers define such a fallacy as an argument that is deductively invalid or that has very little inductive strength. Because examples of false dilemma, inconsistent premises, and begging the question are valid arguments in this sense, this definition misses some standard fallacies. Other researchers say a fallacy is a mistake in an argument that arises from something other than merely false premises. But the false dilemma fallacy is due to false premises. Still other researchers define a fallacy as an argument that is not good. Good arguments are then defined as those that are deductively valid or inductively strong, and that contain only true, well-established premises, but are not question-begging. A complaint with this definition is that its requirement of truth would improperly lead to calling too much scientific reasoning fallacious; every time a new scientific discovery caused scientists to label a previously well-established claim as false, all the scientists who used that claim as a premise would become fallacious reasoners. This consequence of the definition is acceptable to some researchers but not to others. Because informal reasoning regularly deals with hypothetical reasoning and with premises for which there is great disagreement about whether they are true or false, many researchers would relax the requirement that every premise must be true or at least known to be true. One widely accepted definition defines a fallacious argument as one that either is deductively invalid or is inductively very weak or contains an unjustified premise or that ignores relevant evidence that is available and that should be known by the arguer. Finally, yet another theory of fallacy says a fallacy is a failure to provide adequate proof for a belief, the failure being disguised to make the proof look adequate.
Other researchers recommend characterizing a fallacy as a violation of the norms of good reasoning, the rules of critical discussion, dispute resolution, and adequate communication. The difficulty with this approach is that there is so much disagreement about how to characterize these norms.
In addition, all the above definitions are often augmented with some remark to the effect that the fallacies need to be convincing or persuasive to too many people. It is notoriously difficult to be very precise about these notions. Some researchers in fallacy theory have therefore recommended dropping the notions altogether; other researchers suggest replacing them in favor of the phrase “can be used to persuade.”
Some researchers complain that all the above definitions of fallacy are too broad and do not distinguish between mere blunders and actual fallacies, the more serious errors.
Researchers in the field are deeply divided, not only about how to define the term “fallacy” and how to define some of the individual fallacies, but also about whether there are necessary and sufficient conditions for distinguishing between fallacious and non-fallacious reasoning generally. Analogously, there is doubt in the field of ethics whether researchers should pursue the goal of providing necessary and sufficient conditions for distinguishing moral actions from immoral ones.
5. Other Controversies
How do we defend the claim that an item of reasoning should be labeled as a particular fallacy? A major goal in the field of informal logic is to provide some criteria for each fallacy. Schwartz presents the challenge this way:
Fallacy labels have their use. But fallacy-label texts tend not to provide useful criteria for applying the labels. Take the so-called ad verecundiam fallacy, the fallacious appeal to authority. Just when is it committed? Some appeals to authority are fallacious; most are not. A fallacious one meets the following condition: The expertise of the putative authority, or the relevance of that expertise to the point at issue, are in question. But the hard work comes in judging and showing that this condition holds, and that is where the fallacy-label-texts leave off. Or rather, when a text goes further, stating clear, precise, broadly applicable criteria for applying fallacy labels, it provides a critical instrument [that is] more fundamental than a taxonomy of fallacies and hence to that extent goes beyond the fallacy-label approach. The further it goes in this direction, the less it needs to emphasize or even to use fallacy labels (Schwartz, 232).
The controversy here is the extent to which it is better to teach students what Schwartz calls “the critical instrument” than to teach the fallacy-label approach. Is the fallacy-label approach better for some kinds of fallacies than others? If so, which others?
One controversy involves the relationship between the fields of logic and rhetoric. In the field of rhetoric, the primary goal is to persuade the audience, not guide them to the truth. Philosophers concentrate on convincing the ideally rational reasoner.
Advertising in magazines and on television is designed to achieve visual persuasion. And a hug or the fanning of fumes from freshly baked donuts out onto the sidewalk are occasionally used for visceral persuasion. There is some controversy among researchers in informal logic as to whether the reasoning involved in this nonverbal persuasion can always be assessed properly by the same standards that are used for verbal reasoning.
6. Partial List of Fallacies
Consulting the list below will give a general idea of the kind of error involved in passages to which the fallacy name is applied. However, simply applying the fallacy name to a passage cannot substitute for a detailed examination of the passage and its context or circumstances because there are many instances of reasoning to which a fallacy name might seem to apply, yet, on further examination, it is found that in these circumstances the reasoning is really not fallacious.
Accent
The Accent Fallacy is a fallacy of ambiguity due to the different ways a word or syllable is emphasized or accented. Also called Accentus, Misleading Accent, and Prosody.
Example:
A member of Congress is asked by a reporter if she is in favor of the President’s new missile defense system, and she responds, “I’m in favor of a missile defense system that effectively defends America.”
With an emphasis on the word “favor,” her response is likely to be for the President’s missile defense system. With an emphasis, instead, on the word “effectively,” her remark is likely to be against the President’s missile defense system. And by using neither emphasis, she can later claim that her response was on either side of the issue. For an example of the Fallacy of Accent involving the accent of a syllable within a single word, consider the word “invalid” in the sentence, “Did you mean the invalid one?” When we accent the first syllable, we are speaking of a sick person, but when we accent the second syllable, we are speaking of an argument failing to meet the deductive standard of being valid. By not supplying the accent, and not supplying additional information to help us disambiguate, we are committing the Fallacy of Accent.
Accident
We often arrive at a generalization but don’t or can’t list all the exceptions. When we then reason with the generalization as if it has no exceptions, our reasoning contains the Fallacy of Accident. This fallacy is sometimes called the “Fallacy of Sweeping Generalization.”
Example:
People should keep their promises, right? I loaned Dwayne my knife, and he said he’d return it. Now he is refusing to give it back, but I need it right now to slash up my neighbors who disrespected me.
People should keep their promises, but there are exceptions to this generalization as in this case of the psychopath who wants Dwayne to keep his promise to return the knife.
Ad Hoc Rescue
Psychologically, it is understandable that you would try to rescue a cherished belief from trouble. When faced with conflicting data, you are likely to mention how the conflict will disappear if some new assumption is taken into account. However, if there is no good reason to accept this saving assumption other than that it works to save your cherished belief, your rescue is an Ad Hoc Rescue.
Example:
Yolanda: If you take four of these tablets of vitamin C every day, you will never get a cold.
Juanita: I tried that last year for several months, and still got a cold.
Yolanda: Did you take the tablets every day?
Juanita: Yes.
Yolanda: Well, I’ll bet you bought some bad tablets.
The burden of proof is definitely on Yolanda’s shoulders to prove that Juanita’s vitamin C tablets were probably “bad”—that is, not really vitamin C. If Yolanda can’t do so, her attempt to rescue her hypothesis (that vitamin C prevents colds) is simply a dogmatic refusal to face up to the possibility of being wrong.
Ad Hominem
Your reasoning contains this fallacy if you make an irrelevant attack on the arguer and suggest that this attack undermines the argument itself. “Ad Hominem” means “to the person” as in being “directed at the person.”
Example:
What she says about Johannes Kepler’s astronomy of the 1600s must be just so much garbage. Do you realize she’s only fifteen years old?
This attack may undermine the young woman’s credibility as a scientific authority, but it does not undermine her reasoning itself because her age is irrelevant to the quality of her reasoning. That reasoning should stand or fall on the scientific evidence, not on the arguer’s age or anything else about her personally.
The major difficulty with labeling a piece of reasoning an Ad Hominem Fallacy is deciding whether the personal attack is relevant or irrelevant. For example, attacks on a person for their immoral sexual conduct are irrelevant to the quality of their mathematical reasoning, but they are relevant to arguments promoting the person for a leadership position in a church or mosque.
If the fallacious reasoner points out irrelevant circumstances that the reasoner is in, such as the arguer’s having a vested interest in people accepting the position, then the ad hominem fallacy may be called a Circumstantial Ad Hominem. If the fallacious attack points out some despicable trait of the arguer, it may be called an Abusive Ad Hominem. An Ad Hominem that attacks an arguer by attacking the arguer’s associates is called the Fallacy of Guilt by Association. If the fallacy focuses on a complaint about the origin of the arguer’s views, then it is a kind of Genetic Fallacy. If the fallacy is due to claiming the person does not practice what is preached, it is the Tu Quoque Fallacy. Two Wrongs do not Make a Right is also a type of Ad Hominem fallacy.
The intentional use of the ad hominem fallacy is a tactic used by all dictators and authoritarian leaders. If you say something critical of them or their regime, their immediate response is to attack you as unreliable, or as being a puppet of the enemy, or as being a traitor.
Affirming the Consequent
If you have enough evidence to affirm the consequent of a conditional and then suppose that as a result you have sufficient reason for affirming the antecedent, your reasoning contains the Fallacy of Affirming the Consequent. This formal fallacy is often mistaken for Modus Ponens, which is a valid form of reasoning also using a conditional. A conditional is an if-then statement; the if-part is the antecedent, and the then-part is the consequent. The following argument affirms the consequent that she does speak Portuguese. Its form is an invalid form.
Example:
If she’s Brazilian, then she speaks Portuguese. Hey, she does speak Portuguese. So, she is Brazilian.
Noticing that she speaks Portuguese suggests that she might be Brazilian, but it is weak evidence by itself, and if the argument is assessed by deductive standards, then it is deductively invalid. That is, if the arguer believes or suggests that her speaking Portuguese definitely establishes that she is Brazilian, then the argumentation contains the Fallacy of Affirming the Consequent.
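Set side by side as schemas (standard propositional forms, not specific to this article’s example), the contrast is:

\[
\text{Modus Ponens (valid):}\quad p \rightarrow q,\ p\ \therefore\ q
\qquad\qquad
\text{Affirming the Consequent (invalid):}\quad p \rightarrow q,\ q\ \therefore\ p
\]

The second schema fails because q can hold for reasons other than p: she may speak Portuguese because she is from Portugal, Angola, or Mozambique, or simply learned the language in school.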
Ambiguity
Any fallacy that turns on ambiguity. See the fallacies of Amphiboly, Accent, and Equivocation. Amphiboly is ambiguity of syntax. Equivocation is ambiguity of semantics. Accent is ambiguity of emphasis.
Amphiboly
This is an error due to taking a grammatically ambiguous phrase in two different ways during the reasoning.
Example:
Tests show that the dog is not part wolf, as the owner suspected.
Did the owner suspect the dog was part wolf, or was not part wolf? Who knows? The sentence is ambiguous, and needs to be rewritten to remove the fallacy. Unlike Equivocation, which is due to multiple meanings of a phrase, Amphiboly is due to syntactic ambiguity, that is, ambiguity caused by multiple ways of understanding the grammar of the phrase.
Anecdotal Evidence
This is fallacious generalizing on the basis of a story that provides an inadequate sample. If you discount evidence arrived at by systematic search or by testing in favor of a few firsthand stories, then your reasoning contains the fallacy of overemphasizing anecdotal evidence.
Example:
Yeah, I’ve read the health warnings on those cigarette packs and I know about all that health research, but my brother smokes, and he says he’s never been sick a day in his life, so I know smoking can’t really hurt you.
Anthropomorphism
This is the error of projecting uniquely human qualities onto something that isn’t human. Usually this occurs with projecting the human qualities onto animals, but when it is done to nonliving things, as in calling the storm cruel, the Pathetic Fallacy is created. It is also, but less commonly, called the Disney Fallacy or the Walt Disney Fallacy.
Example:
My dog is wagging his tail and running around me. Therefore, he knows that I love him.
The fallacy would be averted if the speaker had said “My dog is wagging his tail and running around me. Therefore, he is happy to see me.” Animals do not have the ability to ascribe knowledge to other beings such as humans. Your dog knows where it buried its bone, but not that you also know where the bone is.
Appeal to Authority
You appeal to authority if you back up your reasoning by saying that it is supported by what some authority says on the subject. Most reasoning of this kind is not fallacious, and much of our knowledge properly comes from listening to authorities. However, appealing to authority as a reason to believe something is fallacious whenever the authority appealed to is not really an authority in this particular subject, when the authority cannot be trusted to tell the truth, when authorities disagree on this subject (except for the occasional lone wolf), when the reasoner misquotes the authority, and so forth. Although spotting a fallacious appeal to authority often requires some background knowledge about the subject matter and about who is claimed to be the authority, in brief it can be said that we are reasoning fallaciously if we accept the words of a supposed authority when we should be suspicious of the authority’s words.
Example:
The moon is covered with dust because the president of our neighborhood association said so.
This is a Fallacious Appeal to Authority because, although the president is an authority on many neighborhood matters, you are given no reason to believe the president is an authority on the composition of the moon. It would be better to appeal to some astronomer or geologist. A TV commercial that gives you a testimonial from a famous film star who wears a Wilson watch and that suggests you, too, should wear that brand of watch is using a fallacious appeal to authority. The film star is an authority on how to act, not on which watch is best for you.
Appeal to Consequence
Arguing that a belief is false because it implies something you’d rather not believe. Also called Argumentum Ad Consequentiam.
Example:
That can’t be Senator Smith there in the videotape going into her apartment. If it were, he’d be a liar about not knowing her. He’s not the kind of man who would lie. He’s a member of my congregation.
Smith may or may not be the person in that videotape, but this kind of arguing should not convince us that it’s someone else in the videotape.
Appeal to Emotions
Your reasoning contains the Fallacy of Appeal to Emotions when you accept someone’s claim merely because their appeal arouses your feelings of anger, fear, grief, love, outrage, pity, pride, sexuality, sympathy, relief, and so forth. Example of appeal to relief from grief:
[The speaker knows he is talking to an aggrieved person whose house is worth much more than $100,000.] You had a great job and didn’t deserve to lose it. I wish I could help somehow. I do have one idea. Now your family needs financial security even more. You need cash. I can help you. Here is a check for $100,000. Just sign this standard sales agreement, and we can skip the realtors and all the headaches they would create at this critical time in your life.
There is nothing wrong with using emotions when you argue, but it’s a mistake to use emotions as the key premises or as tools to downplay relevant information. Regarding the Fallacy of Appeal to Pity, it is proper to pity people who have had misfortunes, but if as the person’s history instructor you accept Max’s claim that he earned an A on the history quiz because he broke his wrist while playing in your college’s last basketball game, then you’ve used the fallacy of appeal to pity.
Appeal to Ignorance
The Fallacy of Appeal to Ignorance comes in two forms: (1) Not knowing that a certain statement is true is taken to be a proof that it is false. (2) Not knowing that a statement is false is taken to be a proof that it is true. The fallacy occurs in cases where absence of evidence is not good enough evidence of absence. The fallacy uses an unjustified attempt to shift the burden of proof. The fallacy is also called “Argument from Ignorance.”
Example:
Nobody has ever proved to me there’s a God, so I know there is no God.
This kind of reasoning is generally fallacious. It would be proper reasoning only if the proof attempts were quite thorough, and it were the case that, if the being or object were to exist, then there would be a discoverable proof of this. Another common example of the fallacy involves ignorance of a future event: You people have been complaining about the danger of Xs ever since they were invented, but there’s never been any big problem with Xs, so there’s nothing to worry about.
Appeal to Money
The Fallacy of Appeal to Money uses the error of supposing that, if something costs a great deal of money, then it must be better, or supposing that if someone has a great deal of money, then they’re a better person in some way unrelated to having a great deal of money. Similarly it’s a mistake to suppose that if something is cheap it must be of inferior quality, or to suppose that if someone is poor financially then they’re poor at something unrelated to having money.
Example:
He’s rich, so he should be the president of our Parents and Teachers Organization.
Appeal to the People
If you suggest too strongly that someone’s claim or argument is correct simply because it’s what most everyone believes, then your reasoning contains the Fallacy of Appeal to the People. Similarly, if you suggest too strongly that someone’s claim or argument is mistaken simply because it’s not what most everyone believes, then your reasoning also uses the fallacy. Agreement with popular opinion is not necessarily a reliable sign of truth, and deviation from popular opinion is not necessarily a reliable sign of error, but if you assume it is and do so with enthusiasm, then you are using this fallacy. It is essentially the same as the fallacies of Ad Numerum, Appeal to the Gallery, Appeal to the Masses, Argument from Popularity, Argumentum ad Populum, Common Practice, Mob Appeal, Past Practice, Peer Pressure, and Traditional Wisdom. The “too strongly” mentioned above is important in the description of the fallacy because what most everyone believes is, for that reason, somewhat likely to be true, all things considered. However, the fallacy occurs when this degree of support is overestimated.
Example:
You should turn to channel 6. It’s the most watched channel this year.
This is fallacious because it implicitly accepts the questionable premise that the most watched channel this year is, for that reason alone, the best channel for you. If you stress the idea of appealing to a new idea held by the gallery, masses, mob, peers, people, and so forth, then it is a Bandwagon Fallacy.
Availability Heuristic
We have an unfortunate instinct to base an important decision on an easily recalled, dramatic example, even though we know the example is atypical. It is a specific version of the fallacy of Confirmation Bias.
Example:
I just saw a video of a woman dying by fire in a car crash because she was unable to unbuckle her seat belt as the flames increased in intensity. So, I am deciding today no longer to wear a seat belt when I drive.
This reasoning commits the Fallacy of the Availability Heuristic because the reasoner would realize, if he would stop and think for a moment, that a great many more lives are saved due to wearing seat belts rather than due to not wearing seat belts, and the video of the situation of the woman unable to unbuckle her seat belt in the car crash is an atypical situation. The name of this fallacy is not very memorable, but it is in common use.
Avoiding the Issue
A reasoner who is supposed to address an issue but instead goes off on a tangent is properly accused of using the Fallacy of Avoiding the Issue. Also called missing the point, straying off the subject, digressing, and not sticking to the issue.
Example:
A city official is charged with corruption for awarding contracts to his wife’s consulting firm. In speaking to a reporter about why he is innocent, the city official talks only about his wife’s conservative wardrobe, the family’s lovable dog, and his own accomplishments in supporting Little League baseball.
However, the fallacy isn’t used by a reasoner who says that some other issue must first be settled and then continues by talking about this other issue, provided the reasoner is correct in claiming this dependence of one issue upon the other.
Avoiding the Question
The Fallacy of Avoiding the Question is a type of Fallacy of Avoiding the Issue that occurs when the issue is how to answer some question. The fallacy occurs when someone’s answer doesn’t really respond to the question asked. The fallacy is also called “Changing the Question.”
Example:
Question: Would the Oakland Athletics be in first place if they were to win tomorrow’s game?
Answer: What makes you think they’ll win tomorrow’s game?
Bad Seed
Attempting to undermine someone’s reasoning by pointing out their “bad” family history, when it is an irrelevant point. See Genetic Fallacy.
Bandwagon
If you suggest that someone’s claim is correct simply because it’s what most everyone is coming to believe, then you’re using the Bandwagon Fallacy. Get up here with us on the wagon where the band is playing, and go where we go, and don’t think too much about the reasons. Because the fallacy trades on a belief’s growing popularity, it is also called the Fallacy of Appeal to Novelty; the Latin term is Argumentum ad Novitatem.
Example:
[Advertisement] More and more people are buying sports utility vehicles. It is time you bought one, too.
Like its close cousin, the Fallacy of Appeal to the People, the Bandwagon Fallacy needs to be carefully distinguished from properly defending a claim by pointing out that many people have studied the claim and have come to a reasoned conclusion that it is correct. What most everyone believes is likely to be true, all things considered, and if one defends a claim on those grounds, this is not a fallacious inference. What is fallacious is to be swept up by the excitement of a new idea or new fad and to unquestioningly give it too high a degree of your belief solely on the grounds of its new popularity, perhaps thinking simply that ‘new is better.’ The key ingredient that is missing from a bandwagon fallacy is evidence that the item is popular because of its high quality.
Begging the Question
A form of circular reasoning in which a conclusion is derived from premises that presuppose the conclusion. Normally, the point of good reasoning is to start out at one place and end up somewhere new, namely having reached the goal of increasing the degree of reasonable belief in the conclusion. The point is to make progress, but in cases of begging the question there is no progress, and the arguer is essentially arguing by repeating the point.
Example:
“Women have rights,” said the Bullfighters Association president. “But women shouldn’t fight bulls because a bullfighter is and should be a man.”
The president is saying basically that women shouldn’t fight bulls because women shouldn’t fight bulls. This reasoning isn’t making any progress.
Insofar as the conclusion of a deductively valid argument is “contained” in the premises from which it is deduced, this containing might seem to be a case of presupposing, and thus any deductively valid argument might seem to be begging the question. It is still an open question among logicians as to why some deductively valid arguments are considered to be begging the question and others are not. Some logicians suggest that, in informal reasoning with a deductively valid argument, if the conclusion is psychologically new insofar as the premises are concerned, then the argument isn’t an example of the fallacy. Other logicians suggest that we need to look instead to surrounding circumstances, not to the psychology of the reasoner, in order to assess the quality of the argument. For example, we need to look to the reasons that the reasoner used to accept the premises. Was the premise justified on the basis of accepting the conclusion? A third group of logicians say that, in deciding whether the fallacy is present, more evidence is needed. We must determine whether any premise that is key to deducing the conclusion is adopted rather blindly or instead is a reasonable assumption made by someone accepting their burden of proof. The premise would here be termed reasonable if the arguer could defend it independently of accepting the conclusion that is at issue.
Beside the Point
Arguing for a conclusion that is not relevant to the current issue. Also called Irrelevant Conclusion. It is a form of the Red Herring Fallacy.
Biased Generalizing
Generalizing from a biased sample. Using an unrepresentative sample and overestimating the strength of an argument based on that sample.
See Unrepresentative Sample.
Black-or-White
The Black-or-White fallacy or Black-White fallacy is a False Dilemma Fallacy that limits you unfairly to only two choices, as if you were made to choose between black and white.
Example:
Well, it’s time for a decision. Will you contribute $20 to our environmental fund, or are you on the side of environmental destruction?
A proper challenge to this fallacy could be to say, “I do want to prevent the destruction of our environment, but I don’t want to give $20 to your fund. You are placing me between a rock and a hard place.” The key to diagnosing the Black-or-White Fallacy is to determine whether the limited menu is fair or unfair. Simply saying, “Will you contribute $20 or won’t you?” is not unfair. The black-or-white fallacy is often committed intentionally in jokes such as: “My toaster has two settings—burnt and off.” In thinking about this kind of fallacy it is helpful to remember that everything is either black or not black, but not everything is either black or white.
Caricaturization
Attacking a person’s argument by presenting a caricaturization is a form of the Straw Man Fallacy and the Ad Hominem Fallacy. A critical thinker should attack the real man and his argument, not a caricaturization of the man or the argument. Ditto for women, of course. The fallacy is a form of the Straw Man Fallacy because, ideally, an argument should not be assessed by a technique that unfairly misrepresents it. The Caricaturization Fallacy is the same as the Fallacy of Refutation by Caricature.
Cherry-Picking the Evidence is another name for the Fallacy of Suppressed Evidence.
Circular Reasoning
The Fallacy of Circular Reasoning occurs when the reasoner begins with what he or she is trying to end up with.
Here is Steven Pinker’s example:
Definition: endless loop, n. See loop, endless.
Definition: loop, endless, n. See endless loop.
The most well known examples of circular reasoning are cases of the Fallacy of Begging the Question. Here the circle is as short as possible. However, if the circle is very much larger, including a wide variety of claims and a large set of related concepts, then the circular reasoning can be informative and so is not considered to be fallacious. For example, a dictionary contains a large circle of definitions that use words which are defined in terms of other words that are also defined in the dictionary. Because the dictionary is so informative, it is not considered as a whole to be fallacious. However, a small circle of definitions is considered to be fallacious.
In properly-constructed recursive definitions, defining a term by using that same term is not fallacious. For example, here is an appropriate recursive definition of the term “a stack of coins.” Basis step: Two coins, with one on top of the other, is a stack of coins. Recursion step: If p is a stack of coins, then adding a coin on top of p produces a stack of coins. For a deeper discussion of circular reasoning see Infinitism in Epistemology.
Common Cause
This fallacy occurs during causal reasoning when a causal connection between two kinds of events is claimed when evidence is available indicating that both are the effect of a common cause.
Example:
Noting that the auto accident rate rises and falls with the rate of use of windshield wipers, one concludes that the use of wipers is somehow causing auto accidents.
However, it’s the rain that’s the common cause of both.
Complex Question
You use this fallacy when you frame a question so that some controversial presupposition is made by the wording of the question.
Example:
[Reporter’s question] Mr. President: Are you going to continue your policy of wasting taxpayer’s money on missile defense?
The question unfairly presumes the controversial claim that the policy really is a waste of money. The Fallacy of Complex Question is a form of Begging the Question.
Composition
The Composition Fallacy occurs when someone mistakenly assumes that a characteristic of some or all the individuals in a group is also a characteristic of the group itself, the group “composed” of those members. It is the converse of the Division Fallacy.
Example:
Each human cell is very lightweight, so a human being composed of cells is also very lightweight.
Confirmation Bias
The tendency to look for evidence in favor of one’s controversial hypothesis and not to look for disconfirming evidence, or to pay insufficient attention to it. This is the most common kind of Fallacy of Selective Attention, and it is the foundation of many conspiracy theories.
Example:
She loves me, and there are so many ways that she has shown it. When we signed the divorce papers in her lawyer’s office, she wore my favorite color. When she slapped me at the bar and called me a “handsome pig,” she used the word “handsome” when she didn’t have to. When I called her and she said never to call her again, she first asked me how I was doing and whether my life had changed. When I suggested that we should have children in order to keep our marriage together, she laughed. If she can laugh with me, if she wants to know how I am doing and whether my life has changed, and if she calls me “handsome” and wears my favorite color on special occasions, then I know she really loves me.
Using the Fallacy of Confirmation Bias is usually a sign that one has adopted some belief dogmatically and isn’t willing to disconfirm the belief, or is too willing to interpret ambiguous evidence so that it conforms to what one already believes. Confirmation bias often reveals itself in the fact that people of opposing views can each find support for those views in the same piece of evidence.
Conjunction
Mistakenly supposing that event E is less likely than the conjunction of events E and F. Here is an example from the psychologists Daniel Kahneman and Amos Tversky.
Example:
Suppose you know that Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice. Then you are asked to choose which is more likely: (A) Linda is a bank teller or (B) Linda is a bank teller and active in the feminist movement. If you choose (B) you commit the Conjunction Fallacy.
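The mistake can be checked with simple arithmetic: for any two events E and F, the probability of E-and-F equals the probability of E times the probability of F given E, so it can never exceed the probability of E alone. The Python sketch below illustrates this; the particular probabilities assigned to Linda are invented purely for illustration.

    # Why the Conjunction Fallacy is an error: P(E and F) can never exceed P(E).
    # The probabilities below are invented for illustration only.
    p_teller = 0.05                 # P(E): Linda is a bank teller
    p_feminist_given_teller = 0.90  # P(F given E): generously high, given her profile

    # P(E and F) = P(E) * P(F given E), so it is at most P(E),
    # no matter how well F seems to fit the description of Linda.
    p_both = p_teller * p_feminist_given_teller

    print(f"P(bank teller)              = {p_teller:.3f}")
    print(f"P(bank teller AND feminist) = {p_both:.3f}")
    assert p_both <= p_teller  # the conjunction is never the more likely option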
Confusing an Explanation with an Excuse
Treating someone’s explanation of a fact as if it were a justification of the fact. Explaining a crime should not be confused with excusing the crime, but it too often is.
Example:
Speaker: The German atrocities committed against the French and Belgians during World War I were in part due to the anger of German soldiers who learned that French and Belgian soldiers were ambushing German soldiers, shooting them in the back, or even poisoning, blinding and castrating them.
Respondent: I don’t understand how you can be so insensitive as to condone those German atrocities.
Consensus Gentium
Fallacy of Argumentum Consensus Gentium (argument from the consensus of the nations). See Traditional Wisdom.
Converse Accident
If we reason by paying too much attention to exceptions to the rule, and generalize on the exceptions, our reasoning contains this fallacy. This fallacy is the converse of the Accident Fallacy. It is a kind of Hasty Generalization, by generalizing too quickly from a peculiar case.
Example:
I’ve heard that turtles live longer than tarantulas, but the one turtle I bought lived only two days. I bought it at Dowden’s Pet Store. So, I think that turtles bought from pet stores do not live longer than tarantulas.
The original generalization is “Turtles live longer than tarantulas.” There are exceptions, such as the turtle bought from the pet store. Rather than seeing this for what it is, namely an exception, the reasoner places too much trust in this exception and generalizes on it to produce the faulty generalization that turtles bought from pet stores do not live longer than tarantulas.
Cum Hoc, Ergo Propter Hoc
Latin for “with this, therefore because of this.” This is a False Cause Fallacy that doesn’t depend on time order (as does the post hoc fallacy), but on any other chance correlation of the supposed cause being in the presence of the supposed effect.
Example:
Loud musicians live near our low-yield cornfields. So, loud musicians must be causing the low yield.
Curve Fitting
Curve fitting is the process of constructing a curve that has the best fit to a series of data points. The curve is a graph of some mathematical function. The function or functional relationship might be between variable x and variable y, where x is the time of day and y is the temperature of the ocean. When you collect data about some relationship, you inevitably collect information that is affected by noise or statistical fluctuation. If you create a function between x and y that is too sensitive to your data, you will be overemphasizing the noise and producing a function that has less predictive value than it could have. If you create your function by interpolating, that is, by drawing straight line segments between all the adjacent data points, or if you create a polynomial function that exactly fits every data point, it is likely that your function will be worse than if you’d produced a function with a smoother curve. Your original error of too closely fitting the data points is called the Fallacy of Curve Fitting or the Fallacy of Overfitting.
Example:
You want to know the temperature of the ocean today, so you measure it at 8:00 A.M. with one thermometer and get the temperature of 60.1 degrees. Then you measure the ocean at 8:05 A.M. with a different thermometer and get the temperature of 60.2 degrees; then at 8:10 A.M. you get 59.1 degrees, perhaps with the first thermometer, and so on. If you fit your curve exactly to your data points, then you falsely imply that the ocean’s temperature is shifting all around every five minutes. However, the temperature is probably constant, and the problem is that your prediction is too sensitive to your data, so your curve fits the data points too closely.
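A small numerical sketch can make the point concrete. In the Python fragment below the temperature readings are simulated rather than real, and the noise level is an assumption; the high-degree polynomial chases every noisy reading, while the simpler fit predicts better beyond the data.

    # A sketch of overfitting with simulated ocean-temperature readings.
    import numpy as np

    rng = np.random.default_rng(0)
    times = np.arange(10, dtype=float)        # one reading every 5 minutes
    readings = 60.0 + rng.normal(0, 0.5, 10)  # truly constant temperature + noise

    linear = np.polyfit(times, readings, 1)   # smooth model: 2 parameters
    wiggly = np.polyfit(times, readings, 7)   # overfit model: chases the noise

    t_next = 10.5                             # predict just beyond the data
    print("linear fit predicts:  ", np.polyval(linear, t_next))
    print("degree-7 fit predicts:", np.polyval(wiggly, t_next))
    # The degree-7 curve hugs the data points, yet its out-of-sample prediction
    # is typically far worse -- the Fallacy of Overfitting in numerical form.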
Definist
The Definist Fallacy occurs when someone unfairly defines a term so that a controversial position is made easier to defend. Same as the Persuasive Definition.
Example:
During a controversy about the truth or falsity of atheism, the fallacious reasoner says, “Let’s define ‘atheist’ as someone who doesn’t yet realize that God exists.”
Denying the Antecedent
You are using this fallacy if you deny the antecedent of a conditional and then suppose that doing so is a sufficient reason for denying the consequent. This formal fallacy is often mistaken for Modus Tollens, a valid form of argument using the conditional. A conditional is an if-then statement; the if-part is the antecedent, and the then-part is the consequent.
Example:
If she were Brazilian, then she would know that Brazil’s official language is Portuguese. She isn’t Brazilian; she’s from London. So, she surely doesn’t know this about Brazil’s language.
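Because Denying the Antecedent is a formal fallacy, its invalidity can be verified mechanically by searching the truth table for a counterexample. The following Python sketch does just that, contrasting the fallacy with the valid Modus Tollens.

    # Truth-table check: Denying the Antecedent is invalid; Modus Tollens is valid.
    from itertools import product

    def implies(p, q):
        return (not p) or q  # the material conditional "if p then q"

    def valid(premises, conclusion):
        # Valid iff no assignment makes every premise true and the conclusion false.
        return all(conclusion(p, q)
                   for p, q in product([True, False], repeat=2)
                   if all(prem(p, q) for prem in premises))

    # Denying the Antecedent: from (P -> Q) and not-P, infer not-Q.
    print(valid([implies, lambda p, q: not p], lambda p, q: not q))  # False: invalid
    # Modus Tollens: from (P -> Q) and not-Q, infer not-P.
    print(valid([implies, lambda p, q: not q], lambda p, q: not p))  # True: valid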
Disregarding Known Science
This fallacy is committed when a person makes a claim that knowingly or unknowingly disregards well known science, science that weighs against the claim. They should know better. This fallacy is a form of the Fallacy of Suppressed Evidence.
Example:
John claims in his grant application that he will be studying the causal effectiveness of bone color on the ability of leg bones to support indigenous New Zealand mammals. He disregards well known scientific knowledge that color is not what causes any bones to work the way they do by saying that this knowledge has never been tested in New Zealand.
Division
Merely because a group as a whole has a characteristic, it often doesn’t follow that individuals in the group have that characteristic. If you suppose that it does follow, when it doesn’t, your reasoning contains the Fallacy of Division. It is the converse of the Composition Fallacy.
Example:
Joshua’s soccer team is the best in the division because it had an undefeated season and won the division title, so their goalie must be the best in the division.
Aristotle gave this example of division: The number 5 is 2 and 3. But 2 is even and 3 is odd, so 5 is even and odd.
Double Standard
There are many situations in which you should judge two things or people by the same standard. If in one of those situations you use different standards for the two, your reasoning contains the Fallacy of Using a Double Standard.
Example:
I know we will hire any man who gets over a 70 percent on the screening test for hiring Post Office employees, but women should have to get an 80 to be hired because they often have to take care of their children.
This example is a fallacy if it can be presumed that men and women should have to meet the same standard for becoming a Post Office employee.
Equivocation
Equivocation is the illegitimate switching of the meaning of a term that occurs twice during the reasoning; it is the use of one word taken in two ways. The fallacy is a kind of Fallacy of Ambiguity.
Example:
Brad is a nobody, but since nobody is perfect, Brad must be perfect, too.
The term “nobody” changes its meaning without warning in the passage. Equivocation can sometimes be very difficult to detect, as in this argument from Walter Burleigh:
If I call you a swine, then I call you an animal.
If I call you an animal, then I’m speaking the truth.
Therefore, if I call you a swine, then I’m speaking the truth.
Etymological
The Etymological Fallacy occurs whenever someone falsely assumes that the meaning of a word can be discovered from its etymology or origins.
Example:
The word “vise” comes from the Latin “that which winds,” so it means anything that winds. Since a hurricane winds around its own eye, it is a vise.
Every and All
The Fallacy of Every and All turns on errors due to the order or scope of the quantifiers “every” and “all” and “any.” This is a version of the Scope Fallacy.
Example:
Every action of ours has some final end. So, there is some common final end to all our actions.
In proposing this fallacious argument, Aristotle believed the common end is the supreme good, so he had a rather optimistic outlook on the direction of history.
Exaggeration
When we overstate or overemphasize a point that is a crucial step in a piece of reasoning, then we are guilty of the Fallacy of Exaggeration. This is a kind of error called Lack of Proportion.
Example:
She’s practically admitted that she intentionally yelled at that student while on the playground in the fourth grade. That’s verbal assault. Then she said nothing when the teacher asked, “Who did that?” That’s lying, plain and simple. Do you want to elect as secretary of this club someone who is a known liar prone to assault? Doing so would be a disgrace to our Collie Club.
When we exaggerate in order to make a joke, though, we do not use the fallacy because we do not intend to be taken literally.
False Analogy
When reasoning by analogy, this fallacy occurs when the items in the analogy are too dissimilar, that is, when the analogy is irrelevant or very weak, or when there is a more relevant disanalogy. See also Faulty Comparison.
Example:
The book Investing for Dummies really helped me understand my finances better. The book Chess for Dummies was written by the same author, was published by the same press, and costs about the same amount. So, this chess book would probably help me understand my finances, too.
False Balance
A specific form of the False Equivalence Fallacy that occurs in the context of news reporting, in which the reporter misleads the audience by suggesting the evidence on two sides of an issue is equally balanced, when the reporter knows that one of the two sides is an extreme outlier. Reporters regularly commit this fallacy in order to appear “fair and balanced.”
Example:
The news report of yesterday’s city council meeting says, “David Samsung challenged the council by saying the Gracie Mansion is haunted, so it should not be torn down. Councilwoman Miranda Gonzales spoke in favor of dismantling the old mansion saying its land is needed for an expansion of the water treatment facility. Both sides seemed quite fervent in promoting their position.” Then the news report stops there, covering up the facts that the preponderance of scientific evidence implies there is no such thing as being haunted, and that David Samsung is the well known “village idiot” who last month came before the council demanding a tax increase for Santa Claus’ workers at the North Pole.
False Cause
Improperly concluding that one thing is a cause of another. The Fallacy of Non Causa Pro Causa is another name for this fallacy. Its four principal kinds are the Post Hoc Fallacy, the Fallacy of Cum Hoc, Ergo Propter Hoc, the Regression Fallacy, and the Fallacy of Reversing Causation.
Example:
My psychic adviser says to expect bad things when Mars is aligned with Jupiter. Tomorrow Mars will be aligned with Jupiter. So, if a dog were to bite me tomorrow, it would be because of the alignment of Mars with Jupiter.
False Dilemma
A reasoner who unfairly presents too few choices and then implies that a choice must be made among this short menu of choices is using the False Dilemma Fallacy, as does the person who accepts this faulty reasoning.
Example:
A pollster asks you this question about your job: “Would you say your employer is drunk on the job about (a) once a week, (b) twice a week, or (c) more times per week?”
The pollster is committing the fallacy by limiting you to only those choices. What about the choice of “no times per week”? Think of the unpleasant choices as being the horns of a bull that is charging toward you. By demanding other choices beyond those on the unfairly limited menu, you thereby “go between the horns” of the dilemma, and are not gored. The fallacy is called the “False Dichotomy Fallacy” or the “Black-or-White” Fallacy when the unfair menu contains only two choices, and thus two horns.
False Equivalence
The Fallacy of False Equivalence is committed when someone implies falsely (and usually indirectly) that the two sides on some issue have basically equivalent evidence, while knowingly covering up the fact that one side’s evidence is much weaker. A form of the Fallacy of Suppressed Evidence.
Example:
A popular science article suggests there is no consensus about the Earth’s age, by quoting one geologist who says she believes the Earth is billions of years old, and then by quoting Bible expert James Ussher who says he calculated from the Bible that the world began on October 23, 4004 B.C.E. The article suppresses the evidence that geologists (who are the relevant experts on this issue) have reached a consensus that the Earth is billions of years old.
Far-Fetched Hypothesis
This is the fallacy of offering a bizarre (far-fetched) hypothesis as the correct explanation without first ruling out more mundane explanations.
Example:
Look at that mutilated cow in the field, and see that flattened grass. Aliens must have landed in a flying saucer and savaged the cow to learn more about the beings on our planet.
Faulty Comparison
If you try to make a point about something by comparison, and if you do so by comparing it with the wrong thing, then your reasoning uses the Fallacy of Faulty Comparison or the Fallacy of Questionable Analogy.
Example:
We gave half the members of the hiking club Durell hiking boots and the other half good-quality tennis shoes. After three months of hiking, you can see for yourself that Durell lasted longer. You, too, should use Durell when you need hiking boots.
Shouldn’t Durell hiking boots be compared with other hiking boots, not with tennis shoes?
Faulty Motives
An irrelevant appeal to the motives of the arguer, and supposing that this revelation of their motives will thereby undermine their reasoning. A kind of Ad Hominem Fallacy.
Example:
The councilman’s argument for the new convention center can’t be any good because he stands to gain if it’s built.
Formal Fallacy
Formal fallacies are kinds of reasoning that fail to be deductively valid; that is, they are deductively invalid arguments that are too often believed to be deductively valid. Formal fallacies are also called Logical Fallacies or Invalidities.
Example:
Some cats are tigers. Some tigers are animals. So, some cats are animals.
This might at first seem to be a good argument, but actually it is fallacious because it has the same logical form as the following more obviously invalid argument:
Some women are Americans. Some Americans are men. So, some women are men.
Nearly all of the infinitely many types of invalid inference lack specific fallacy names.
Four Terms
The Fallacy of Four Terms (quaternio terminorum) occurs when four rather than three categorical terms are used in a standard-form syllogism.
Example:
All rivers have banks. All banks have vaults. So, all rivers have vaults.
The word “banks” occurs as two distinct terms, namely river bank and financial bank, so this example also is an equivocation. Without an equivocation, the four term fallacy is trivially invalid.
Gambler’s
This fallacy occurs when the gambler falsely assumes that the history of outcomes will affect future outcomes.
Example:
I know this is a fair coin, but it has come up heads five times in a row now, so tails is due on the next toss.
The fallacious move was to conclude that the probability of the next toss coming up tails must be more than a half. The assumption that it’s a fair coin is important because, if the coin comes up heads five times in a row, one would otherwise become suspicious that it’s not a fair coin and therefore properly conclude that heads is more likely on the next toss.
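A brief simulation, offered only as an illustrative sketch, shows that a fair coin has no memory: even immediately after five heads in a row, tails still comes up about half the time.

    # Simulate a fair coin: five heads in a row do not make tails "due."
    import random

    random.seed(42)
    tails_after_streak = streaks = 0
    for _ in range(1_000_000):
        flips = [random.random() < 0.5 for _ in range(6)]  # True = heads
        if all(flips[:5]):                                 # found a run of five heads
            streaks += 1
            tails_after_streak += not flips[5]

    print(f"runs of five heads observed: {streaks}")
    print(f"P(tails on the next toss) ~= {tails_after_streak / streaks:.3f}")  # ~0.5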
Genetic
A critic uses the Genetic Fallacy if the critic attempts to discredit or support a claim or an argument because of its origin (genesis) when such an appeal to origins is irrelevant.
Example:
Whatever your reasons are for buying that gift, they’ve got to be ridiculous. You said yourself that you got the idea for buying it from last night’s fortune cookie. Cookies can’t think!
Fortune cookies are not reliable sources of information about what gift to buy, but the reasons the person is willing to give are likely to be quite relevant and should be listened to. The speaker is committing the Genetic Fallacy by paying too much attention to the genesis of the idea rather than to the reasons offered for it.
If I learn that your plan for building the shopping center next to the Johnson estate originated with Johnson himself, who is likely to profit from the deal, then my request that the planning commission not accept your proposal without independent verification of its merits wouldn’t be committing the genetic fallacy. Because appeals to origins are sometimes relevant and sometimes irrelevant and sometimes on the borderline, in those latter cases it can be very difficult to decide whether the fallacy has been committed. For example, if Sigmund Freud shows that the genesis of a person’s belief in God is their desire for a strong father figure, then does it follow that their belief in God is misplaced, or is Freud’s reasoning committing the Genetic Fallacy?
Group Think
A reasoner uses the Group Think Fallacy if he or she substitutes pride of membership in the group for reasons to support the group’s policy. If that’s what our group thinks, then that’s good enough for me. It’s what I think, too. “Blind” patriotism is a rather nasty version of the fallacy.
Example:
We K-Mart employees know that K-Mart brand items are better than Wal-Mart brand items because, well, they are from K-Mart, aren’t they?
Guilt by Association
Guilt by Association is a version of the Ad Hominem Fallacy in which a person is said to be guilty of error because of the group he or she associates with. The fallacy occurs when we unfairly try to change the issue to be about the speaker’s circumstances rather than about the speaker’s actual argument. Also called “Ad Hominem, Circumstantial.”
Example:
Secretary of State Dean Acheson is too soft on communism, as you can see by his inviting so many fuzzy-headed liberals to his White House cocktail parties.
Has any evidence been presented here that Acheson’s actions are inappropriate in regards to communism? This sort of reasoning is an example of McCarthyism, the technique of smearing liberal Democrats that was so effectively used by the late Senator Joe McCarthy in the early 1950s. In fact, Acheson was strongly anti-communist and the architect of President Truman’s firm policy of containing Soviet power.
Hasty Generalization
Example:
I’ve met two people in Nicaragua so far, and they were both nice to me. So, all people I will meet in Nicaragua will be nice to me.
In any Hasty Generalization the key error is to overestimate the strength of an argument that is based on too small a sample for the implied confidence level or error margin. In this argument about Nicaragua, using the word “all” in the conclusion implies zero error margin. With zero error margin you’d need to sample every single person in Nicaragua, not just two people.
Hedging
You are hedging if you refine your claim simply to avoid counterevidence and then act as if your revised claim is the same as the original.
Example:
Samantha: David is a totally selfish person.
Yvonne: I thought he was a Boy Scout leader. Don’t you have to give a lot of your time for that?
Samantha: Well, David’s totally selfish about what he gives money to. He won’t spend a dime on anyone else.
Yvonne: I saw him bidding on things at the high school auction fundraiser.
Samantha: Well, except for that he’s totally selfish about money.
You do not use the fallacy if you explicitly accept the counterevidence, admit that your original claim is incorrect, and then revise it so that it avoids that counterevidence.
Hooded Man
This is an error in reasoning due to confusing the knowing of a thing with the knowing of it under all its various names or descriptions.
Example:
You claim to know Socrates, but you must be lying. You admitted you didn’t know the hooded man over there in the corner, but the hooded man is Socrates.
Hyperbolic Discounting
The Fallacy of Hyperbolic Discounting occurs when someone too heavily weighs the importance of a present reward over a significantly greater reward in the near future, but only slightly differs in their valuations of those two rewards if they are to be received in the far future. The person’s preferences are biased toward the present.
Example:
When asked to decide between receiving an award of $50 now or $60 tomorrow, the person chooses the $50; however, when asked to decide between receiving $50 in two years or $60 in two years and one day, the person chooses the $60.
If the person is in a situation in which $50 now will solve their problem but $60 tomorrow will not, then there is no fallacy in having a bias toward the present.
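Psychologists commonly model this bias with a hyperbolic discount function such as V = A / (1 + kD), where A is the amount, D is the delay, and k is a discount rate. The sketch below uses an assumed rate of k = 0.3 per day, chosen purely to reproduce the preference reversal in the example; the numbers are illustrative, not empirical.

    # Hyperbolic discounting: the same $50-vs-$60 choice flips with a long delay.
    def hyperbolic_value(amount, delay_days, k=0.3):
        return amount / (1 + k * delay_days)

    for base_delay in (0, 730):  # choose now, or choose two years (~730 days) out
        v_small = hyperbolic_value(50, base_delay)      # $50 sooner
        v_large = hyperbolic_value(60, base_delay + 1)  # $60 one day later
        pick = "$50 sooner" if v_small > v_large else "$60 later"
        print(f"delay {base_delay:>3} days: picks {pick} "
              f"({v_small:.2f} vs {v_large:.2f})")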
Hypostatization
The error of inappropriately treating an abstract term as if it were a concrete one. Also known as the Fallacy of Misplaced Concreteness and the Fallacy of Reification.
Example:
Nature decides which organisms live and which die.
Nature isn’t capable of making decisions. The point can be made without reasoning fallaciously by saying: “Which organisms live and which die is determined by natural causes.” Whether a phrase commits the fallacy depends crucially upon whether the use of the inaccurate phrase is inappropriate in the situation. In a poem, it is appropriate and very common to reify nature, hope, fear, forgetfulness, and so forth, that is, to treat them as if they were objects or beings with intentions. In any scientific claim, it is inappropriate.
Ideology-Driven Argumentation
This occurs when an arguer presupposes some aspect of their own ideology that they are unable to defend.
Example:
Senator, if you pass that bill to relax restrictions on gun ownership and allow people to carry concealed handguns, then you are putting your own voters at risk.
The arguer is presupposing a liberal ideology which implies that permitting private citizens to carry concealed handguns increases crime and decreases safety. If the arguer is unable to defend this presumption, then the fallacy is committed regardless of whether the presumption is defensible. If the senator were to accept this liberal ideology, then the senator is likely to accept the arguer’s conclusion, and the argument could be considered to be effective, but still it would be fallacious—such is the difference between rhetoric and logic.
Inconsistency
The fallacy occurs when we accept an inconsistent set of claims, that is, when we accept a claim that logically conflicts with other claims we hold.
Example:
I never generalize because everyone who does is a hypocrite.
That last remark implies the speaker does generalize, although the speaker doesn’t notice this inconsistency with what is said.
Inductive Conversion
Improperly reasoning from a claim of the form “All As are Bs” to “All Bs are As” or from one of the form “Many As are Bs” to “Many Bs are As” and so forth.
Example:
Most professional basketball players are tall, so most tall people are professional basketball players.
The term “conversion” is a technical term in formal logic.
Insufficient Statistics
Drawing a statistical conclusion from a set of data that is clearly too small.
Example:
A pollster interviews ten London voters in one building about which candidate for mayor they support, and upon finding that Churchill receives support from six of the ten, declares that Churchill has the majority support of London voters.
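How weak a ten-person sample is can be quantified. The usual 95 percent margin of error for a sample proportion is about 1.96 times the square root of p(1−p)/n. The sketch below applies that formula to the poll in the example; it also charitably assumes a random sample, which interviews in one building certainly are not.

    # Margin of error for the ten-voter poll in the example above.
    import math

    def margin_of_error(p_hat, n, z=1.96):  # z = 1.96 for a 95% confidence level
        return z * math.sqrt(p_hat * (1 - p_hat) / n)

    p_hat = 6 / 10  # six of ten interviewees support Churchill
    for n in (10, 1000):
        print(f"n={n:>4}: {p_hat:.0%} +/- {margin_of_error(p_hat, n):.0%}")
    # n=10 yields roughly 60% +/- 30%, which is consistent with anything from
    # minority to near-unanimous support, so "majority" is not established.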
Intensional Fallacy
The mistake of treating different descriptions or names of the same object as equivalent even in those contexts in which the differences between them matter. Reporting someone’s beliefs or assertions or making claims about necessity or possibility can be such contexts. In these contexts, replacing a description with another that refers to the same object is not valid and may turn a true sentence into a false one.
Example:
Michelle said she wants to meet her new neighbor Stalnaker tonight. But I happen to know Stalnaker is a spy for North Korea, so Michelle said she wants to meet a spy for North Korea tonight.
Michelle said no such thing. The faulty reasoner illegitimately assumed that what is true of a person under one description will remain true when said of that person under a second description even in this context of indirect quotation. What was true of the person when described as “her new neighbor Stalnaker” is that Michelle said she wants to meet him, but it wasn’t legitimate for me to assume this is true of the same person when he is described as “a spy for North Korea.”
Extensional contexts are those in which it is legitimate to substitute equals for equals with no worry. But any context in which this substitution of co-referring terms is illegitimate is called an intensional context. Intensional contexts are produced by quotation, modality, and intentionality (propositional attitudes). Intensionality is failure of extensionality, thus the name “Intensional Fallacy”.
Invalid Reasoning
An invalid inference. An argument can be assessed by deductive standards to see if the conclusion would have to be true if the premises were to be true. If the argument cannot meet this standard, it is invalid. An argument is invalid only if it is not an instance of any valid argument form. The Fallacy of Invalid Reasoning is a formal fallacy.
Example:
If it’s raining, then there are clouds in the sky. It’s not raining. Therefore, there are no clouds in the sky.
This invalid argument is an instance of Denying the Antecedent. Any invalid inference that is also inductively very weak is a Non Sequitur.
Irrelevant Conclusion
The conclusion that is drawn is irrelevant to the premises; it misses the point.
Example:
In court, Thompson testifies that the defendant is an honorable person, who wouldn’t harm a flea. The defense attorney uses the fallacy by rising to say that Thompson’s testimony shows once again that his client was not near the murder scene.
The testimony of Thompson may be relevant to a request for leniency, but it is irrelevant to any claim about the defendant not being near the murder scene. Other examples of this fallacy are Ad Hominem, Appeal to Authority, Appeal to Emotions, and Argument from Ignorance.
Irrelevant Reason
This fallacy is a kind of Non Sequitur in which the premises are wholly irrelevant to drawing the conclusion.
Example:
Lao Tze Beer is the top selling beer in Thailand. So, it will be the best beer for Canadians.
Is-Ought
The Is-Ought Fallacy occurs when a conclusion expressing what ought to be so is inferred from premises expressing only what is so, in which it is supposed that no implicit or explicit ought-premises are needed. There is controversy in the philosophical literature regarding whether this type of inference is always fallacious.
Example:
He’s torturing the cat.
So, he shouldn’t do that.
This argument would not use the fallacy if there were an implicit premise indicating that he is a person and that persons should not torture other beings.
Jumping to Conclusions
It is not always a mistake to make a quick decision, but when we draw a conclusion without taking the trouble to acquire enough of the relevant evidence, our reasoning commits the fallacy of jumping to conclusions, provided there was sufficient time to acquire and assess that extra evidence, and provided that the extra effort it takes to get the evidence isn’t prohibitive.
Example:
This car is really cheap. I’ll buy it.
Hold on. Before concluding that you should buy it, ask yourself whether you need to buy another car and, if so, whether you should lease or rent or just borrow a car when you need to travel by car. If you do need to buy a car, you ought to have someone check its operating condition, or else you should make sure you get a guarantee about the car’s being in working order. And, if you stop to think about it, there may be other factors you should consider before making the purchase, such as its age, size, appearance, and mileage.
Lack of Proportion
The Fallacy of Lack of Proportion occurs either by exaggerating or downplaying or simply not noticing a point that is a crucial step in a piece of reasoning. You exaggerate when you make a mountain out of a molehill. You downplay when you suppress relevant evidence. The Genetic Fallacy blows the genesis of an idea out of proportion.
Example:
Did you hear about that tourist being mugged in Russia last week? And then there was the awful train wreck last year just outside Moscow where three of the twenty-five persons killed were tourists. I’ll never visit Russia.
The speaker is blowing these isolated incidents out of proportion. Millions of tourists visit Russia with no problems. Another example occurs when the speaker simply lacks the information needed to give a factor its proper proportion or weight:
I don’t use electric wires in my home because it is well known that the human body can be injured by electric and magnetic fields.
The speaker does not realize all experts agree that electric and magnetic fields caused by home wiring are harmless. However, touching the metal within those wires is very dangerous.
Line-Drawing
If we improperly reject a vague claim because it is not as precise as we’d like, then we are using the line-drawing fallacy. A claim’s being vague does not make it hopelessly vague. Also called the Bald Man Fallacy, the Fallacy of the Heap, and the Sorites Fallacy.
Example:
Dwayne can never grow bald. Dwayne isn’t bald now. Don’t you agree that if he loses one hair, that won’t make him go from not bald to bald? And if he loses one hair after that, then this one loss, too, won’t make him go from not bald to bald. Therefore, no matter how much hair he loses, he can’t become bald.
Loaded Language
Loaded language is emotive terminology that expresses value judgments. When used in what appears to be an objective description, the terminology unfortunately can cause the listener to adopt those values when in fact no good reason has been given for doing so. Also called Prejudicial Language.
Example:
[News broadcast] In today’s top stories, Senator Smith carelessly cast the deciding vote today to pass both the budget bill and the trailer bill to fund yet another excessive watchdog committee over coastal development.
This broadcast is an editorial posing as a news report.
Loaded Question
Asking a question in a way that unfairly presumes the answer. This fallacy occurs commonly in polls, especially push polls, which are polls designed to push information onto the person being polled and not designed to learn the person’s views.
Example:
“If you knew that candidate B was a liar and crook, would you support candidate A or instead candidate B who is neither a liar nor a crook?”
Logic Chopping
Obscuring the issue by using overly-technical logic tools, especially the techniques of formal symbolic logic, that focus attention on trivial details. A form of Smokescreen and Quibbling.
Lying
A fallacy of reasoning that depends on intentionally saying something that is known to be false. If the lying occurs in an argument’s premise, then it is an example of the Fallacy of Questionable Premise.
Example:
Abraham Lincoln, Theodore Roosevelt, and John Kennedy were assassinated.
They were U.S. presidents.
Therefore, at least three U.S. presidents have been assassinated.
The conclusion happens to be true, but the argument is built on a lie: Theodore Roosevelt was shot in 1912 and survived, so the first premise is knowingly false and lends the conclusion no proper support.
Misleading Vividness
When the Fallacy of Jumping to Conclusions is due to a special emphasis on an anecdote or other piece of evidence, then the Fallacy of Misleading Vividness has occurred.
Example:
Yes, I read the side of the cigarette pack about smoking being harmful to your health. That’s the Surgeon General’s opinion, him and all his statistics. But let me tell you about my uncle. Uncle Harry has smoked cigarettes for forty years now and he’s never been sick a day in his life. He even won a ski race at Lake Tahoe in his age group last year. You should have seen him zip down the mountain. He smoked a cigarette during the award ceremony, and he had a broad smile on his face. I was really proud. I can still remember the cheering. Cigarette smoking can’t be as harmful as people say.
The vivid anecdote is the story about Uncle Harry. Too much emphasis is placed on it and not enough on the statistics from the Surgeon General.
Misplaced Concreteness
Mistakenly supposing that something is a concrete object with independent existence, when it’s not. Also known as the Fallacy of Reification and the Fallacy of Hypostatization.
Example:
There are two footballs lying on the floor of an otherwise empty room. When asked to count all the objects in the room, John says there are three: the two balls plus the group of two.
John mistakenly supposed a group or set of concrete objects is also a concrete object.
A less metaphysical example would be a situation where John says a criminal was caught by K-9 aid, and thereby supposed that K-9 aid was some sort of concrete object. John could have expressed the same point less misleadingly by saying a K-9 dog aided in catching a criminal.
Misplaced Burden of Proof
Committing the error of trying to get someone else to prove you are wrong, when it is your responsibility to prove you are correct.
Example:
Person A: I saw a green alien from outer space.
Person B: What!? Can you prove it?
Person A: You can’t prove I didn’t.
If someone says, “I saw a green alien from outer space,” you properly should ask for some proof. If the person responds with no more than something like, “Prove I didn’t,” then they are not accepting their burden of proof and are improperly trying to place it on your shoulders.
Misrepresentation
Misrepresentation is describing something in a way that distorts what it really is. If the misrepresentation occurs on purpose, then it is an example of lying. If the misrepresentation occurs during a debate and distorts the opponent’s claim, then it would be the cause of a Straw Man Fallacy.
Modal Fallacy
This is the error of treating modal conditionals as if the modality applies only to the then-part of the conditional when it more properly applies to the entire conditional.
Example:
James has two children. If James has two children, then he necessarily has more than one child. So, it is necessarily true that James has more than one child.
This apparently valid argument is invalid. It is not necessarily true that James has more than one child; it’s merely true that he has more than one child. He could have had no children. It is logically possible that James has no children even though he actually has two. The solution to the fallacy is to see that the premise “If James has two children, then he necessarily has more than one child,” requires the modality “necessarily” to apply logically to the entire conditional “If James has two children, then he has more than one child” even though grammatically it applies only to “he has more than one child.” The Modal Fallacy is the most well known of the infinitely many errors involving modal concepts. Modal concepts include necessity, possibility, and so forth.
Naturalistic Fallacy
On a broad interpretation of this fallacy, it applies to any attempt to argue from an “is” to an “ought,” that is, from a list of facts to a conclusion about what ought to be done.
Example:
Because women are naturally capable of bearing and nursing children while men are not, women ought to be the primary caregivers of children.
Here is another example. Owners of financially successful companies are more successful than poor people in the competition for wealth, power and social status. Therefore, the poor deserve to be poor. There is considerable disagreement among philosophers regarding what sorts of arguments the term “Naturalistic Fallacy” legitimately applies to.
No True Scotsman
This error is a kind of Ad Hoc Rescue of one’s generalization in which the reasoner re-characterizes the situation solely in order to escape refutation of the generalization.
Example:
Smith: All Scotsmen are loyal and brave.
Jones: But McDougal over there is a Scotsman, and he was arrested by his commanding officer for running from the enemy.
Smith: Well, if that’s right, it just shows that McDougal wasn’t a TRUE Scotsman.
Non Causa Pro Causa
This label is Latin for mistaking the “non-cause for the cause.” See False Cause.
Non Sequitur
When a conclusion is supported only by extremely weak reasons or by irrelevant reasons, the argument is fallacious and is said to be a Non Sequitur. However, we usually apply the term only when we cannot think of how to label the argument with a more specific fallacy name. Any deductively invalid inference is a non sequitur if it is also very weak when assessed by inductive standards.
Example:
Nuclear disarmament is a risk, but everything in life involves a risk. Every time you drive in a car you are taking a risk. If you’re willing to drive in a car, you should be willing to have disarmament.
The following is not an example: “If she committed the murder, then there’d be his blood stains on her hands. His blood stains are on her hands. So, she committed the murder.” This deductively invalid argument uses the Fallacy of Affirming the Consequent, but it isn’t a non sequitur because it has significant inductive strength.
Obscurum per Obscurius
Explaining something obscure or mysterious by something that is even more obscure or more mysterious.
Example:
Let me explain what a lucky result is. It is a fortuitous collapse of the quantum mechanical wave packet that leads to a surprisingly pleasing result.
Being opposed to someone’s reasoning because of who they are, usually because of what group they are associated with. See the Fallacy of Guilt by Association.
Oversimplification
You oversimplify when you cover up relevant complexities or make a complicated problem appear to be too much simpler than it really is.
Example:
President Bush wants our country to trade with Fidel Castro’s Communist Cuba. I say there should be a trade embargo against Cuba. The issue in our election is Cuban trade, and if you are against it, then you should vote for me for president.
Whom to vote for should be decided by considering quite a number of issues in addition to Cuban trade. When an oversimplification results in falsely implying that a minor causal factor is the major one, then the reasoning also uses the False Cause Fallacy.
Pathetic Fallacy
The Pathetic Fallacy is a mistaken belief due to attributing peculiarly human qualities to inanimate objects (but not to animals). The fallacy is caused by anthropomorphism.
Example:
Aargh, it won’t start again. This old car always breaks down on days when I have a job interview. It must be afraid that if I get a new job, then I’ll be able to afford a replacement, so it doesn’t want me to get to my interview on time.
Persuasive Definition
Some people try to win their arguments by getting you to accept their faulty definition. If you buy into their definition, they’ve practically persuaded you already. Same as the Definist Fallacy. Poisoning the Well when presenting a definition would be an example of using a persuasive definition.
Example:
Let’s define a Democrat as a leftist who desires to overtax the corporations and abolish freedom in the economic sphere.
Perfectionist
If you remark that a proposal or claim should be rejected solely because it doesn’t solve the problem perfectly, in cases where perfection isn’t really required, then you’ve used the Perfectionist Fallacy.
Example:
You said hiring a house cleaner would solve our cleaning problems because we both have full-time jobs. Now, look what happened. Every week, after cleaning the toaster oven, our house cleaner leaves it unplugged. I should never have listened to you about hiring a house cleaner.
Poisoning the Well
Poisoning the well is a preemptive attack on a person in order to discredit their testimony or argument in advance of their giving it. A person who thereby becomes unreceptive to the testimony reasons fallaciously and has become a victim of the poisoner. This is a kind of Ad Hominem, Circumstantial Fallacy.
Example:
[Prosecuting attorney in court] When is the defense attorney planning to call that twice-convicted child molester, David Barnington, to the stand? OK, I’ll rephrase that. When is the defense attorney planning to call David Barnington to the stand?
Post Hoc
Suppose we notice that an event of kind A is followed in time by an event of kind B, and then hastily leap to the conclusion that A caused B. If so, our reasoning contains the Post Hoc Fallacy. Correlations are often good evidence of causal connection, so the fallacy occurs only when the leap to the causal conclusion is done “hastily.” The Latin term for the fallacy is Post Hoc, Ergo Propter Hoc (“After this, therefore because of this”). It is a kind of False Cause Fallacy.
Example:
I have noticed a pattern about all the basketball games I’ve been to this year. Every time I buy a good seat, our team wins. Every time I buy a cheap, bad seat, we lose. My buying a good seat must somehow be causing those wins.
Your background knowledge should tell you that this pattern probably won’t continue in the future; it’s just an accidental correlation that tells you nothing about the cause of your team’s wins.
Substituting a distracting comment for a real proof.
Example:
I don’t need to tell a smart person like you that you should vote Republican.
This comment is trying to avoid a serious disagreement about whether one should vote Republican.
Prosecutor’s Fallacy
This is the mistake of over-emphasizing the strength of a piece of evidence while paying insufficient attention to the context.
Example:
Suppose a prosecutor is trying to gain a conviction and points to the evidence that at the scene of the burglary the police found a strand of the burglar’s hair. A forensic test showed that the burglar’s hair matches the suspect’s own hair. The forensic scientist testified that the chance of a randomly selected person producing such a match is only one in two thousand. The prosecutor concludes that the suspect has only a one in two thousand chance of being innocent. On the basis of only this evidence, the prosecutor asks the jury for a conviction.
That is fallacious reasoning, and if you are on the jury you should not be convinced. Here’s why. The prosecutor paid insufficient attention to the pool of potential suspects. Suppose that pool has six million people who could have committed the crime, all other things being equal. If the forensic lab had tested all those people, they’d find that about one in every two thousand of them would have a hair match, but that is three thousand people. The suspect is just one of the 3000, so the suspect is very probably innocent unless the prosecutor can provide more evidence. The prosecutor over-emphasized the strength of a piece of evidence by focusing on one suspect while paying insufficient attention to the context, which suggests a pool of many more suspects.
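The rebuttal’s arithmetic is an application of Bayes’ theorem, sketched below with the numbers from the example; the six-million-person pool is the assumption doing the real work.

    # Bayes-style arithmetic behind the rebuttal of the prosecutor.
    pool = 6_000_000  # assumed number of people who could have committed the crime
    p_match_if_innocent = 1 / 2000

    innocent_matches = (pool - 1) * p_match_if_innocent
    print(f"innocent people expected to match: {innocent_matches:.0f}")  # ~3000

    # One guilty match among roughly 3000 innocent matches:
    p_guilty_given_match = 1 / (1 + innocent_matches)
    print(f"P(guilty | hair match) ~= {p_guilty_given_match:.4%}")
    # ~0.03%, nothing like the "one in two thousand chance of being innocent"
    # that the prosecutor suggested.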
Quibbling
We quibble when we complain about a minor point and falsely believe that this complaint somehow undermines the main point. To avoid this error, the logical reasoner will not make a mountain out of a mole hill nor take people too literally. Logic Chopping is a kind of quibbling.
Example:
I’ve found typographical errors in your poem, so the poem is neither inspired nor perceptive.
Quoting out of Context
If you quote someone, but select the quotation so that essential context is not available and therefore the person’s views are distorted, then you’ve quoted “out of context.” Quoting out of context in an argument creates a Straw Man Fallacy. The fallacy is also called “contextomy.”
Example:
Smith: I’ve been reading about a peculiar game in this article about vegetarianism. When we play this game, we lean out from a fourth-story window and drop down strings containing “Free food” signs on the end in order to hook unsuspecting passers-by. It’s really outrageous, isn’t it? Yet isn’t that precisely what sports fishermen do for entertainment from their fishing boats? The article says it’s time we put an end to sport fishing.
Jones: Let me quote Smith for you. He says “We…hook unsuspecting passers-by.” What sort of moral monster is this man Smith?
Jones’s selective quotation is fallacious because it makes Smith appear to advocate this immoral activity when the context makes it clear that he doesn’t.
Rationalization
We rationalize when we inauthentically offer reasons to support our claim. We are rationalizing when we give someone a reason to justify our action even though we know this reason is not really our own reason for our action, usually because the offered reason will sound better to the audience than our actual reason.
Example:
“I bought the matzo bread from Kroger’s Supermarket because it is the cheapest brand and I wanted to save money,” says Alex [who knows he bought the bread from Kroger’s Supermarket only because his girlfriend works there].
Red Herring
A red herring is a smelly fish that would distract even a bloodhound. It is also a digression that leads the reasoner off the track of considering only relevant information.
Example:
Will the new tax in Senate Bill 47 unfairly hurt business? I notice that the main provision of the bill is that the tax is higher for large employers (fifty or more employees) as opposed to small employers (six to forty-nine employees). To decide on the fairness of the bill, we must first determine whether employees who work for large employers have better working conditions than employees who work for small employers. I am ready to volunteer for a new committee to study this question. How do you suppose the committee should go about collecting the data we need?
Bringing up the issue of working conditions and the committee is the red herring diverting us from the main issue of whether Senate Bill 47 unfairly hurts business. An intentional false lead in a criminal investigation is another example of a red herring.
Regression
This fallacy occurs when regression to the mean is mistaken for a sign of a causal connection. Also called the Regressive Fallacy. It is a kind of False Cause Fallacy.
Example:
You are investigating the average heights of groups of people living in the United States. You sample some people living in Columbus, Ohio and determine their average height. You have the numerical figure for the mean height of people living in the U.S., and you notice that members of your sample from Columbus have an average height that differs from this mean. Your second sample of the same size is from people living in Dayton, Ohio. When you find that this group’s average height is closer to the U.S. mean height [as it is very likely to be due to common statistical regression to the mean], you falsely conclude that there must be something causing people living in Dayton to be more like the average U.S. resident than people living in Columbus.
There is very probably nothing causing people from Dayton to be more like the average U.S. resident; rather, the sample averages are simply regressing to the mean.
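A brief simulation can make the statistical effect vivid. The sketch below is illustrative only; it assumes, for the sake of argument, that heights in every city are drawn from one and the same national distribution, so any difference between city samples is pure chance.

```python
# Illustrative simulation: when a first sample happens to be unusually far
# from the mean, an independent second sample tends to land closer to it,
# with no cause at work. (All numbers here are made up.)
import random

random.seed(1)
national_mean, sd, n = 170.0, 10.0, 25   # assumed height distribution, cm

def sample_mean():
    return sum(random.gauss(national_mean, sd) for _ in range(n)) / n

# Keep only "first city" samples that happened to come out unusually tall.
extreme_firsts = [m for m in (sample_mean() for _ in range(10_000)) if m > 174]
second_samples = [sample_mean() for _ in extreme_firsts]

print(sum(extreme_firsts) / len(extreme_firsts))   # well above 170
print(sum(second_samples) / len(second_samples))   # close to 170
```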
Reification
Considering a word to refer to an object when the meaning of the word can be accounted for more mundanely without assuming the object exists. Also known as the Fallacy of Misplaced Concreteness and as Hypostatization.
Example:
The 19th century composer Tchaikovsky described the introduction to his Fifth Symphony as “a complete resignation before fate.”
He is treating “fate” as if it names some object, when it would be less misleading, though also less poetic, to say the introduction suggests that listeners will resign themselves to accepting whatever events happen to them. The fallacy also occurs when someone says, “I succumbed to nostalgia.” Without committing the fallacy, one can make the same point by saying, “My mental state caused actions that would best be described as reflecting an unusual desire to return to some past period of my life.” Another common instance of the fallacy is the claim that if you understand what “Sherlock Holmes” means, then Sherlock Holmes exists in your understanding. The larger point in this last example is that nouns can be meaningful without referring to an object, yet those who commit the Fallacy of Reification do not appreciate this point.
Reversing Causation
Drawing an improper conclusion about causation due to a causal assumption that reverses cause and effect. A kind of False Cause Fallacy.
Example:
All the corporate officers of Miami Electronics and Power have big boats. If you’re ever going to become an officer of MEP, you’d better get a bigger boat.
The false assumption here is that having a big boat helps cause you to be an officer in MEP, whereas the reverse is true. Being an officer causes you to have the high income that enables you to purchase a big boat.
Scapegoating
If you unfairly blame an unpopular person or group of people for a problem, then you are scapegoating. This is a kind of Fallacy of Appeal to Emotions.
Example:
Augurs were official diviners of ancient Rome. During the pre-Christian period, when Christians were unpopular, an augur would make a prediction for the emperor about, say, whether a military attack would have a successful outcome. If the prediction failed to come true, the augur would not admit failure but instead would blame nearby Christians for their evil influence on his divining powers. The elimination of these Christians, the augur would claim, could restore his divining powers and help the emperor. By using this reasoning tactic, the augur was scapegoating the Christians.
Scare Tactic
If you suppose that terrorizing your opponent is giving him a reason for believing that you are correct, then you are using a scare tactic and reasoning fallaciously.
Example:
David: My father owns the department store that gives your newspaper fifteen percent of all its advertising revenue, so I’m sure you won’t want to publish any story of my arrest for spray painting the college.
Newspaper editor: Yes, David, I see your point. The story really isn’t newsworthy.
David has given the editor a financial reason not to publish, but he has not given a relevant reason why the story is not newsworthy. David’s tactics are scaring the editor, but it’s the editor who uses the Scare Tactic Fallacy, not David. David has merely used a scare tactic. This fallacy’s name emphasizes the cause of the fallacy rather than the error itself. See also the related Fallacy of Appeal to Emotions.
Scope
The Scope Fallacy is caused by improperly changing or misrepresenting the scope of a phrase.
Example:
Every concerned citizen who believes that someone living in the US is a terrorist should make a report to the authorities. But Shelley told me herself that she believes there are terrorists living in the US, yet she hasn’t made any reports. So, she must not be a concerned citizen.
The first sentence has ambiguous scope. It was probably originally meant in this sense: Every concerned citizen who believes (of someone that this person is living in the US and is a terrorist) should make a report to the authorities. But the speaker is clearly taking the sentence in its other, less plausible sense: Every concerned citizen who believes (that there is someone or other living in the US who is a terrorist) should make a report to the authorities. Scope fallacies usually are Amphibolies.
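The two readings can be made explicit in quantifier notation. The formalization below is an illustration added for clarity rather than part of the original example; read Bel(c, p) as “citizen c believes that p” and Report(c) as “c should make a report.”

```latex
% Illustrative formalization of the two scope readings (added here for
% clarity; the predicate names are hypothetical).
% Intended reading: the quantifier scopes over the belief operator.
\exists x\, \mathrm{Bel}\big(c,\ \mathrm{InUS}(x) \wedge \mathrm{Terrorist}(x)\big) \rightarrow \mathrm{Report}(c)
% Speaker's reading: the quantifier scopes under the belief operator.
\mathrm{Bel}\big(c,\ \exists x\,[\mathrm{InUS}(x) \wedge \mathrm{Terrorist}(x)]\big) \rightarrow \mathrm{Report}(c)
```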
Selective Attention
Improperly focusing attention on certain things and ignoring others.
Example:
Father: Justine, how was your school day today? Another C on the history test like last time?
Justine: Dad, I got an A- on my history test today. Isn’t that great? Only one student got an A.
Father: I see you weren’t the one with the A. And what about the math quiz?
Justine: I think I did OK, better than last time.
Father: If you really did well, you’d be sure. What I’m sure of is that today was a pretty bad day for you.
The pessimist who pays attention to all the bad news and ignores the good news thereby uses the Fallacy of Selective Attention. The remedy for this fallacy is to pay attention to all the relevant evidence. The most common examples of selective attention are the fallacy of Suppressed Evidence and the fallacy of Confirmation Bias. See also the Sharpshooter’s Fallacy.
Self-Fulfilling Prophecy
The fallacy occurs when the act of prophesying will itself produce the effect that is prophesied, but the reasoner doesn’t recognize this and believes the prophecy is a significant insight.
Example:
A group of students are selected to be interviewed individually by the teacher. Each selected student is told that the teacher has predicted they will do significantly better in their future school work. Actually, though, the teacher has no special information about the students and has picked the group at random. If the students believe this prediction about themselves, then, given human psychology, it is likely that they will do better merely because of the teacher’s making the prediction.
The prediction will fulfill itself, so to speak, and the students’ reasoning contains the fallacy.
This fallacy can be dangerous in an atmosphere of potential war between nations when the leader of a nation predicts that their nation will go to war against their enemy. This prediction could very well precipitate an enemy attack because the enemy calculates that if war is inevitable then it is to their military advantage not to get caught by surprise.
Self-Selection
A Biased Generalization in which the bias is due to self-selection for membership in the sample used to make the generalization.
Example:
The radio announcer at a student radio station in New York asks listeners to call in and say whether they favor Jones or Smith for president. 80% of the callers favor Jones, so the announcer declares that Americans prefer Jones to Smith.
The problem here is that the callers selected themselves for membership in the sample, but clearly the sample is unlikely to be representative of Americans.
Sharpshooter’s
The Sharpshooter’s Fallacy gets its name from someone shooting a rifle at the side of a barn and then going over and drawing a target and bull’s-eye concentrically around the bullet hole. The fallacy is caused by overemphasizing random results or making selective use of coincidence. See the Fallacy of Selective Attention.
Example:
Psychic Sarah makes twenty-six predictions about what will happen next year. When one, but only one, of the predictions comes true, she says, “Aha! I can see into the future.”
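The arithmetic behind the psychic’s trick is easy to check. The sketch below is illustrative; the ten percent figure is a made-up assumption about the chance that any single vague prediction comes true by luck.

```python
# With many predictions, at least one lucky "hit" becomes very likely even
# if each prediction individually is improbable. (p is a made-up figure.)
p = 0.10                          # assumed chance a single prediction comes true
n = 26                            # number of predictions made
p_at_least_one = 1 - (1 - p) ** n
print(round(p_at_least_one, 3))   # about 0.935: a hit is nearly guaranteed
```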
Slippery Slope
Suppose someone claims that a first step (in a chain of causes and effects, or a chain of reasoning) will probably lead to a second step that in turn will probably lead to another step and so on until a final step ends in trouble. If the likelihood of the trouble occurring is exaggerated, the Slippery Slope Fallacy is present.
Example:
Mom: Those look like bags under your eyes. Are you getting enough sleep?
Jeff: I had a test and stayed up late studying.
Mom: You didn’t take any drugs, did you?
Jeff: Just caffeine in my coffee, like I always do.
Mom: Jeff! You know what happens when people take drugs! Pretty soon the caffeine won’t be strong enough. Then you will take something stronger, maybe someone’s diet pill. Then, something even stronger. Eventually, you will be doing cocaine. Then you will be a crack addict! So, don’t drink that coffee.
The form of a Slippery Slope Fallacy looks like this:
A often leads to B.
B often leads to C.
C often leads to D.
…
Z leads to HELL.
We don’t want to go to HELL.
So, don’t take that first step A.
The key claim in the fallacy is that taking the first step will lead to the final, unacceptable step. Arguments of this form may or may not be fallacious depending on the probabilities involved in each step. The analyst asks how likely it is that taking the first step will lead to the final step. For example, if A leads to B with a probability of 80 percent, and B leads to C with a probability of 80 percent, and C leads to D with a probability of 80 percent, is it likely that A will eventually lead to D? No, not at all; there is about a 50% chance. The proper analysis of a slippery slope argument depends on sensitivity to such probabilistic calculations. Regarding terminology, if the chain of reasoning A, B, C, D, …, Z is about causes, then the fallacy is called the Domino Fallacy.
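The probabilistic point in the previous paragraph is easy to verify. The sketch below simply multiplies the stated step probabilities, treating the steps as independent, as the example implicitly does.

```python
# If each step succeeds with probability 0.8 and the steps are independent,
# the chance that A leads all the way to D is the product of the steps:
p_step = 0.80
p_a_to_d = p_step ** 3   # A -> B -> C -> D
print(p_a_to_d)          # 0.512, i.e. only about a 50% chance
```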
Small Sample
This is the fallacy of using too small a sample. If the sample is too small to be representative of the population, and if we have the background information to know that there is this problem with sample size, yet we still accept the generalization based upon the sample results, then we commit the fallacy. It is the Fallacy of Hasty Generalization with an emphasis on statistical sampling techniques.
Example:
I’ve eaten in restaurants twice in my life, and both times I’ve gotten sick. I’ve learned one thing from these experiences: restaurants make me sick.
How big a sample do you need to avoid the fallacy? Relying on background knowledge about a population’s lack of diversity can reduce the sample size needed for the generalization. With a completely homogeneous population, a sample of one is large enough to be representative of the population; if we’ve seen one electron, we’ve seen them all. However, eating in one restaurant is not like eating in any restaurant, so far as getting sick is concerned. We cannot place a specific number on the sample size below which the fallacy is produced unless we know about the homogeneity of the population, the margin of error, and the confidence level.
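For a concrete sense of how margin of error and confidence level constrain sample size, the following sketch applies the standard survey-sampling formula for estimating a proportion. The formula is textbook statistics rather than anything stated in this article, and it assumes simple random sampling from a large population.

```python
# Minimum sample size for estimating a proportion at a given margin of
# error and confidence level (standard formula; assumes random sampling).
import math

def required_sample_size(margin_of_error, z=1.96, p=0.5):
    """z=1.96 corresponds to 95% confidence; p=0.5 is the worst case."""
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

print(required_sample_size(0.05))   # 385 people for a 5% margin of error
print(required_sample_size(0.03))   # 1068 people for a 3% margin of error
```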
Smear Tactic
A smear tactic is an unfair characterization either of the opponent or the opponent’s position or argument. Smearing the opponent causes an Ad Hominem Fallacy. Smearing the opponent’s argument causes a Straw Man Fallacy.
Smokescreen
This fallacy occurs when one offers too many details in order either to obscure the point or to cover up counter-evidence. In the latter case it would be an example of the Fallacy of Suppressed Evidence. If you produce a smokescreen by bringing up an irrelevant issue, then you produce a Red Herring Fallacy. Sometimes called Clouding the Issue.
Example:
Senator, wait before you vote on Senate Bill 88. Do you realize that Delaware passed a bill on the same subject in 1932, but it was ruled unconstitutional for these twenty reasons. Let me list them here…. Also, before you vote on SB 88 you need to know that …. And so on.
There is no recipe to follow in distinguishing smokescreens from reasonable appeals to caution and care.
Special Pleading
Special pleading is a form of inconsistency in which the reasoner doesn’t apply his or her principles consistently. It is the fallacy of applying a general principle to various situations but not applying it to a special situation that interests the arguer even though the general principle properly applies to that special situation, too.
Example:
Everyone has a duty to help the police do their job, no matter who the suspect is. That is why we must support investigations into corruption in the police department. No person is above the law. Of course, if the police come knocking on my door to ask about my neighbors and the robberies in our building, I know nothing. I’m not about to rat on anybody.
In our example, the principle of helping the police is applied to investigations of police officers but not to one’s neighbors.
Specificity
Drawing an overly specific conclusion from the evidence. A kind of jumping to conclusions.
Example:
The trigonometry calculation came out to 5,005.6833 feet, so that’s how wide the cloud is up there.
Stereotyping
Using stereotypes as if they are accurate generalizations for the whole group is an error in reasoning. Stereotypes are general beliefs we use to categorize people, objects, and events; but these beliefs are overstatements that shouldn’t be taken literally. For example, consider the stereotype “She’s Mexican, so she’s going to be late.” This conveys a mistaken impression of all Mexicans. On the other hand, even though most Mexicans are punctual, a German is more apt to be punctual than a Mexican, and this fact is said to be the “kernel of truth” in the stereotype. The danger in our using stereotypes is that speakers or listeners will not realize that even the best stereotypes are accurate only when taken probabilistically. As a consequence, the use of stereotypes can breed racism, sexism, and other forms of bigotry.
Example:
German people aren’t good at dancing our sambas. She’s German. So, she’s not going to be any good at dancing our sambas.
This argument is deductively valid, but it’s unsound because it rests on a false, stereotypical premise. The grain of truth in the stereotype is that the average German doesn’t dance sambas as well as the average South American, but to overgeneralize and presume that ALL Germans are poor samba dancers compared to South Americans is a mistake called “stereotyping.”
Straw Man
Your reasoning contains the Straw Man Fallacy whenever you attribute an easily refuted position to your opponent, one that the opponent would not endorse, and then proceed to attack the easily refuted position (the straw man) believing you have thereby undermined the real man, the opponent’s actual position. If the unfair and inaccurate representation is on purpose, then the Straw Man Fallacy is caused by lying.
Example (a debate before the city council):
Opponent: Because of the killing and suffering of Indians that followed Columbus’s discovery of America, the City of Berkeley should declare that Columbus Day will no longer be observed in our city.
Speaker: This is ridiculous, fellow members of the city council. It’s not true that everybody who ever came to America from another country somehow oppressed the Indians. I say we should continue to observe Columbus Day, and vote down this resolution that will make the City of Berkeley the laughing stock of the nation.
The Opponent is likely to respond with “Wait! That’s not what I said.” The Speaker has twisted what his Opponent said. The Opponent never said nor even indirectly suggested that everybody who ever came to America from another country somehow oppressed the Indians.
Style Over Substance
Unfortunately the style with which an argument is presented is sometimes taken as adding to the substance or strength of the argument.
Example:
You’ve just been told by the salesperson that the new Maytag is an excellent washing machine because it has a double washing cycle. If you notice that the salesperson smiled at you and was well dressed, this does not add to the quality of the salesperson’s argument, but unfortunately it does for those who are influenced by style over substance, as most of us are.
Subjectivist
The Subjectivist Fallacy occurs when it is mistakenly supposed that a good reason to reject a claim is that truth on the matter is relative to the person or group.
Example:
Justine has just given Jake her reasons for believing that the Devil is an imaginary evil person. Jake, not wanting to accept her conclusion, responds with, “That’s perhaps true for you, but it’s not true for me.”
Superstitious Thinking
Reasoning deserves to be called superstitious if it is based on reasons that are well known to be unacceptable, usually due to unreasonable fear of the unknown, trust in magic, or an obviously false idea of what can cause what. A belief produced by superstitious reasoning is called a superstition. The fallacy is an instance of the False Cause Fallacy.
Example:
I never walk under ladders; it’s bad luck.
It may be a good idea not to walk under ladders, but a proper reason to believe this is that workers on ladders occasionally drop things, and that ladders might have dripping wet paint that could damage your clothes. An improper reason for not walking under ladders is that it is bad luck to do so.
Suppressed Evidence
Intentionally failing to use information suspected of being relevant and significant is committing the fallacy of suppressed evidence. This fallacy usually occurs when the information counts against one’s own conclusion. Perhaps the arguer is not mentioning that experts have recently objected to one of his premises. The fallacy is a kind of Fallacy of Selective Attention.
Example:
Buying the Cray Mac 11 computer for our company was the right thing to do. It meets our company’s needs; it runs the programs we want it to run; it will be delivered quickly; and it costs much less than what we had budgeted.
This appears to be a good argument, but you’d change your assessment of the argument if you learned the speaker has intentionally suppressed the relevant evidence that the company’s Cray Mac 11 was purchased from his brother-in-law at a 30 percent higher price than it could have been purchased elsewhere, and if you learned that a recent unbiased analysis of ten comparable computers placed the Cray Mac 11 near the bottom of the list.
If the relevant information is not intentionally suppressed but rather inadvertently overlooked, the fallacy of suppressed evidence also is said to occur, although the fallacy’s name is misleading in this case. The fallacy is also called the Fallacy of Incomplete Evidence and Cherry-Picking the Evidence. See also Slanting.
Tokenism
If you interpret a merely token gesture as an adequate substitute for the real thing, you’ve been taken in by tokenism.
Example:
How can you call our organization racist? After all, our receptionist is African American.
If you accept this line of reasoning, you have been taken in by tokenism.
Traditional Wisdom
If you say or imply that a practice must be OK today simply because it has been the apparently wise practice in the past, then your reasoning contains the fallacy of traditional wisdom. Procedures that are being practiced and that have a tradition of being practiced might or might not admit of a good justification, but merely saying that they have been practiced in the past is not always good enough, in which case the fallacy is present. Also called Argumentum Consensus Gentium when the traditional wisdom is that of nations.
Example:
Of course we should buy IBM’s computer whenever we need new computers. We have been buying IBM as far back as anyone can remember.
The “of course” is the problem. The traditional wisdom of IBM being the right buy is some reason to buy IBM next time, but it’s not a good enough reason in a climate of changing products, so the “of course” indicates that the Fallacy of Traditional Wisdom has occurred. The fallacy is essentially the same as the fallacies of Appeal to the Common Practice, Gallery, Masses, Mob, Past Practice, People, Peers, and Popularity.
Tu Quoque
The Fallacy of Tu Quoque occurs in our reasoning if we conclude that someone’s argument not to perform some act must be faulty because the arguer himself or herself has performed it. Similarly, when we point out that the arguer doesn’t practice what he or she preaches, and then suppose that there must be an error in the preaching for only this reason, then we are reasoning fallaciously and creating a Tu Quoque. This is a kind of Ad Hominem Circumstantial Fallacy.
Example:
Look who’s talking. You say I shouldn’t become an alcoholic because it will hurt me and my family, yet you yourself are an alcoholic, so your argument can’t be worth listening to.
Discovering that a speaker is a hypocrite is a reason to be suspicious of the speaker’s reasoning, but it is not a sufficient reason to discount it.
Two Wrongs do not Make a Right
When you defend your wrong action as being right because someone previously has acted wrongly, you are using the fallacy called “Two Wrongs do not Make a Right.” This is a special kind of Ad Hominem Fallacy.
Example:
Oops, no paper this morning. Somebody in our apartment building probably stole my newspaper. So, that makes it OK for me to steal one from my neighbor’s doormat while nobody else is out here in the hallway.
Undistributed Middle
In syllogistic logic, failing to distribute the middle term in at least one of the premises is the fallacy of undistributed middle. Also called the Fallacy of Maldistributed Middle.
Example:
All collies are animals.
All dogs are animals.
Therefore, all collies are dogs.
The middle term (“animals”) is in the predicate of both universal affirmative premises and therefore is undistributed. This formal fallacy has the logical form: All C are A. All D are A. Therefore, all C are D.
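The invalidity of this form can be exhibited with a counterexample in which both premises are true and the conclusion false. The sketch below uses small made-up sets for C, D, and A; the particular members are hypothetical and chosen only to falsify the conclusion.

```python
# Counterexample to the form "All C are A; all D are A; therefore all C
# are D": pick C and D as disjoint subsets of A.
C = {"c1"}
D = {"d1"}
A = C | D | {"a1"}

print(C <= A)   # True:  all C are A
print(D <= A)   # True:  all D are A
print(C <= D)   # False: yet not all C are D, so the form is invalid
```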
Unfalsifiability
This error in explanation occurs when the explanation contains a claim that is not falsifiable, because there is no way to check on the claim. That is, there would be no way to show the claim to be false if it were false.
Example:
He lied because he’s possessed by demons.
This could be the correct explanation of his lying, but there’s no way to check on whether it’s correct. You can check whether he’s twitching and moaning, but this won’t be evidence about whether a supernatural force is controlling his body. The claim that he’s possessed can’t be verified if it’s true, and it can’t be falsified if it’s false. So, the claim is too odd to be relied upon for an explanation of his lying. Relying on the claim is an instance of fallacious reasoning.
Unrepresentative Generalization
If the plants on my plate are not representative of all plants, then the following generalization should not be trusted.
Example:
Each plant on my plate is edible.
So, all plants are edible.
The set of plants on my plate is called “the sample” in the technical vocabulary of statistics, and the set of all plants is called “the target population.” If you are going to generalize on a sample, then you want your sample to be representative of the target population, that is, to be like it in the relevant respects. This fallacy is the same as the Fallacy of Unrepresentative Sample.
Unrepresentative Sample
If the means of collecting the sample from the population are likely to produce a sample that is unrepresentative of the population, then a generalization upon the sample data is an inference using the fallacy of unrepresentative sample. A kind of Hasty Generalization. When some of the statistical evidence is expected to be relevant to the results but is hidden or overlooked, the fallacy is called Suppressed Evidence. There are many ways to bias a sample. Knowingly selecting atypical members of the population produces a biased sample.
Example:
The two men in the matching green suits that I met at the Star Trek Convention in Las Vegas had a terrible fear of cats. I remember their saying they were from France. I’ve never met anyone else from France, so I suppose everyone there has a terrible fear of cats.
Most people’s background information is sufficient to tell them that people at this sort of convention are unlikely to be representative, that is, are likely to be atypical members of the rest of society. Having a small sample does not by itself cause the sample to be biased. Small samples are OK if there is a corresponding large margin of error or low confidence level.
Large samples can be unrepresentative, too.
Example:
We’ve polled over 400,000 Southern Baptists and asked them whether the best religion in the world is Southern Baptist. We have over 99% agreement, which proves our point about which religion is best.
Getting a larger sample size does not overcome sampling bias.
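A toy simulation brings out why size cannot repair bias. In the sketch below, both the population’s true support figure and the callers’ skew are made-up numbers for illustration: however large the self-selected sample grows, its estimate converges on the callers’ rate, not the population’s.

```python
# A biased sampling procedure converges on the wrong answer no matter how
# large the sample gets. (All numbers below are hypothetical.)
import random

random.seed(0)
true_support = 0.40   # assumed population support for Jones

def biased_poll(n, caller_support=0.80):
    # Self-selected callers favor Jones at 80%, whatever the population thinks.
    return sum(random.random() < caller_support for _ in range(n)) / n

for n in (100, 10_000, 100_000):
    print(n, round(biased_poll(n), 3))   # hovers near 0.80, never near 0.40
```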
Vested Interest
The Vested Interest Fallacy occurs when a person argues that someone’s claim is incorrect or their recommended action is not worthy of being followed because the person is motivated by their interest in gaining something by it, with the implication that were it not for this vested interest the person wouldn’t make the claim or recommend the action. Because this reasoning attacks the reasoner rather than the reasoning itself, it is a kind of Ad Hominem fallacy.
Example:
According to Samantha we all should vote for Anderson for Congress. Yet she’s a lobbyist in the pay of Anderson and will get a nice job in the capitol if he’s elected, so that convinces me that she is giving bad advice.
This is fallacious reasoning by the speaker because whether Samantha is giving good advice about Anderson ought to depend on Anderson’s qualifications, not on whether Samantha will or won’t get a nice job if he’s elected.
Willed Ignorance
I’ve got my mind made up, so don’t confuse me with the facts. This is usually a case of the Traditional Wisdom Fallacy.
Example:
Of course she’s made a mistake. We’ve always had meat and potatoes for dinner, and our ancestors have always had meat and potatoes for dinner, and so nobody knows what they’re talking about when they start saying meat and potatoes are bad for us.
Wishful Thinking
A reasoner who suggests that a claim is true, or false, merely because he or she strongly hopes it is, is using the fallacy of wishful thinking. Wishing something is true is not a relevant reason for claiming that it is actually true.
Example:
There’s got to be an error here in the history book. It says Thomas Jefferson had slaves. I don’t believe it. He was our best president, and a good president would never do such a thing. That would be awful.
You-Too
This is an informal name for the Tu Quoque fallacy.
7. References and Further Reading
Eemeren, Frans H. van, R. F. Grootendorst, F. S. Henkemans, J. A. Blair, R. H. Johnson, E. C. W. Krabbe, C. W. Plantin, D. N. Walton, C. A. Willard, J. A. Woods, and D. F. Zarefsky, 1996. Fundamentals of Argumentation Theory: A Handbook of Historical Backgrounds and Contemporary Developments. Mahwah, New Jersey, Lawrence Erlbaum Associates, Publishers.
Fearnside, W. Ward and William B. Holther, 1959. Fallacy: The Counterfeit of Argument. Englewood Cliffs, New Jersey, Prentice-Hall.
Fischer, David Hackett, 1970. Historians’ Fallacies: Toward a Logic of Historical Thought. New York, Harper & Row.
This book contains fallacies in addition to those in this article, but they are much less common, and many have obscure names.
Groarke, Leo and C. Tindale, 2003. Good Reasoning Matters! 3rd edition, Toronto, Oxford University Press.
Hamblin, Charles L., 1970. Fallacies. London, Methuen.
Hansen, Hans V. and R. C. Pinto, 1995. Fallacies: Classical and Contemporary Readings. University Park, Pennsylvania State University Press.
Huff, Darrell, 1954. How to Lie with Statistics. New York, W. W. Norton.
Levi, D. S., 1994. “Begging What is at Issue in the Argument,” Argumentation, 8, 265-282.
Schwartz, Thomas, 1981. “Logic as a Liberal Art,” Teaching Philosophy 4, 231-247.
Walton, Douglas N., 1989. Informal Logic: A Handbook for Critical Argumentation. Cambridge, Cambridge University Press.
Walton, Douglas N., 1995. A Pragmatic Theory of Fallacy. Tuscaloosa, University of Alabama Press.
Walton, Douglas N., 1997. Appeal to Expert Opinion: Arguments from Authority. University Park, Pennsylvania State University Press.
Whately, Richard, 1836. Elements of Logic. New York, Jackson.
Woods, John and D. N. Walton, 1989. Fallacies: Selected Papers 1972-1982. Dordrecht, Holland, Foris.
Research on the fallacies of informal logic is regularly published in the following journals: Argumentation, Argumentation and Advocacy, Informal Logic, Philosophy and Rhetoric, and Teaching Philosophy.
Author Information
Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.
The Ethics and Epistemology of Trust
Trust is a topic of long-standing philosophical interest because it is indispensable to the success of almost every kind of coordinated human activity, from politics and business to sport and scientific research. Even more, trust is necessary for the successful dissemination of knowledge, and, by extension, for nearly any form of practical deliberation and planning that requires us to make use of more information than we are able to gather individually and verify ourselves. In short, without trust, we could achieve few of our goals and would know very little. Despite trust’s fundamental importance in human life, there is substantial philosophical disagreement about what trust is, and further, how trusting is normatively constrained and best theorized about in relation to other things we value. Consequently, contemporary philosophical literature on trust features a range of different theoretical options for making sense of trust, and these options differ in how they (among other things) take trust to relate to such things as reliance, optimism, belief, obligations, monitoring, expectations, competence, trustworthiness, assurance, and doubt. With the aim of exploring these myriad issues in an organized way, this article is divided into three sections, each of which offers an overview of key (and sometimes interconnected) ethical and epistemological themes in the philosophy of trust: (1) The Nature of Trust; (2) The Normativity of Trust; and (3) The Value of Trust.
1. The Nature of Trust
What is trust? To a first approximation, trust is an attitude or a hybrid of attitudes (for instance, optimism, hope, belief, and so forth) toward a trustee, one that involves some (non-negligible) vulnerability to being betrayed on the truster’s side. This general remark, of course, does not take us very far. For example, we may ask: what kind of attitude (or hybrid of attitudes) is trust exactly? Suppose that (as some philosophers of trust maintain) trust requires an attitude of optimism. Even if that is right, getting a grip on trust requires a further conception of what the truster, qua truster, must be optimistic about. One standard answer here proceeds as follows: trust (at least, in the paradigmatic case of interpersonal trust) involves some form of optimism that the trustee will take care of things as we have entrusted them. In the special case of trusting the testimony of another—a topic at the centre of the epistemology of trust—this will involve at least some form of optimism that the speaker is living up to her expectations as a testifier; for instance, that the speaker knows what she says or, more weakly, is telling the truth.
Even at this level of specificity, though, the nature of trust remains fairly elusive. Does trusting involve (for example) merely optimism that the trustee will take care of things as entrusted, or does it also involve optimism that the trustee will do so compresently (that is, concurrently) with certain beliefs, non-doxastic attitudes, emotions or motivations on the part of the trustee, such as with goodwill (Baier 1986; Jones 1996)? Moreover, and apart from such positive characterizations of trust, does trust also have a negative condition to the effect that one fails to genuinely trust another if one—past some threshold of vigilance—monitors the trustee (or otherwise reflects critically on the trust relationship so as to attempt to minimize risk)?
These are among the questions that occupy philosophers working on the nature of trust. This section explores four subthemes aimed at clarifying trust’s nature: these concern (a) the distinction between trust and reliance; (b) two-place vs three-place trust; (c) doxastic versus non-doxastic conditions on trust; (d) deception detection and monitoring.
a. Reliance vs. Interpersonal Trust
Reliance is ubiquitous. You rely on the weather not to suddenly turn 20 degrees colder, leaving you shivering; you rely on your chair not to give out, causing you to tumble to the floor. In these cases, are you trusting the weather and trusting your chair, respectively? Many philosophers working on trust believe the correct answer here is “no”. This is so even though, in each case, you are depending on these things in a way that leaves you potentially vulnerable.
The idea that trust is a kind of dependence that does not reduce to mere reliance (of the sort that might be apposite to things like chairs and the weather) is widely accepted. According to Annette Baier (1986: 244), the crux of the difference is that trust involves relying on another not just to take care of things any old way (for instance, out of fear, begrudgingly, accidentally, and so forth) but to do so out of goodwill toward the truster; relatedly, a salient kind of vulnerability one subjects oneself to in trusting is vulnerability to the limits of that goodwill. On this way of thinking, then, you are not trusting someone if you (for instance) rely on that person to act in a characteristically self-centred way, even if you depend on them to do so, and even if you fully expect them to do so.
Katherine Hawley (2014, 2019) rejects the idea that what distinguishes trust from mere reliance has anything to do with the trustee’s motives or goodwill. Instead, on her account, the crucial difference is that in cases of trust, but not of mere reliance, a commitment on the part of the trustee must be in place. Consider a situation in which you reliably bring too much lunch to work, because you are a bad judge of quantities, and I get to eat your leftovers. My attitude to you in this situation is one of reliance, but not trust; in Hawley’s view, that is because you have made no commitment to provide me with lunch:
However, if we adapt the case so as to suggest commitment, it starts to look more like a matter of trust. Suppose we enjoy eating together regularly, you describe your plans for the next day, I say how much I’m looking forward to it, and so on. To the extent that this involves a commitment on your part, it seems reasonable for me to feel betrayed and expect apologies if one day you fail to bring lunch and I go hungry (Hawley 2014: 10).
If it is right that trust differs in important ways from mere reliance, then a consequence is that while reliance is something we can have toward people (when we merely depend on them) as well as toward objects (for instance, when we depend on the weather and chairs), not just anything can be genuinely trusted. Karen Jones (1996) captures this point, one that circumscribes people as the fitting objects of genuine trust, as follows:
One can only trust things that have wills, since only things with wills can have goodwills—although having a will is to be given a generous interpretation so as to include, for example, firms and government bodies. Machinery can be relied on, but only agents, natural or artificial, can be trusted (1996: 14).
If, as the foregoing suggests, trust relationships are best understood as a special subset of reliance relationships, should we also expect the appropriate attitudes toward misplaced trust to be a subset of a more general attitude-type we might have in response to misplaced reliance?
Katherine Hawley (2014) thinks so. As she puts it, misplaced trust warrants a feeling of betrayal, but the same is not so for misplaced (mere) reliance. Suppose, to draw from an example she offers (2014: 2), that a shelf you rely on to support a vase gives out; it would be inappropriate, Hawley maintains, to feel betrayed, even if a more general attitude of (mere) disappointment befits such misplaced reliance. A feeling of betrayal, by contrast, befits misplaced trust.
In contrast with the above thinking, according to which disanalogies between trust and mere reliance are taken to support distinguishing trust from reliance, some philosophers have taken a more permissive approach to trust, by distinguishing between two senses of trust that differ with respect to the similarity of each to mere reliance.
Paul Faulkner (2011: 246; compare McMyler 2011), for example, distinguishes between what he calls predictive and affective trust. Predictive trust involves mere reliance in conjunction with a belief that the trustee will take care of things (namely, a prediction). Misplaced predictions warrant disappointment, not betrayal, and so predictive trust (like mere reliance) cannot be betrayed. Affective trust, by contrast, is a thick, interpersonal normative notion, and, according to Faulkner, it involves, along with reliance, a kind of normative expectation to the effect that the trustee (i) ought to prove dependable; and (ii) will prove dependable for that reason. On this view, it is affective trust that is uniquely subject to betrayal, even though predictive trust, which is a genuine variety of trust, is not.
b. Two-place vs. Three-place Trust
The distinction between two-place and three-place trust, first drawn by Horsburgh (1960), is meant to capture a simple idea: sometimes when we trust someone, we trust them to do some particular thing (see also Holton 1994; Hardin 1992). For example, you might trust your neighbour to water your plant while you are away on holiday but not to look after your daughter. This is three-place trust, with an infinitival component (schematically: A trusts B to X). Not all trusting fits this schema. You might also simply trust your neighbour generally (schematically: A trusts B) and in a way that does not involve any particular task in mind. Three-place and two-place trust are thus different in the sense that the object of trust is specified in the former case but not in the latter.
While there is nothing philosophically contentious about drawing this distinction, the relationship between two- and three-place trust becomes contested when one of these kinds of trust is taken to be, in some sense, more fundamental than the other. To be clear, it is uncontentious that philosophers, as Faulkner (2015: 242) notes, tend to “focus” on three-place trust. What is contentious is whether any—and if so, which—of these notions is theoretically more basic.
The overwhelming view in the literature maintains that three-place trust is the fundamental notion and that two-place (as well as one-place) trust is derivative upon three-place trust (Baier 1986; Holton 1994; Jones 1996; Faulkner 2007; Hieronymi 2008; Hawley 2014; compare Faulkner 2015). This view can be called three-place fundamentalism.
According to Baier, for instance, trust is centrally concerned with “one person trusting another with some valued thing” (1986: 236), and for Hawley, trust is “primarily a three-place relation, involving two people and a task” (2014: 2). We might think of two-place trust (X trusts Y) as derived from three-place trust (X trusts Y to phi) in a way that is broadly analogous to how one might extract a diachronic view of someone on the basis of discrete interactions, as opposed to starting with any such diachronic view. On this way of thinking, three-place trust leads to two-place trust over time, and two-place trust is established on the basis of it.
Resistance to three-place fundamentalism has been advanced by Faulkner (2015) and Domenicucci and Holton (2017). Faulkner takes as a starting point that it is a desideratum on any plausible account of trust that it accommodate infant trust, and thus, “that it not make essential to trusting the use of concepts or abilities which a child cannot be reasonably believed to possess” (Baier 1986: 244). As Faulkner (2015: 5) maintains, however, an infant, in trusting its mother, “need not have any further thought; the trust is no more than a confidence or faith – a trust, as we say – in his mother”. And so, Faulkner reasons, if we take Baier’s constraint seriously, then we “have to take two-place trust as basic rather than three-place trust.”
A second strand of arguments against three-place fundamentalism is owed to Domenicucci and Holton (2017). According to them, the kind of derivation of two-place trust from three-place trust that is put forward by three-place fundamentalists is implausible for other similar kinds of attitudes like love and friendship:
No one—or at least, hardly anyone—thinks that we should understand what it is for Antony to love Cleopatra in terms of the three place relation ‘Antony loves Cleopatra for her __’, or in terms of any other three-place relation. Likewise hardly anyone thinks that we should understand the two place relation of friendship in terms of some underlying three-place relation […]. To this extent at least, we suggest that trust might be like love and friendship (2017: 149-50).
In response to this kind of argument by association, a proponent of three-place fundamentalism might either deny that these three- to two-place derivations are really problematic in the case of love or friendship, or instead grant that they are and maintain that trust is disanalogous.
In order to get a better sense of whether two-place trust might be unproblematically derived from three-place trust, regardless of whether the same holds mutatis mutandis for love and friendship, it will be helpful to look at a concrete attempt to do so. For example, according to Hawley (2014), three-place trust should be analyzed as: X relies on Y to phi because X believes Y has a commitment to phi. Two-place trust is then defined simply as “reliance on someone to fulfil whatever commitments she may have” (2014: 16). If something like Hawley’s reduction is unproblematic, then, as one line of response might go, this trumps whatever concerns one might have about the prospects of making analogous moves in the love and friendship cases.
c. Trust and Belief: Doxastic, Non-doxastic and Performance-theoretic Accounts
Where does belief fit into an account of trust? In particular, what beliefs (if any) must a truster have about whether the trustee will prove trustworthy? Proponents of doxastic accounts (Adler 1994; Hieronymi 2008; Keren 2014; McMyler 2011) hold that trust involves a belief on the part of the truster. On the simpler, straightforward incarnation of this view, when A trusts B to do X, A believes that B will do X. Other theorists propose more sophisticated belief-based accounts: on Hawley’s (2019) account, for instance, to trust someone to do something is to believe that she has a commitment to doing it, and to rely upon her to meet that commitment. Conversely, to distrust someone to do something is to believe that she has a commitment to doing it, and yet not rely upon her to meet that commitment.
Non-doxastic accounts (Jones 1996; McLeod 2002; Faulkner 2007, 2011; Baker 1987) have a negative and a positive thesis. The negative thesis is just the denial of the belief requirement on trust that proponents of doxastic accounts accept (namely, a denial that trusting someone to do something entails the corresponding belief that they will do that thing). This negative thesis, to note, is perfectly compatible with the idea that trust oftentimes involves such a belief. What is maintained is that it is not essential. The positive thesis embraced by non-doxastic accounts involves a characterization of some further non-doxastic attitude the truster, qua truster, must have with respect to the trustee’s proving trustworthy.
An example of such a further (non-doxastic) attitude, on non-doxastic accounts, is optimism. For example, on Jones’ (1996) view, you trust your neighbour to bring back the garden tools you loaned her only if you are optimistic that she will bring them back, and regardless of whether you believe she will. It should be pointed out that oftentimes, optimism will lead to the acquisition of a corresponding belief. Importantly for Jones, the kind of optimism that characterizes trust is not itself to be explained in terms of belief but rather in terms of affective attitudes entirely. Such a commitment is more generally shared by non-doxastic views which take trust to involve affective attitudes that might be apt to prompt corresponding beliefs.
Quite a few important debates about trust turn on the matter of whether a doxastic account or a non-doxastic account is correct. For example, discussions of the rationality of trust will look one way if trust essentially involves belief and another way if it does not (Jones 1996; Keren 2014). Relatedly, what one says about trust and belief will bear importantly on how one thinks about the relationship between trust and monitoring, as well as the distinction between paradigmatic trust and therapeutic trust (the kind of trust one engages in in order to build trustworthiness; see Horsburgh 1960; Hieronymi 2008; Frost-Arnold 2014).
A notable advantage of the doxastic account is that it simplifies the epistemology of trust—and in particular, how trust can provide reasons for belief. Suppose, for instance, that the doxastic account is correct, and so your trusting your colleague’s word that they will return your laptop involves believing that they will return your laptop. This belief, some think, conjoined with the fact that your colleague tells you they will return your laptop, gives you a reason to believe that they will return your laptop. As Faulkner (2017: 113) puts it, on the doxastic account, “[t]rust gives a reason for belief because belief can provide reason for belief”. Non-doxastic accounts, by contrast, require further explanation as to why trusting someone would ever give you a reason to believe what they say.
Another advantage of doxastic accounts is that they are well-positioned to distinguish trusting someone to do something and mere optimistic wishing. Suppose, for instance, you loan £100 to a loved one with a terrible track record for repaying debts. Such a person may have lost your trust years ago, and yet you may remain optimistic and wishful that they will be trustworthy on this occasion. What distinguishes this attitude from genuine trust on the doxastic account is simply that you lack any belief that your loved one will prove trustworthy. Explaining this difference is more difficult on non-doxastic accounts. This is especially the case on non-doxastic accounts according to which trust not only does not involve belief but positively precludes it, by essentially involving a kind of “leap of faith” (Möllering 2006) that differs in important ways from belief.
Nonetheless, non-doxastic accounts have been emboldened in light of various serious objections that have been raised to doxastic accounts. One often-raised objection of this kind highlights a key disanalogy in how trust and belief respectively interact with evidence (Faulkner 2007):
[Trust] need not be based on evidence and can demonstrate a wilful insensitivity to the evidence. Indeed there is a tension between acting on trust and acting on evidence that is illustrated in the idea that one does not actually trust someone to do something if one only believes they will do it when one has evidence that they will (2007: 876).
As Baker (1987) unpacks this idea, trusting can require ignoring counterevidence—as one might do when one trusts a friend despite evidence of guilt—whereas believing does not.
A particular type of example that puts pressure on doxastic accounts’ ability to accommodate disanalogies with belief concerns therapeutic trust. In cases of therapeutic trust, the purpose of trusting is to promote trustworthiness, and the trust is thereby not predicated on a prior belief in trustworthiness. Take a case in which one trusts a teenager with an important task in the hope that trusting them will lead them to become more trustworthy in the future. In this kind of case, we are plausibly trusting, but not on the basis of prior evidence or belief we have that the trustee will succeed on this occasion. To the contrary: we trust with the aim of establishing trustworthiness (Frost-Arnold 2014; Faulkner 2011). To the extent that such a description of this kind of case is right, therapeutic trust offers a counterexample to the doxastic account, as it involves trust in the absence of belief.
A third kind of account—the performance-theoretic account of trust (Carter 2020a, 2020c)—makes no essential commitment as to whether trusting involves belief. On the performance-theoretic account, what is essential to the attitude of trusting is how it is normatively constrained. An attitude is a trust attitude (toward a trustee, T, and a task, X) just in case the attitude is successful if and only if T takes care of X as entrusted. Just as there is a sense in which, for example, your archery shot is not successful if it misses the target (see, for example, Sosa 2010a, 2015; Carter 2020b), your trusting someone to keep a secret misses its mark, and so fails to be successful trust, if the trustee spills the beans. With reference to this criterion of successful (and unsuccessful) trust, the performance-theoretic account aims to explain what good and bad trusting involves (see §2.a), and also why some variety of trust is more valuable than others (see §3).
d. Deception Detection and Monitoring
Given that trusting inherently involves the incurring of some level of risk to the truster, it is natural to think that trust would in some way be improved by the truster doing what she can to minimize such risk, for example, by monitoring the trustee with an eye to pre-empting any potential betrayal or at least mitigating the expected disvalue of potential betrayal.
This prima facie plausible suggestion, however, raises some perplexities. As Annette Baier (1986) puts it: “Trust is a fragile plant […] which may not endure inspection of its roots, even when they were, before inspection, quite healthy” (1986: 260). There is something intuitive about this point. If, for instance, A trusts B to drive the car straight home after work but then proceeds to surreptitiously drive behind B the entire way in order to make sure that B really does drive straight home, it seems that A, in so doing, is no longer trusting B. The trust, it seems, dissolves through the process of such monitoring.
Extrapolating from such cases, it seems that trust inherently involves not only subjecting oneself to some risk, but also remaining subject to such risk, or, at least, behaving in ways that are compatible with viewing oneself as remaining subject to such risk.
The above idea, of course, needs sharpening. For example, trusting is plausibly not destroyed by negligible monitoring. The crux of the idea seems to be, as Faulkner (2011, §5) puts it, that “too much reflection” on the trust relation, perhaps in conjunction with making attempts to minimize the risk that trust will be betrayed, can undermine trust. Specifying what “too much reflection” or monitoring involves, however, and how reflecting relates to monitoring to begin with, remains a difficult task.
One form of monitoring—construed loosely—that is plausibly compatible with trusting is contingency planning (Carter 2020c). For example, suppose you trust your teenager to drive your car to work and back in order that they may undertake a summer job. A prudent mitigation against the additional risk incurred (for instance, that the car will be wrecked in the process) will be to buy some additional insurance upon entrusting the teenager with the car. The purchasing of this insurance, however, does not itself undermine the trusting relationship, even though it involves a kind of risk mitigating behaviour.
One explanation here turns on the distinction between (i) mitigating the risk that trust will be betrayed; and (ii) mitigating the extent or severity of the harm or damage incurred if trust is betrayed. Contingency planning involves type-(ii) mitigation, whereas, for example, trailing behind the teenager in your own car, which is plausibly incompatible with trusting, is of type-(i).
2. The Normativity of Trust
Norms of trust arise between the two parties of reciprocal trust: a norm to be trusting in response to the invitation to trust, and to be trustworthy in response to the other’s trusting reliance (Fricker 2018). The former normativity lies “on the truster’s side”, and the latter on the trustee’s side. In this section, we discuss norms on trusting by looking at these two kinds of norms—that govern the truster and the trustee, respectively—in turn.
This section first discusses general norms on trusting on the truster’s side, and then engages—in some detail—with the complex issue of the norms governing trust in another’s words specifically. Second, it discusses the normativity of trust on the trustee’s side and the nature of trustworthiness.
a. Entitlement to Trust
If, as doxastic accounts maintain, trust is a species of belief (Hieronymi 2008), then the rational norms governing trust are those governing belief, such that (for example) it will be irrational to trust someone whom you have strong evidence to be unreliable, and the norm violation here is the same kind of norm violation in play in a case where one simply believes, against the evidence, that an individual is trustworthy. Thus: to the extent that one is rationally entitled to believe the trustee is trustworthy with respect to F, one thereby has an entitlement (on these kinds of views) to trust the trustee to F.
The norms that govern trust on the truster’s side will look different on non-doxastic accounts. For example, on a proposal like Frost-Arnold’s (2014), according to which trust is characterized as a kind of non-doxastic acceptance rather than as belief, the rationality governing trusting will be the rationality of acceptance, where rational acceptance can in principle come apart from rational belief. For one thing, whereas the rationality of belief is exclusively epistemically constrained, the rationality of acceptance need not be. In cases of therapeutic trust, for example, it might be practically rational (namely, rational with reference to the adopted end of building a trusting relationship) to accept that the trustee will F, and thus, to use the proposition that they will F as a premise in practical deliberation (see Bratman 1992; Cohen 1989)—that is, to act as if it is true that they will F. Of course, acting as if a proposition is true neither implies nor is implied by believing that it is true.
On performance-theoretic accounts, trusting is subject, on the truster’s side, to three kinds of evaluative norms, which correspond with three kinds of positive evaluative assessments: success, competence, and aptness. Whereas trusting is successful if and only if the trustee takes care of things as entrusted, trusting is competent if and only if one’s trusting issues from a reliable disposition—namely, a competence—to trust successfully when appropriately situated (for discussion, see Carter 2020a).
Just as successful trust might be incompetent, as when one trusts someone with a well-known track record of unreliability who happens to prove trustworthy on this particular occasion, so trust might fail to be successful despite being competent, as when one trusts an ordinarily reliable individual who, due to fluke luck, fails to take care of things as entrusted on this particular occasion. Even if trust is both successful and competent, however, there remains a sense in which it could fall short of the third kind of evaluative standard, namely aptness. Aptness demands success because of competence, and not merely success and competence (see Sosa 2010a, 2015; Carter 2020a, 2020b). Trust is apt, accordingly, if and only if one trusts successfully in such a way that the successful trust manifests one’s trusting competence.
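The trichotomy admits of a small worked illustration. The sketch below is purely illustrative (the encoding and its field names are our own, not drawn from the performance-theoretic literature); its one substantive point is that aptness requires more than the co-presence of success and competence, namely that the success manifest the competence.

```python
# A toy encoding of the three performance-theoretic assessments of trusting.
# Illustrative only: the representation is ours, not the literature's.
from dataclasses import dataclass

@dataclass
class TrustEpisode:
    successful: bool                     # the trustee took care of things as entrusted
    competent: bool                      # the trusting issued from a reliable disposition
    success_manifests_competence: bool   # the success came about because of that competence

def assess(t: TrustEpisode) -> list[str]:
    """Return the positive evaluative assessments the episode earns."""
    grades = []
    if t.successful:
        grades.append("successful")
    if t.competent:
        grades.append("competent")
    # Aptness demands success *because of* competence, not mere co-presence.
    if t.successful and t.competent and t.success_manifests_competence:
        grades.append("apt")
    return grades

# Lucky trust in an unreliable trustee: successful, but neither competent nor apt.
print(assess(TrustEpisode(True, False, False)))   # ['successful']
# Competent trust defeated by fluke bad luck: competent, but not successful.
print(assess(TrustEpisode(False, True, False)))   # ['competent']
# Success that manifests competence: apt.
print(assess(TrustEpisode(True, True, True)))     # ['successful', 'competent', 'apt']
```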
b. Trust in Words
Why not lie? (Or, more generally, why not promise to take care of things, and then renege on that promise whenever it is convenient to do so?) According to a fairly popular answer (Faulkner 2011; Simion 2020b), deception is bad not only for the deceived but also for the deceiver (see also Kant). If one cultivates a reputation for untrustworthiness, this comes with practical costs in one’s community; the untrustworthy person, recognized as such, is ostracized, and de facto forgoes the (otherwise possible) social benefits of trusting.
Things are more complicated, however, in one-off trust exchanges, where the reputational cost of untrustworthiness is minimal. The question can be reposed within the one-off context: why not lie and deceive, when it is convenient to do so, in one-off exchanges? In one-off interactions where we (i) do not know others’ motivations but (ii) do appreciate that there is a general motivation to be unreliable (for example, to reap the gains of betrayal), it is surprising that we find as much trustworthy behaviour as we do. Why do people not betray one another more often in such circumstances, given that betrayal seems, prima facie, the most rational decision-theoretic move?
According to Faulkner, when we communicate with another as to the facts, we face a situation akin to a prisoner’s dilemma (2011: 6). In a prisoner’s dilemma, our aggregate well-being will be maximized if we both cooperate. However, given the logic of the situation, the rational thing for each of us to do is to defect. We are then faced with a problem: how do we ensure the cooperative outcome?
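The structure Faulkner appeals to can be displayed with a standard payoff matrix. The particular numbers below are illustrative assumptions of ours, not Faulkner’s; all that matters is their ordering, on which defection strictly dominates for each party even though mutual cooperation maximizes aggregate welfare.

```python
# A minimal payoff-matrix sketch of a prisoner's dilemma (numbers are
# illustrative assumptions, chosen only to exhibit the standard ordering).
COOPERATE, DEFECT = "cooperate", "defect"

# payoffs[(my_move, your_move)] = (my_payoff, your_payoff)
payoffs = {
    (COOPERATE, COOPERATE): (2, 2),   # mutual cooperation: best aggregate outcome
    (COOPERATE, DEFECT):    (0, 3),   # the "sucker" outcome for the cooperator
    (DEFECT,    COOPERATE): (3, 0),   # the temptation payoff for the defector
    (DEFECT,    DEFECT):    (1, 1),   # mutual defection
}

def best_response(their_move: str) -> str:
    """Whatever the other party does, defection pays strictly more."""
    return max((COOPERATE, DEFECT), key=lambda m: payoffs[(m, their_move)][0])

assert best_response(COOPERATE) == DEFECT
assert best_response(DEFECT) == DEFECT
# Yet aggregate well-being is highest under mutual cooperation:
assert sum(payoffs[(COOPERATE, COOPERATE)]) > sum(payoffs[(DEFECT, DEFECT)])
```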
Similarly, Faulkner argues, speakers and audiences have different interests in communication. The audience is interested in learning the truth. In contrast, engaging in conversation is to the advantage of speakers because it is a means of influencing others: through an audience’s acceptance of what we say, we can get an audience to think, feel, and act in specific ways. So, according to Faulkner, our interest, qua speakers, is in being believed, because we have a more basic interest in influencing others. A commitment to telling the truth would not be best for the speaker. The best outcome for a speaker would be to receive an audience’s trust and yet retain the liberty to tell the truth or not (2011: 5-6).
There are four main reactions to this problem in the literature on the epistemology of testimony. According to Reductionism (Adler 1994; Audi 1997, 2004, 2006; Faulkner 2011; Fricker 1994, 1995, 2017, 2018; Hume 1739; Lipton 1998; Lyons 1997), in virtue of this lack of alignment between hearer and speaker interests, one needs positive, independent reasons to trust the speaker: since communication is like a prisoner’s dilemma, the hearer needs a reason for thinking or presuming that the speaker has chosen the cooperative, helpful outcome. Anti-Reductionism (Burge 1993, 1997; Coady 1973, 1992; Goldberg 2006, 2010; Goldman 1999; Graham 2010, 2012a, 2015; Greco 2015, 2019; Green 2008; Reid 1764; Simion 2020b; Simion and Kelp 2018) rejects this claim. According to these philosophers, we have a default (absent defeaters) entitlement to believe what we are told. This default entitlement has been variously taken to be derivable on a priori grounds from the nature of reason (Burge 1993, 1997), sourced in social norms of truth-telling (Graham 2012b), in social roles (Greco 2015), in reliance on other people’s justification-conferring processes (Goldberg 2010), or in the knowledge norm of assertion (Simion 2020b). Besides these two main views, there are also hybrid views (Lackey 2003, 2008; Pritchard 2004) that impose weaker conditions on testimonial justification than Reductionism while not being as liberal as Anti-Reductionism. Last but not least, a fourth reaction to Faulkner’s problem of cooperation for testimonial exchanges is scepticism (Graham 2012a; Simion 2020b); on this view, the problem does not get off the ground to begin with.
According to Faulkner himself, trust lies at the heart of the solution to his problem of cooperation; that is, it gives speakers reasons to tell the truth (2011, Ch. 1; 2017). Faulkner thinks that the problem is resolved “once one recognizes how trust itself can give reasons for cooperating” (2017: 9). When the hearer H believes that the speaker S can see that H is relying on S for information about whether p, and in addition H trusts S for that information, then H will make a number of presumptions:

1. H believes that S recognizes H’s trusting dependence on S proving informative.
2. H presumes that if S recognizes H’s trusting dependence, then S will recognize that H normatively expects S to prove informative.
3. H presumes that if S recognizes H’s expectation that S should prove informative, then, other things being equal, S will prove informative for this reason.
4. Hence, taking the attitude of trust involves presuming that the trusted will prove trustworthy (2011: 130).

The hearer’s presumption that the speaker will prove informative rationalizes the hearer’s uptake of the speaker’s testimony.
Furthermore, Faulkner claims, H’s trust gives S “a reason to be trustworthy”, such that S is, as a result, more likely to be trustworthy: it raises the objective probability that S will prove informative in utterance. In this fashion, “acts of trust can create as well as sustain trusting relations” (2011: 156-7). As Graham (2012a) puts it, “the hearer’s trust—the hearer’s normative expectation, which rationalizes uptake—then ‘engages,’ so to speak, the speaker’s internalization of the norm, which thereby motivates the speaker to choose the informative outcome.” Speakers who have internalized these norms will then often enough choose the informative outcome when they see that audiences need information; they will be “motivated to conform” because they have “internalized the norm” and so “intrinsically value” compliance (2011: 186). As such, the de facto reliability of testimony is explained by the fact that the trust placed in speakers by hearers triggers, on the speakers’ side, the internalization of social norms of trust, which, in turn, makes speakers objectively likely to put hearers’ informational interests before their own.
According to Peter Graham (2012a), however, Faulkner’s own solution threatens to dissolve the problem of cooperation rather than solve it. Recall how the problem was set up: the thought was that speakers only care about being believed, whether they are speaking the truth or not, which is why the hearer needs some reason for thinking the speaker is telling them the truth. But if speakers have internalized social norms of trustworthiness, it is not true that speakers are just as apt to prove uninformative as informative. It is not true that they are only interested in being believed. Rather, they are out to inform, to prove helpful; having internalized the relevant trustworthiness norms, speakers are committed to informative outcomes (Graham 2012a).
Another version of scepticism about the problem of cooperation is voiced in Simion’s “Testimonial Contractarianism” (2020b). Recall that, according to Faulkner, in testimonial exchanges, the default position for speakers involves no commitment to telling the truth. If that is the case, he argues, the default position for hearers involves no entitlement to believe. Here is the argument unpacked:
(P1) Hearers are interested in truth; speakers are interested in being believed.
(P2) The default position for speakers is seeing to their own interests rather than to the interests of the hearers.
(P3) Therefore, it is not the case that the default position for speakers is telling the truth (from P1 and P2).
(P4) The default position for hearers is trust only if the default position for speakers is telling the truth.
(C) Therefore, it is not the case that the default position for hearers is trust (from P3 and P4).
There is one important worry about this argument: on the reconstruction above, the conclusion does not follow. In particular, the problem lies with premise (P3), which is not supported by (P1) and (P2) (Simion 2020b). That is because being interested in being believed does not exclude also being interested in telling the truth. Speakers might still, by default, also be interested in telling the truth on independent grounds, that is, independently of their concern (or, rather, lack thereof) with hearers’ interests; indeed, the sources of entitlement proposed by the Anti-Reductionist (for instance, the existence of social norms of truth-telling, the knowledge norm of assertion, and so forth) may themselves constitute reasons for the speaker to tell the truth, absent overriding incentive to do otherwise. If that is the case, telling the truth will be the default for speakers, and therefore trust will be the default for hearers. What the defender of the Problem of Cooperation needs for validity, then, is to replace (P1) with the stronger (P1*): Hearers are interested in truth; speakers are only interested in being believed. However, it is not clear that (P1*) spells out the correct utility profile of the case: are all speakers really such that they only care about being believed? This seems like a fairly heavy empirical assumption that is in need of further defence.
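The validity point can be put schematically. The notation is ours, not Simion’s: write B(s) for “s is interested in being believed” and T(s) for “s is interested in telling the truth”. The negative conclusion follows only once one adds a bridge premise to the effect that the one interest excludes the other, and that bridge premise is just (P1*) in conditional form.

```latex
% Notation ours: B(s) = "s is interested in being believed",
%                T(s) = "s is interested in telling the truth".
% From B(s) alone, the negative conclusion does not follow:
\[ B(s) \;\not\vdash\; \lnot T(s) \]
% It follows only with the bridge premise that the one interest excludes
% the other, which is (P1*) in conditional form:
\[ \frac{B(s) \qquad B(s) \rightarrow \lnot T(s)}{\lnot T(s)} \]
```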
c. Obligation to Trust
A final normative species that merits discussion on the truster’s side is the obligation to trust. Obligations to trust can be generated, trivially, by promise-making (compare Owens 2017) or by other kinds of cooperative agreements (Faulkner 2011, Ch. 1). Of more philosophical interest are cases where obligations to trust are generated without explicit agreements.
One case of particular interest here arises in the literature on testimonial injustice, pioneered by Miranda Fricker (2007). Put roughly, testimonial injustice occurs when a speaker receives an unfair deficit of credibility from a hearer due to prejudice on the hearer’s part, resulting in the speaker’s being prevented from sharing what she knows.
An example of testimonial injustice that Fricker uses as a reference point comes from Harper Lee’s To Kill a Mockingbird, where Tom Robinson, a black man on trial after being falsely accused of raping a white woman, has his testimony dismissed due to prejudiced preconceptions on the part of the jury which owe to deep-seated racial stereotypes. In this case, the jury makes a deflated credibility judgement of Robinson, and as a result, he is unable to convey to them the knowledge that he has of the true events which occurred.
On one way of thinking about norms of trusting on the truster’s side, the members of the jury have mere entitlements to trust Robinson’s testimony though no obligation to do so; thus, their distrust of Robinson is not norm-violating. This gloss of the situation, on Fricker’s view, is incomplete; it fails to take into account the sense in which Robinson is wronged in his capacity as a knower as a result of this distrust. An appreciation of this wrong, according to Fricker, should lead us to think of the relevant norm on the hearer’s side as generating an obligation rather than a mere permission to believe; as such, on this view, distrust that arises from affording a speaker a prejudiced credibility deficit is not merely an instance of forgoing trust when one is entitled to trust, but of failing to trust when one should. For additional work discussing the relationship between trust and testimonial injustice, see, for example, Origgi (2012); Medina (2011); Wanderer (2017); Carter and Meehan (2020).
Fricker’s ground-breaking proposal concerns cases in which one is harmed in one’s capacity as a knower via distrust sourced in prejudice. That being said, several philosophers believe that the phenomenon generalizes beyond cases of distinctively prejudicial distrust; that is, that it lies in the nature and normativity of telling that we have a defeasible obligation to trust testifiers, and that failure to do so is impermissible whether or not it is sourced in prejudice. Indeed, G. E. M. Anscombe (1979) and J. L. Austin (1946) famously believed that you can insult someone by refusing their testimony.
We can distinguish between three general accounts of what it is that hearers owe to speakers and why: presumption-based accounts, purport-based accounts, and function-based accounts. The key questions for all accounts are whether they successfully deliver an obligation to trust, what rationale they provide, and whether their rationale is ultimately satisfactory.
While there are differences in the details, the core idea behind presumption-based views (Gibbard 1990; Hinchman 2005; Moran 2006; Ridge 2014) is that when a speaker S tells a hearer H that p, say, S incurs certain responsibilities for the truth of p. Crucially, H, in virtue of recognising what S is doing, thereby acquires a reason for presuming S to be trustworthy in their assertion that p. But since anyone who is to be presumed trustworthy in asserting that p ought to be believed, we get exactly what we were looking for: an obligation to trust speakers, alongside an answer to the rationale question.
Of course, the question remains whether the rationale provided is ultimately convincing. Sandy Goldberg (2020) argues that the answer is no. To see what he takes to be the most important reason for this, one should first look at a distinction Goldberg introduces between a practical entitlement to hold someone responsible and an epistemic entitlement to believe that they are responsible. Crucially, one can have the former without the latter. For instance, if your teenager tells you that they will be home by midnight and they are not, you will have a practical entitlement to hold them responsible even if you do not have an epistemic entitlement to believe that they are responsible. Importantly, to establish a presumption of trustworthiness, you need to make a case for an epistemic entitlement to believe. According to Goldberg, however, presumption-based accounts only deliver an entitlement to hold speakers responsible for their assertions, not an entitlement to believe that they are responsible. That is to say, when S tells H that p and thereby incurs certain responsibilities for the truth of p and when H recognises that this is what S is doing, H comes by an entitlement to hold S responsible for the truth of p. Crucially, to get to the presumption of trustworthiness we need more than this, as the case of the teenager clearly indicates. But presumption-based accounts do not offer more (Goldberg 2020, Ch. 4).
Another problem for these views stems from the fact that extant presumption-based accounts are distinctively personal: all share the idea that in telling an addressee that p, speakers perform a further operation on the addressee, and that it is this further operation that generates the obligation on the addressee’s side. In virtue of this, presumption-based accounts deliver too limited a presumption of trustworthiness. To see this, we should go back to Fricker’s cases of epistemic injustice: it looks as though not believing what a testifier says in virtue of prejudice is equally bad whether one is the addressee of the instance of telling in question or merely overhears it (Goldberg 2020).
Goldberg’s own alternative proposal is purport-based: according to him, assertion has a job description, which is to present a content as true in such a way that, were the audience to accept it on the basis of accepting the speaker’s speech contribution, the resulting belief would be a candidate for knowledge (Goldberg 2020, Ch. 5). Since assertion has this job description, when speakers make assertions, they purport to achieve exactly what the job description says. Moreover, it is common knowledge that this is what speakers purport to do. But since assertion will achieve its job description only if the speaker meets certain epistemic standards and since this is also common knowledge, the audience will recognise that the performed speech act achieves its aim only if the relevant epistemic standards are met. Finally, this exerts normative pressure on hearers. To be more precise, hearers owe it to speakers to recognize them as agents who purport to be in compliance with the epistemic standards at work and to treat them accordingly.
According to Goldberg, our obligation toward speakers is weaker than presumption-based accounts would have it: in the typical case of testimony, what we owe to the speakers is not to outright believe them, but rather to properly assess their speech act epistemically. The reason for this, Goldberg argues, is that we do not have access to their evidence, or their deliberations; given that this is so, the best we can do is to adjust our doxastic reaction to “a proper (epistemic) assessment of the speaker’s epistemic authority, since in doing so they are adjusting their doxastic reaction to a proper (epistemic) assessment of the act in which she conveyed having such authority” (Goldberg 2020, Ch. 5).
As a first observation, note that Goldberg’s purport-based account deals better with cases of testimonial injustice than presumption-based accounts do. After all, since the normative pressure is generated by the fact that it is common knowledge that in asserting, speakers represent themselves as meeting the relevant epistemic standards, the normative pressure is on anyone who happens to listen in on the conversation, not just on the direct addressees of the speech act.
With this point in play, let us return to Goldberg’s argument that there is no obligation to believe. According to Goldberg, this is because hearers do not have access to speakers’ reasons and deliberations. One question is why exactly this should matter. After all, one might argue, the fact that the speaker asserted that p provides hearers with sufficient reason to believe that p (absent defeat, of course). That the assertion does not also give hearers access to the speakers’ own reasons and deliberations does nothing to detract from this, unless one endorses dramatically strong versions of reductionism about testimony (which Goldberg himself would not want to endorse). If so, the fact that assertions do not afford hearers access to speakers’ reasons and deliberations provides little reason to believe that there is no obligation to believe on the part of the hearer (Kelp & Simion 2020a).
An alternative way to ground an obligation to trust testimony (Kelp & Simion 2020a) relies on the plausible idea that the speech act of assertion has the epistemic function of generating true belief (Graham 2010) or knowledge (Kelp 2018; Kelp & Simion 2020a; Simion 2020a). According to this view, belief-responses on behalf of hearers contribute to the explanation of the continued existence of the practice of asserting: were hearers to stop believing what they are told, speakers would lose the incentive to assert, and the practice would soon disappear. Since this is so, and since hearers are plausibly criticism-averse, it makes sense to have a norm that imposes an obligation on hearers to believe what they are told (absent defeat). In this way, in virtue of their criticism-aversion, hearers will reliably obey the norm—that is, will reliably form the corresponding beliefs—which, in turn, will keep the practice of assertion going (Kelp & Simion 2020a, Ch. 6).
One potential worry for this view is that it does not deliver the “normative oomph” that we want from a satisfactory account of the hearer’s obligation to trust: think of paradigm cases of epistemic injustice again. The hearers in these cases seem to fail in substantive moral and epistemic ways. However, on the function-based view, their failure is restricted to breaking a norm internal to the practice of assertion. Since norms internal to practices need not deliver substantive oughts outside of the practice itself—think, for instance, of rules of games—the function-based view still owes us an account of the normative strength of the “ought to believe” that drops out of their picture.
d. Trustworthiness
As the previous sections of this article show, trust can be a two-place or a three-place relation. In the former case, it is a relation between a trustor and a trustee, as in “Ann trusts George”. Two-place trust seems to be a fairly demanding affair: when we say that Ann trusts George simpliciter, we seem to attribute a fairly robust attitude to Ann, one whereby she trusts him in (at least) several respects. In contrast, three-place trust is a less involved affair: when we say that Ann trusts George to do the dishes, we need not say much about their relationship otherwise.
This contrast is preserved when we switch from focusing on the trustor’s trust to the trustee’s trustworthiness. That is, one can be trustworthy simpliciter (corresponding to a two-place trust relation), but one can also be trustworthy with regard to a particular matter—that is, two-place trustworthiness (Jones 1996)—corresponding to three-place trust. For instance, a surgeon might well be extremely trustworthy when it comes to performing surgery well, but not in any other respect.
Some philosophers working on trustworthiness focus more on two-place trust. Since the two-place trust relation is intuitively the more robust one, they put forward accounts of trustworthiness that are generally quite demanding, in that they require the trustee not only to make good on their commitments reliably, but also to do so out of the right motive.
The classic account of this kind is Annette Baier’s goodwill-based account; in a similar vein, others combine reliance on goodwill with certain expectations (Jones 1996), including, in one case, a normative expectation of goodwill (Cogley 2012). According to this kind of view, the trustworthy person fulfils their commitments in virtue of their goodwill toward the trustor. This view, according to Baier, makes sense of the intuition that there is a difference between trustworthiness and mere reliability, one that corresponds to the difference between trust and mere reliance.
The most widespread worry about these accounts of trustworthiness is that they are too strong: we can trust other people without presuming that they have goodwill. Indeed, our everyday trust in strangers falls into this category. If so, the argument goes, whether or not people are making good on their commitments out of goodwill is largely inconsequential: “[w]e are often content to trust without knowing much about the psychology of the one-trusted, supposing merely that they have psychological traits sufficient to get the job done” (Blackburn 1998).
Another worry for these accounts is that, while plausible as accounts of trustworthiness simpliciter, they give counterintuitive results in cases of two-place trustworthiness: whether or not George is trustworthy when it comes to washing the dishes seems not to depend on his goodwill, nor on other such noble motives. The goodwill view is too strong.
Furthermore, there is reason to believe that the goodwill view is, at the same time, too weak. To see this, consider the case of a convicted felon and his mother: it looks as though the two can have a goodwill-based relationship, such that the felon is trustworthy within its scope, while nevertheless not being someone whom we would describe as trustworthy (Potter 2002: 8).
If all of this is true, it begins to look as though the presence of goodwill is independent of the presence of trustworthiness. This observation motivates accounts of trustworthiness that rely on less highbrow motives underlying the trustee’s reliability. One such account is the social contract view of trustworthiness. According to this view, the motives underlying people’s making good on their commitments are sourced in social norms and the unfortunate consequences to one’s reputation and general wellbeing of breaking them (Hardin 2002: 53; see also O’Neill 2002; Dasgupta 2000). Self-interest determines trustworthiness on these accounts.
It is easy to see that social contract views do well in accounting for trustworthiness in three-place trust relations: on this view, George is trustworthy when it comes to washing the dishes because social norms make it in his best interest to make good on his commitments. The main worry for these views, however, is that they are too permissive, and thus have difficulty distinguishing between trustworthiness proper and mere reliability. Relatedly, the worry goes, these views seem less well equipped to deal with trustworthiness simpliciter, that is, the kind of trustworthiness that corresponds to a two-place trust relation. For instance, on a social contract view, it would seem that a sexist employer who treats female employees well only because he believes that he would face legal sanctions if he did not will come out as trustworthy (Potter 2002: 5). This is intuitively an unfortunate result.
One thought that gets prompted by the case of the sexist employer is that trustworthiness is a character trait that virtuous people possess; after all, this seems to be something that the sexist employer is missing. On Nancy Potter’s view, trustworthiness is a disposition to respond to trust in appropriate ways, given “who one is in relation [to]” and given other virtues that one possesses or ought to possess (for example, justice, compassion) (2002: 25). According to Potter, a trustworthy person is “one who can be counted on, as a matter of the sort of person he or she is, to take care of those things that others entrust to one.”
When it comes to demandingness, the virtue-based view seems to lie somewhere in-between the goodwill view, on one hand, and the social contract view, on the other. It seems more permissive than the former in that it can account for the trustworthiness of strangers insofar as they display the virtue at stake. It seems more demanding than the latter in that it purports to account for the intuition that mere reliability is not enough for trustworthiness: rather, what is required is reliability sourced in good character.
An important criticism of virtue-based views comes from Jones (2012). According to her, trustworthiness does not fit the normative profile of virtue, in the following way: if trustworthiness were a virtue, then being untrustworthy would be a vice. However, according to Jones, that cannot be right: after all, we are often required to be untrustworthy in one respect or another—for instance, because of conflicting normative constraints—but it cannot be that being vicious is ever required.
Another problem for Potter’s specific view is its apparent uninformativeness. First, defining the trustworthy person as “a person who can be counted on as a matter of the sort of person he or she is” threatens vicious circularity: after all, it defines the trustworthy as those who can be trusted. Relatedly, the account turns out to be too vague to give definite predictions in a series of cases. Take again the case of the sexist employer: why is it that he cannot be “counted on, as a matter of the sort of person he is, to take care of those things that others entrust to one” in his relationship with his female employees? After all, in virtue of the sort of person he is—that is, the sort of person who cares about not suffering the social consequences of mistreating them—he can be counted on to treat his employees well. If that is so, Potter’s view does not do much better than social contract views when it comes to distinguishing trustworthiness from mere reliability.
Several philosophers propose less demanding accounts of trustworthiness. Katherine Hawley’s (2019) view falls squarely within this camp. According to her, trustworthiness is a matter of avoiding unfulfilled commitments, which requires both caution in incurring new commitments and diligence in fulfilling existing commitments. Crucially, on this view, one can be trustworthy regardless of one’s motives for fulfilling one’s commitments. Hawley’s is a negative account of trustworthiness, which means that one can be trustworthy while avoiding commitments as far as possible. Untrustworthiness can arise from insincerity or bad intentions, but it can also arise from enthusiasm and becoming over-committed. A trustworthy person must not allow her commitments to outstrip her competence.
One natural question that arises for this view is: what about commitments that we do not take on, but should? Am I a trustworthy friend if I never take on any commitments toward my friends? According to Hawley, in practice, through friendship, work, and other social engagements, we take on meta-commitments—commitments to incur future commitments. These can make it a matter of trustworthiness to take on certain new commitments.
Another view in a similar, externalist vein is developed by Kelp and Simion (2020b). According to them, trustworthiness is a disposition to fulfil one’s obligations. What drives the view is the thought that one can fail to fulfil one’s commitments in virtue of being in a bad environment—an environment that “masks” the normative disposition in question—while, at the same time, remaining a trustworthy person. On this view too, whether the disposition in question is present in virtue of goodwill is inconsequential. That being said, the view can accommodate the thought that people who comply with a particular norm for the wrong reason are less trustworthy than counterparts who act from goodwill. To see how, take the sexist employer again: insofar as it is plausible that there are norms against sexism, as well as norms against mistreating one’s female employees, the sexist employer fulfils the obligations generated by the latter but not by the former. In this respect, he is trustworthy when it comes to treating his employees well, but not trustworthy when it comes to treating them well for the right reason.
Another advantage of this view is that it explains the intuitive difference in robustness between two-place trustworthiness and trustworthiness simpliciter. According to this account, one is trustworthy simpliciter when one meets a contextually-variant threshold of two-place trustworthiness for contextually-salient obligations. For instance, a philosophy professor is trustworthy simpliciter in the philosophy department just in case she has a disposition to meet enough of her contextually salient obligations: do her research and teaching, not be late for meetings, answer emails promptly, help students with their essays and so forth. Plausibly, some of these contextually salient obligations will include doing these things for the right reasons. If so, the view is able to account for the fact that trustworthiness simpliciter is more demanding than two-place trustworthiness.
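The threshold idea admits of a simple toy model. The sketch below is our own schematic rendering, not Kelp and Simion’s formalism; the obligation list and the threshold value are stipulated purely for illustration.

```python
# A toy rendering of trustworthiness simpliciter as a contextually fixed
# threshold over two-place trustworthiness (our sketch, not the authors').

def trustworthy_simpliciter(dispositions: dict[str, bool], threshold: float) -> bool:
    """dispositions maps each contextually salient obligation to whether the
    agent is disposed to fulfil it; threshold is fixed by the context."""
    fulfilled = sum(dispositions.values())
    return fulfilled / len(dispositions) >= threshold

professor = {
    "do her research": True,
    "teach": True,
    "not be late for meetings": False,
    "answer emails promptly": True,
    "help students with their essays": True,
}

# With a contextually fixed threshold of 0.8, she counts as trustworthy simpliciter.
print(trustworthy_simpliciter(professor, 0.8))  # True
```

On this rendering, raising the contextual threshold, or adding further salient obligations (such as fulfilling the others for the right reasons), makes trustworthiness simpliciter correspondingly harder to achieve.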
3. The Value of Trust
Trust is valuable. Without it, we face not only cooperation problems, but we also incur substantial risks to our well-being—namely, those ubiquitous risks to life that characterize—at the limit case—the Hobbesian (1651/1970) “state of nature”. Accordingly, one very general argument for the value of trust appeals to the disutility of its absence (see also Alfano 2020).
Moreover, apart from merely serving as an enabling condition for other valuable things (like the possibility of large-scale collective projects for societal benefit), trust is also instrumentally valuable for both the truster and the trustee as a way of resolving particular (including one-off) cooperation problems so as to facilitate mutual profit (see §2). Furthermore, trust is instrumentally valuable as a way of building trusting relationships (Solomon and Flores 2003). For example, trust can effectively be used—as when one trusts a teenager with a car to help cultivate a trust relationship—in order to make more likely the attainment of the benefits of trust (for both the truster and the trustee) further down the road (Horsburgh 1960; Jones 2004; Frost-Arnold 2014; see also the discussion of therapeutic trust above).
Apart from trust’s uncontroversial instrumental value (for helpful discussion, see O’Neill 2002), some philosophers believe that trust also has final value. Something, X, is instrumentally valuable with respect to an end, Y, insofar as it is valuable as a means to Y; instrumental value can be contrasted with final value, where something is finally valuable if and only if it is valuable for its own sake. A paradigmatic example of something instrumentally valuable is money, which we value because of its usefulness in getting other things; an example of something (arguably) finally valuable is happiness.
One way to defend the view that trust can be finally valuable, and not merely instrumentally valuable, is to supplement the performance-theoretic view of trust (see §1.c and §2.a) with some additional (albeit somewhat contentious) axiological premises as follows:
(P1) Apt trust is trust that is successful because of trust-relevant competence. (from the performance-theoretic view of trust)
(P2) Something is an achievement if and only if it is a success because of competence. (Premise)
(C1) So, apt trust is an achievement. (from P1 and P2)
(P3) Achievements are finally valuable. (Premise)
(C2) So, apt trust has final value. (from C1 and P3)
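As a piece of logic, the argument is valid. In the shorthand below, which is ours rather than the authors’, Apt(t), Ach(t), and FV(t) abbreviate “t is an instance of apt trust”, “t is an achievement”, and “t is finally valuable”:

```latex
% Shorthand ours: Apt(t), Ach(t), FV(t) as glossed above.
% (C1) packages (P1) and (P2); (C2) then follows with (P3):
\[
  \frac{\forall t\,(\mathrm{Apt}(t) \rightarrow \mathrm{Ach}(t))
        \qquad
        \forall t\,(\mathrm{Ach}(t) \rightarrow \mathrm{FV}(t))}
       {\forall t\,(\mathrm{Apt}(t) \rightarrow \mathrm{FV}(t))}
\]
% Weakening the second premise to the existential claim
% \exists t\,(\mathrm{Ach}(t) \wedge \mathrm{FV}(t)) blocks the derivation;
% this is the gap discussed below.
```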
Premise (P2) of the argument is mostly uncontentious and is widely taken for granted in contemporary virtue epistemology (for instance, Greco 2009, 2010; Haddock, Millar, and Pritchard 2009; Sosa 2010b) and elsewhere (Feinberg 1970; Bradford 2013, 2015).
Premise (P3), however, is where the action lies. Even if apt trust is an achievement, given that it involves a kind of success because of ability (that is, trust-relevant competences), we would need some positive reason to connect the “success because of ability” structure with final value if we are to accept (P3).
A strong line here defends (P3) by maintaining that all achievements (including evil achievements and “trivial” achievements) are finally valuable, on the grounds that successes because of ability (no matter what the success, no matter what the ability used) have a value that is not reducible to the value of the success alone.
This kind of argument faces some well-worn objections (for helpful discussion, see Kelp and Simion 2016; Dutant 2013; Goldman and Olsson 2009; Sylvan 2017). A more nuanced line of argument for (C2) weakens (P3) so that it says, instead, that (P3*) some achievements are finally valuable. But with this weaker premise in play, (P3*) and (C1) no longer entail (C2); what would be needed—and this remains an open problem for work on the axiology of trust—is a further premise to the effect that the kind of achievement that features in apt trust, specifically, is among the finally valuable rather than the non-finally valuable achievements. A defence of such a further premise, of course, will turn on further considerations about (among other things) the value of successful and competent trust, perhaps also in the context of wider communities of trust.
4. References and Further Reading
Adler, Jonathan E. 1994. ‘Testimony, Trust, Knowing’. The Journal of Philosophy 91 (5): 264–275.
Alfano, Mark. 2020. ‘The Topology of Communities of Trust’. Russian Sociological Review 15 (4): 30–56. https://doi.org/10.17323/1728-192X-2016-4-30-56.
Anscombe, G. E. M. 1979. ‘What Is It to Believe Someone?’ In Rationality and Religious Belief, edited by C. F. Delaney, 141–151. South Bend: University of Notre Dame Press.
Audi, Robert. 1997. ‘The Place of Testimony in the Fabric of Knowledge and Justification’. American Philosophical Quarterly 34 (4): 405–422.
Audi, Robert. 2004. ‘The A Priori Authority of Testimony’. Philosophical Issues 14: 18–34.
Audi, Robert. 2006. ‘Testimony, Credulity, and Veracity’. In The Epistemology of Testimony, edited by Jennifer Lackey and Ernest Sosa, 25–49. Oxford University Press.
Austin, J. L. 1946. ‘Other Minds.’ Proceedings of the Aristotelian Society Supplement 20: 148–187.
Burge, Tyler. 1997. ‘Interlocution, Perception, and Memory’. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 86 (1): 21–47.
Carter, J. Adam. 2020a. ‘On Behalf of a Bi-Level Account of Trust’. Philosophical Studies 177: 2299–2322.
Coady, C. A. J. 1973. ‘Testimony and Observation’. American Philosophical Quarterly 10 (2): 149–155.
Coady, C. A. J. 1992. Testimony: A Philosophical Study. Oxford University Press.
Cogley, Zac. 2012. ‘Trust and the Trickster Problem’. Analytic Philosophy 53 (1): 30–47. https://doi.org/10.1111/j.2153-960X.2012.00546.x.
Cohen, L. Jonathan. 1989. ‘Belief and Acceptance’. Mind 98 (391): 367–389.
Dasgupta, Partha. 2000. ‘Trust as a Commodity’. Trust: Making and Breaking Cooperative Relations 4: 49–72.
deTurck, Mark A., Janet J. Harszlak, Darlene J. Bodhorn, and Lynne A. Texter. 1990. ‘The Effects of Training Social Perceivers to Detect Deception from Behavioral Cues’. Communication Quarterly 38 (2): 189–199.
Domenicucci, Jacopo, and Richard Holton. 2017. ‘Trust as a Two-Place Relation’. The Philosophy of Trust, 149–160.
Dutant, Julien. 2013. ‘In Defence of Swamping’. Thought: A Journal of Philosophy 2 (4): 357–366.
Faulkner, Paul. 2007. ‘A Genealogy of Trust’. Episteme 4 (3): 305–321. https://doi.org/10.3366/E174236000700010X.
Faulkner, Paul. 2011. Knowledge on Trust. Oxford: Oxford University Press.
Faulkner, Paul. 2015. ‘The Attitude of Trust Is Basic’. Analysis 75 (3): 424–429.
Faulkner, Paul. 2017. ‘The Problem of Trust’. The Philosophy of Trust, 109–28.
Feinberg, Joel. 1970. Doing and Deserving; Essays in the Theory of Responsibility. Princeton: Princeton University Press.
Fricker, Elizabeth. 1994. ‘Against Gullibility’. In Knowing from Words, 125–161. Springer.
Fricker, Elizabeth. 1995. ‘Critical Notice’. Mind 104 (414): 393–411.
Fricker, Elizabeth. 2017. ‘Inference to the Best Explanation and the Receipt of Testimony: Testimonial Reductionism Vindicated’. Best Explanations: New Essays on Inference to the Best Explanation, 262–94.
Fricker, Elizabeth. 2018. Trust and Testimonial Justification.
Fricker, Miranda. 2007. Epistemic Injustice: Power and the Ethics of Knowing. Oxford University Press.
Gibbard, Allan. 1990. Wise Choices, Apt Feelings: A Theory of Normative Judgment. Cambridge, MA: Harvard University Press.
Goldberg, Sanford C. 2006. ‘Reductionism and the Distinctiveness of Testimonial Knowledge’. The Epistemology of Testimony, 127–44.
Goldberg, Sanford C. 2010. Relying on Others: An Essay in Epistemology. Oxford University Press.
Goldberg, Sanford C. 2020. Conversational Pressure. Oxford University Press.
Goldman, Alvin I. 1999. Knowledge in a Social World. Oxford University Press.
Goldman, Alvin, and Erik J. Olsson. 2009. ‘Reliabilism and the Value of Knowledge’. In Epistemic Value, edited by Adrian Haddock, Alan Millar, and Duncan Pritchard, 19–41. Oxford University Press.
Graham, Peter J. 2010. ‘Testimonial Entitlement and the Function of Comprehension’. In Social Epistemology, edited by Duncan Pritchard, Alan Millar, and Adrian Haddock, 148–74. Oxford University Press.
Graham, Peter J. 2012a. ‘Testimony, Trust, and Social Norms’. Abstracta 6 (3): 92–116.
Graham, Peter J. 2012b. ‘Epistemic Entitlement’. Noûs 46 (3): 449–82. https://doi.org/10.1111/j.1468-0068.2010.00815.x.
Graham, Peter J. 2015. ‘Epistemic Normativity and Social Norms’. In Epistemic Evaluation: Purposeful Epistemology, edited by David Henderson and John Greco, 247–273. Oxford University Press.
Greco, John. 2009. ‘The Value Problem’. In Epistemic Value, edited by Adrian Haddock, Alan Millar, and Duncan Pritchard, 313–22. Oxford: Oxford University Press.
Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity. Cambridge University Press.
Holton, Richard. 1994. ‘Deciding to Trust, Coming to Believe’. Australasian Journal of Philosophy 72 (1): 63–76. https://doi.org/10.1080/00048409412345881.
Horsburgh, H. J. N. 1960. ‘The Ethics of Trust’. The Philosophical Quarterly (1950-) 10 (41): 343–54. https://doi.org/10.2307/2216409.
Hume, David. 2000 (1739). A Treatise of Human Nature. Oxford University Press.
Jones, Karen. 1996. ‘Trust as an Affective Attitude’. Ethics 107 (1): 4–25.
Jones, Karen. 2004. ‘Trust and Terror’. In Moral Psychology: Feminist Ethics and Social Theory, edited by Peggy DesAutels and Margaret Urban Walker, 3–18. Rowman & Littlefield.
Kelp, Christoph, and Mona Simion. 2016. ‘The Tertiary Value Problem and the Superiority of Knowledge’. American Philosophical Quarterly 53 (4): 397–411.
Keren, Arnon. 2014. ‘Trust and Belief: A Preemptive Reasons Account’. Synthese 191 (12): 2593–2615.
Kraut, Robert. 1980. ‘Humans as Lie Detectors’. Journal of Communication 30 (4): 209–218.
Lackey, Jennifer. 2003. ‘A Minimal Expression of Non–Reductionism in the Epistemology of Testimony’. Noûs 37 (4): 706–723.
Lackey, Jennifer. 2008. Learning from Words: Testimony as a Source of Knowledge. Oxford University Press.
Lipton, Peter. 1998. ‘The Epistemology of Testimony’. Studies in History and Philosophy of Science Part A 29 (1): 1–31.
Lyons, Jack. 1997. ‘Testimony, Induction and Folk Psychology’. Australasian Journal of Philosophy 75 (2): 163–178.
McLeod, Carolyn. 2002. Self-Trust and Reproductive Autonomy. MIT Press.
McMyler, Benjamin. 2011. Testimony, Trust, and Authority. Oxford University Press USA.
Medina, José. 2011. ‘The Relevance of Credibility Excess in a Proportional View of Epistemic Injustice: Differential Epistemic Authority and the Social Imaginary’. Social Epistemology 25 (1): 15–35.
Medina, José. 2013. The Epistemology of Resistance: Gender and Racial Oppression, Epistemic Injustice, and the Social Imagination. Oxford University Press.
Moran, Richard. 2006. ‘Getting Told and Being Believed’. In Jennifer Lackey and Ernest Sosa (eds.), The Epistemology of Testimony. Oxford: Oxford University Press.
O’Neill, Onora. 2002. Autonomy and Trust in Bioethics. Cambridge University Press.
Origgi, Gloria. 2012. ‘Epistemic Injustice and Epistemic Trust’. Social Epistemology 26 (2): 221–235.
Owens, David. 2017. ‘Trusting a Promise and Other Things’. In Paul Faulkner and Thomas Simpson (eds.), New Philosophical Perspectives on Trust, 214–29. Oxford University Press.
Rabinowicz, Wlodek, and Toni Ronnow-Rasmussen. 2000. ‘II-A Distinction in Value: Intrinsic and For Its Own Sake’. Proceedings of the Aristotelian Society 100 (1): 33–51.
Reid, Thomas. 1764. ‘An Inquiry into the Human Mind on the Principles of Common Sense’. In The Works of Thomas Reid, edited by William Hamilton. Maclachlan & Stewart.
Ridge, Michael. 2014. Impassioned Belief. Oxford: Oxford University Press.
Simion, Mona. 2020a. Shifty Speech and Independent Thought: Epistemic Normativity in Context. Oxford: Oxford University Press.
Simion, Mona. 2020b. ‘Testimonial Contractarianism: A Knowledge-First Social Epistemology’. Noûs 1-26. https://doi.org/10.1111/nous.12337
Simion, Mona, and Christoph Kelp. 2018. ‘How to Be an Anti-Reductionist’. Synthese. https://doi.org/10.1007/s11229-018-1722-y.
Solomon, Robert C., and Fernando Flores. 2003. Building Trust: In Business, Politics, Relationships, and Life. Oxford University Press USA.
Sosa, Ernest. 2010b. ‘Value Matters in Epistemology’. The Journal of Philosophy 107 (4): 167–190.
Sosa, Ernest. 2015. Judgment and Agency. Oxford: Oxford University Press.
Sylvan, Kurt. 2017. ‘Veritism Unswamped’. Mind 127 (506): 381–435.
Wanderer, Jeremy. 2017. ‘Varieties of Testimonial Injustice’. In Ian James Kidd, José Medina, and Gaile Pohlhaus Jr. (eds.), The Routledge Handbook of Epistemic Injustice, 27–40. Routledge.
Williamson, Timothy. 2000. Knowledge and Its Limits. Oxford University Press.
Tyler Burge is an American philosopher who has done influential work in several areas of philosophy. These include philosophy of language, logic, philosophy of mind, epistemology, philosophy of science (primarily philosophy of psychology), and history of philosophy (focusing especially on Frege, but also on the classical rationalists—Descartes, Leibniz, and Kant). Burge has also done some work in psychology itself.
Burge is best known for his extended elaboration and defense of the thesis of anti-individualism. This is the thesis that most representational mental states depend for their natures upon phenomena that are not determined by the individual’s own body and other individualistic characteristics. In other words, what it means to represent a subject matter—whether in perception, language, or thought—is not fully determined by individualistic characteristics of the brain, body, or person involved. One of the most famous illustrations of this point is Burge’s argument that psychologically representing a kind such as water requires the fulfillment of certain non-individualistic conditions, such as having been in causal contact with instances of the kind, having acquired the representational content through communication with others, having theorized about it, and so forth. A consequence of Burge’s anti-individualism, in this case, is that two thinkers who are physically indiscernible (who are, for example, neurologically indistinguishable in a certain sense) can differ in that one of them, but not the other, has thoughts containing the concept “water”.
When Burge first proposed the thesis of anti-individualism, it was common for philosophers to reject it for one reason or another. It is a measure of Burge’s influence, and the power of his arguments, that the early 21st century saw few philosophers deny the truth of the view.
Nevertheless, there is much more to Burge’s philosophical contributions than anti-individualism. Most of Burge’s more influential theses and arguments are briefly described in this article. An attempt is made to convey how the seemingly disparate topics addressed in Burge’s corpus are unified by certain central commitments and interests. Foremost among these is Burge’s long-standing interest in understanding the differences between the minds of human beings, on one hand, and the minds of other animals, on the other. This interest colors and informs Burge’s work on language, mind, and epistemology in particular.
Charles Tyler Burge graduated from Wesleyan University in 1967. He obtained his Ph.D. at Princeton University in 1971, his dissertation being directed by Donald Davidson. He is married with two sons. Burge’s wife, Dorli Burge, was prior to her retirement a clinical psychologist. Burge’s eldest son, Johannes, is Assistant Professor in Vision Science at the University of Pennsylvania. His younger son, Daniel, completed a Ph.D. in 20th century American History at Boston University.
Burge is a fan of sport and enjoys traveling and hiking. He also reads widely outside of philosophy (particularly literature, history, history of science, history of mathematics, psychology, biology, music, and art history). Three of Burge’s interests are classical music, fine food, and fine wine.
A list of Burge’s main philosophical contributions would include the following seven areas. First, in his dissertation and the 1970s more generally, Burge focused attention upon the central significance of context-dependent referential and representational devices, including many uses of proper names, as well as what he came to call “applications” in language and thought. This was during a philosophical era in which it was widely believed that such devices were reducible to context-independent representational elements such as linguistic descriptions and concepts in thought. Burge also appealed to demonstrative- or indexical-like elements in perhaps unexpected areas, such as in his treatment of the semantical paradox. A concern with referential representation, which Burge does not believe to be confined solely to empirical cases, has been as close to his central philosophical interest as any topic. Much the same could be said about Burge’s long-standing interest in predication and attribution. (See sections 2 and 4.) Burge’s work on context-dependent aspects of representation is indebted to Keith Donnellan and Saul Kripke.
Second, while broadly understood anti-individualism has been a dominant view in the history of philosophy, Burge was the first philosopher to articulate the doctrine explicitly, to argue for it, and to mine it for many of its implications. Anti-individualism is the view that the natures of most representational states and events are partly dependent on relations to matters beyond the individuals with representational abilities. Anti-individualism went from being a minority view, for a decade or more after Burge discussed its several forms and aspects, to a view that is rarely even questioned in serious philosophical work today. Furthermore, the discussion of anti-individualism engendered by Burge’s work breathed new life into at least two somewhat languishing areas of philosophy: the problem of mental causation and the nature of authoritative self-knowledge, each of which has since become widely discussed and recognized as a central area of philosophy of mind and epistemology, respectively. (See sections 3, 5 and 8.)
Third, Burge’s work on interlocution (commonly called “testimony”) has been widely discussed. He was among the first to defend a non-reductionist view of interlocution, one which remains among the best-articulated and supported accounts of our basic epistemic warrant for relying upon the words of others (see section 7); and Burge later extended this work to provide new ways of thinking about the problem of other minds, on one hand, and the epistemology of computer-aided mathematical proofs, on the other.
Fourth, beginning with his work on self-knowledge and interlocution, Burge began a rationalist initiative in epistemology that has been influential, in addition to the areas already mentioned, in discussions of memory, the first-person concept, reflection and understanding, and other abilities, such as certain forms of reasoning, that seem to be peculiar to human beings. Central to Burge’s limited form of rationalism is his powerful case against the once-common view that analytic and a priori truths are in some way insubstantial or vacuous, as well as his rejection of the closely related view that apriority is to be reduced to analyticity. (See sections 6-10.)
Fifth, beginning in the mid- to late 1980s and continuing into the early 21st century, Burge developed a detailed understanding of the nature of perception. Integral to this understanding is the extent to which Burge has immersed himself in the psychology of perception, as well as developmental psychology and ethology. Some of Burge’s work on perception is as much a contribution to psychology as to philosophy; one of the articles he has published on the topic addresses a prominent and hotly contested question in psychology—the question of whether infants and non-human animals attribute psychological states to agents with whom they interact. Parallel with these developments has been Burge’s articulation of a novel account of perceptual epistemic warrant. (See sections 6, 11 and 13.)
Sixth, throughout his career Burge has resisted the tendency of philosophers of mind, especially in the United States, to accept some form of materialism. While it may not have been a central focus of his published work, Burge has over time formulated and defended a version of dualism about the relation between the mind and the body. Burge’s view holds that minds, mental states, and mental events are not identical with bodies, physical states, or physical events. It is important to note, however, that Burge’s dualism is not a substance dualism such as the view commonly attributed to Descartes. It is instead a “modest dualism”, motivated by the view that counting mental events as physical events does no scientific or other good conceptual work; similarly for mental properties and minds themselves. This is one example of Burge’s more general resistance to forms of reductionism in philosophy. (See section 5.)
The seventh respect in which Burge’s work has been influential is not confined to a certain body of work or a defended thesis. It lies in providing an antidote to the pervasive tendency, in several areas of philosophy, toward hyper-intellectualization. The earliest paper in which Burge discusses hyper-intellectualization is his short criticism of David Lewis’s account of convention (1975). The tendency toward hyper-intellectualization is exhibited in individualism about linguistic, mental, or perceptual representational content—the idea being that the individual herself must somehow be infallible concerning the proper application conditions of terms, concepts, and even perceptual attributives. It is at the center of the syndrome of views, called Compensatory Individual Representationalism, that Burge criticizes at some length. These views insist that objective empirical representation requires that the individual must in some way herself represent necessary conditions for objective representation. Hyper-intellectualization motivates various forms of epistemic internalism, according to which epistemic warrant requires that the individual be able in some way to prove that her beliefs are warranted, or at least to have good, articulable grounds for believing that they are. Finally, hyper-intellectualization permeates even action theory, which tends to model necessary conditions for animal action upon the intentional actions of mature human language-users. Burge has resisted all of these hyper-intellectualizing tendencies within philosophy, and to a lesser extent in psychology. (See sections 3, 6, 7 and 11.)
If there is a single, overriding objective running throughout Burge’s long and productive career, it is to understand wherein human beings are similar to, and different from, other animals in representational and cognitive respects. As he put the point early on, in the context of a discussion of anti-individualism:
I think that ultimately the greatest interest of the various arguments lies not in defeating individualism, but in opening routes for exploring the concepts of objectivity and the mental, and more especially those aspects of the mental that are distinctive of persons. (1986b, 194 fn. 1)
This large program has involved not only investigating the psychological powers that seem to be unique to human beings—such as a priori warranted cognition and reflection, and authoritative self-knowledge and self-understanding—but also competencies that we share with a wide variety of non-human animals, principally memory, action, and perception. (See sections 3, 4, 7, and 8-11.)
2. Language and Logic
Burge’s early work in philosophy of language and logic centered on semantics and logical form. The work on semantics constitutes the beginning of Burge’s lifelong goal of understanding reference and representation, beginning in language and proceeding to thought and perception. This work includes accounts of the logical form of de re thought (1977); the semantics of proper names (1973); demonstrative and indexical constructions (1974a); and mass and singular terms (1972; 1974b). While context-dependent reference was the dominant special case of Burge’s thought and writing on semantics and logical form, this body of work also includes Burge’s writings on the paradoxes, especially the strengthened liar (semantic) paradox and the epistemic paradox.
Significant later work on logic and language prominently includes articles on logic and analyticity, and on predication and truth (2003a; 2007b).
3. Anti-Individualism
Anti-individualism is the view that the natures of most thoughts, and perceptual states and events, are partly determined by matters beyond the natures of individual thinkers and perceivers. By the “nature” of these mental states we understand that without which they would not be the mental states they are. So the representational contents of thoughts and perceptual states, for example, are essential to their natures. If “they” had different contents, they would be different thoughts or states. As Burge emphasizes, anti-individualism has been the dominant view in the history of philosophy. It was present in Aristotle, arguably in Descartes, and in many other major philosophers in the Western canon. When Burge, partly building upon slightly earlier work by Hilary Putnam, came explicitly to formulate and defend the view, it became controversial. There are several reasons for this. One is that materialistic views in philosophy of mind seemed incompatible with the implications of anti-individualism. Another was a tendency, which began to be dislodged only after the mid-20th century, to place very high emphasis upon phenomenology and introspective “flashes” of insight when it came to discussions of the natures of representational mental states and events. There are rear-guard defenses of the cognitive relevance of phenomenology that still have currency today. But anti-individualism appears to have become widely, if sometimes reluctantly, accepted.
As noted, anti-individualism is the view that most psychological representational states and events depend for their natures upon relations to subject matters beyond the representing individuals or their psychological sub-systems. This view was first defended in Burge’s seminal article, “Individualism and the Mental” (1979a). Some of Burge’s arguments for anti-individualism employ the Twin-Earth thought-experiment methodology originally set out by Putnam. Burge went beyond Putnam, among other ways, by arguing that the intentional natures of many mental states themselves, rather than merely associated linguistic meanings, depend for their natures on relations to a subject matter. Burge has also argued at length against Putnam’s view (which Putnam has since given up) that meanings and thought contents involving natural kinds are indexical in character.
There are five distinct arguments for anti-individualism in Burge’s work. The order in which they were published is as follows. First, Burge argued that many representational mental states depend for their natures upon relations to a social environment (1979a). Second, he argued that psychologically representing natural kinds such as water and aluminum depends upon relations to entities in the environment (1982). Third, Burge argued that having thoughts containing concepts corresponding to artefactual kinds such as sofas is compatible with radical, non-standard theorizing about the kinds (1986a). Fourth, Burge constructed a thought experiment that appears to show that even the contents of perception may depend for their natures on relations to entities purportedly perceived (1986b; 1986c). Fifth, Burge has provided an argument for a version of empirical anti-individualism that he regards as both necessarily true and a priori: “empirical representational states as of the environment constitutively depend partly on entering into environment-individual causal relations” (2010, 69). This fifth argument has superseded the fourth as the main ground of perceptual anti-individualism. It could also be said that it provides the strongest ground for anti-individualism in general, at least for empirical cases, since propositional attitudes containing concepts such as “arthritis”, “water”, and “sofa” are all parasitic, albeit in complex and not fully understood ways, upon basic perceptual categories covered by the fifth argument. Finally, while it is a priori that perceptual systems and states are partly individuated by relations to an environment, it is an empirical fact that there are perceptual states and events.
Rather than discussing each of these arguments in detail, the remainder of the section focuses on one of Burge’s schematic representations of the common structure of several of the arguments. The thought experiments in question involve three steps. In the first, one judges that someone could have thoughts about “a given kind or property as such, even though that person is not omniscient about its nature” (2013a, 548). For example, one can think thoughts about electrons without being omniscient about the nature of electrons. This lack of omniscience can take the form of incomplete understanding, as in the case of the concept of arthritis. It can stem from an inability to distinguish the kind water from a look-alike liquid in a world that contains neither water nor any theorizing about water. Or it can issue from non-standard theorizing about entities, say sofas, despite fully grasping the concept of sofa.
In the second step, one imagines a situation just like the one just considered, but in which the person’s mistaken beliefs are in fact true. That is to say, one considers a situation in which the kind or property differs from its counterpart in the first situation, but in ways that the individual cannot discriminate: the individual cannot distinguish the kind or property in the first situation from its counterpart in the second. Thus, in this step of the argument, the thoughts that the subject would express with the same words used in the first step are in fact true.
In the third step “one judges that in the second environment, the individual could not have thoughts about arthritis … [or] sofas, as such” (2013a, 549). The reason, of course, is that the relevant entities in the second step are not the same as those in the first step. There are additional qualifications that must be made, such as the presupposition that, while there is no arthritis, water, or sofas in the second step, no alternative way of acquiring the concepts of arthritis, water, or sofa is available or utilized. Burge continues:
The conclusion is that what thoughts an individual can have—indeed, the nature of the individual’s thoughts—depends partly on relations that the individual bears to the relevant environments. For we can imagine the individual’s make-up invariant between the actual and counterfactual situations in all other ways pertinent to his psychology. What explains the possibility of thinking the thoughts in the first environment and the impossibility of thinking them in the second is a network of relations that the individual bears to his physical or social surroundings. (2013a, 549)
In other words, the person is able to use the concepts of arthritis, water, and sofa in the first step of the argument for the same reasons that all of us can think with these concepts. Even if the person remained unchanged in all individualistic respects, however, changes in the environment could preclude him from thinking with these concepts. If this is correct, then it cannot be the case that the thoughts that one can think are fully determined by individualistic factors. That is to say, two possible situations in which a person is indistinguishable with respect to individualist factors can differ as regards the thoughts that she thinks.
What this schematic formulation of the first three thought experiments emphasizes is arguably the very feature that explains why anti-individualism has come to be so widely accepted. As Burge had earlier put the point, the schematic representation of the arguments “exploits the lack of omniscience that is the inevitable consequence of objective reference to an empirical subject matter” (2007, 22-23). Thus, opposition to anti-individualism, or at least to the three arguments in question, must in some way deny our lack of omniscience about the natures of our thoughts, or about the conditions necessary for our thinking them. Such a denial appears to be unreasonable and without a solid foundation.
4. De Re Representation
To a first approximation, de dicto representation is representation that is entirely conceptualized and does not in any way rely upon non-inferential or demonstrative-like relations for its nature. By contrast, de re representation is both partly nonconceptual and reliant upon demonstrative-like relations (at least in empirical cases) for the determination of its nature. For example, the representational content in “that red sphere” is de re; it depends for its nature on a demonstrative-like relation holding between the representer and the putative subject matter. By contrast, “the shortest spy in all the world in 2019” is de dicto. It is completely conceptualized and is not in any way dependent for its nature on demonstrative-like relations. When Burge first began publishing on the topic, it was very common to hold that de re belief attributions (for example) could be reduced to de dicto ascriptions of belief.
Burge’s early work on de re representation sought to achieve three primary goals (1977). First, he provided a pair of characterizations of the fundamental nature of de re representation in language and in thought: a semantical and an epistemic characterization. The semantical account “maintains that an ascription ascribes a de re attitude by ascribing a relation between what is expressed by an open sentence, understood as having a free variable marking a demonstrative-like application, and a re to which the free variable is referentially related” (2007f, 68). The epistemic account, by contrast, maintains that an attitude is de re if it is not completely conceptualized. The second goal of Burge’s early paper on de re belief was to argue that any individual with de dicto beliefs must also have de re beliefs (1977, section II). Finally, Burge argued that the converse does not hold: it is possible to have de re beliefs but not de dicto beliefs. From the second and third claims it follows, contrary to most work on the topic at the time, that de re representation is in important respects more fundamental than de dicto representation.
Burge’s later work on de re representation includes a presentation of and an argument for five theses concerning de re states and attitudes. The first four theses concern specifically perception and perception-based belief. Thesis one is that all representation involves representation-as (2009a, 249-250). This thesis follows from the view that all perceptual content, and the content of all perception-based belief, involves attribution as well as reference. There is no such thing as “neat” perception. All perception is perspectival and involves attribution of properties (which may or may not correctly characterize the objects of perception, even assuming that perceptual reference succeeds). Thesis two is that all perception and all perception-based belief are guided by general attributives (2009a, 252). An attributive is the perceptual analog of a predicate, for example, “red” in the perceptual content “that red sphere”. Perceptual representation must be carried out in such a way that one or more attributives are associated with the perception and guide the ostensible perceptual reference. The third thesis is that successful perceptual reference requires that some perceptual attribution veridically characterize the entity perceived (2009a, 289-290). A main idea of this thesis is that something must make it the case that perceptual reference has succeeded, in a given instance, rather than failed. What must be so is not only that the right sort of causal relation obtains between the perceiver and the perceptual referent, but that some attributive associated with the object of perception veridically applies to it. Like the second thesis, this one is fully compatible with the fact that perceptual reference can succeed even where many attributions, including those most salient, fail. The difference is that the second thesis concerns only purported perceptual reference, while the third concerns successful reference. Successful reference is compatible with the incorrectness of some perceptual attribution, even if an attributive that functions to guide the reference fails to apply to the referent. But the third thesis, to reiterate, does require that some perceptual attributive present in the psychology of the individual correctly applies to the referent.
To summarize: the first thesis says that every representation must have a mode of representation. It is impossible for representation to occur neat. The second thesis holds that even (merely) purported reference requires attribution. And the third thesis states that successful perceptual reference requires that some attributives associated in the psychology of the individual with the reference correctly apply to the referent.
Burge’s final two theses concerning de re representation are more general and abstract. The fourth thesis states that necessary preconditions on perception and perceptual reference provide opportunities for a priori warranted belief and knowledge concerning perception. In Burge’s words: “Some of our perceptually based de re states and attitudes, involving context-based singular representations, can yield apriori warranted beliefs that are not parasitic on purely logical or mathematical truths” (2009a, 298). An example of such knowledge might be the following:
(AC*) If that object [perceptually presented as a trackable, integral body] exists, it is trackable and integral. (compare Burge 2009a, 301)
This thesis arguably follows from the third thesis concerning de re perceptual representation. It follows, to reiterate, because a minimal, necessary condition upon successful perceptual reference is that some attributive associated (by the individual or its perceptual system) with the referent veridically applies to the referent of perception—and the most general, necessarily applicable attributive where perceptual reference is concerned is that the ostensible entity perceived be a trackable, integral body. Finally, the fifth thesis concerning de re representational states and events provides a general characterization of de re representation that does not apply merely to empirical cases:
A mental state or attitude is autonomously (and proleptically) de re with respect to a representational position in its representational content if and only if the representational position contains a representational content that represents (purports to refer) nondescriptively and is backed by an epistemic competence to make non-inferential, immediate, nondiscursive attributions to the re. (2009a, 316)
The use of “autonomously” here is necessary to exclude reliance upon others in perception-based reference. Such reliance can be de re even if the third thesis fails (2009a, 290-291). “Proleptically” is meant to allow for representation that fails to refer. Technically speaking, failed perceptual or perception-based reference is never de re. But it is nevertheless purported de re reference and so is covered by the fifth thesis.
For discussion of non-empirical cases of de re representation, which Burge allowed for even in “Belief De Re”, see Burge (2007f, 69-75) and (2009a, 309-316).
It should be re-emphasized that two of Burge’s primary philosophical interests, throughout his career, have been de re reference and representation (1977; 2007f), on one hand, and the nature of predication, on the other (2007b; 2010a). These topics connect directly with the aforementioned interest in understanding representational and epistemic abilities that seem to be unique to human beings.
5. Mind and Body
Burge’s early work on the mind/body problem centered around sustained criticism of certain ways the problem of mental causation has been used to support versions of materialism (1992; 1993b). Burge’s criticisms of materialism about mind, including the argument against token-identity materialism, date back to “Individualism and the Mental” (1979a, section IV). As noted earlier, Burge’s position on the mind/body problem is a modest form of dualism, principally motivated by the fact that reductions of minds, mental states, and mental events to the body or brain, physical states, and physical events fail to provide empirical or conceptual explanatory illumination. He has also done work on consciousness, and provided a pair of new arguments against what he calls “compositional materialism”.
Beginning in the late 1980s, many philosophers expressed doubts concerning the probity of our ordinary conception of mental causation. Discussion of anti-individualism partly provoked these doubts. Some argued that, absent some reductive materialist understanding of mental causation, we are faced with the prospect of epiphenomenalism—the view that instances of mental properties do not do any genuine causal work but are mere impotent concomitants of instances of physical properties. Burge argues that the grounds for rejecting epiphenomenalism are far stronger than any of the reasons that have been advanced in favor of the epiphenomenalist threat. He points out that, were there a serious worry about how mental properties can be causally efficacious, the properties of the special sciences such as biology and geology would be under as much threat as those in commonsense psychology or psychological science. Such causal psychological explanation “works very well, within familiar limits, in ordinary life; it is used extensively in psychology and the social sciences; and it is needed in understanding physical science, indeed any sort of rational enterprise” (1993b, 362). Such explanatory success itself shows, other things equal, the “respectability” of the ordinary conception of mental causation: “Our best understanding of causation comes from reflecting on good instances of causal explanation and causal attribution in the context of explanatory theories” (2010b, 471).
Burge has also provided arguments against some forms of materialism. One such argument employs considerations made available by anti-individualism to contend that physical events, as ordinarily individuated, cannot in the general case be identical with mental events (1979a, 141f.; 1993b, 349f.). The anti-individualist thought experiments show that mental events can vary across thinkers who are indiscernible in individualistic respects. This variation in mental events across individualistically indiscernible thinkers would not be possible, of course, if mental events were identical with physical events. In other words, if mental events were identical with physical events, then mere variation in environment could not constitutively affect individuals’ mental events. Needless to say, the falsity of token-identity materialism entails the falsity of a claim of type-identity.
Burge has also provided another line of thought on the mind-body problem that supports his “modest dualism” concerning the relation of the mental to the physical. The most plausible of the various versions of materialism, Burge holds, is compositional materialism—the view that psychologies or minds, like tectonic plates and biological organisms, are composed of physical particles. However, like all forms of materialism, compositional materialism makes strong, empirically specific claims. Burge writes:
The burden on compositional materialism is heavy. It must correlate neural causes and their effects with psychological causes and their effects. And it must illuminate psychological causation, of both physical and psychological effects, in ways familiar from the material sciences (2010b, 479).
He holds that there is no support in science for the compositional materialist’s commitment to the view that mental states and events are identical with composites of physical materials.
The two new arguments against compositional materialism run roughly as follows. The first turns on the difficulty of seeing how “material compositional structures could ground causation by propositional psychological states or events” (2010b, 482). Physical causal structures—broadly construed, to include causation in the non-psychological special sciences—do not appear to have a rational structure. The propositional structures that help to type-individuate certain psychological kinds do have a rational structure. Hence, it is prima facie implausible that psychological causation could be reduced to physical-cum-compositional causal structures. The second argument is similar but does not turn on the notion of causation. Burge argues that:
the physical structure of material composites consists in physical bonds among the parts. According to modern natural science, there is no place in the physical structure of material composites for rational, propositional bonds. The structure of propositional psychological states and events constitutively includes propositional, rational structure. So propositional states and events are not material composites. (2010b, 483)
Burge admits the abstractness of the arguments, and allows that subsequent theoretical developments might show how compositional materialism can overcome them. However, he suggests that such developments would have to alter fundamentally how either material states and events or psychological states and events are conceived.
Finally, Burge has written two articles on consciousness. The first of these defends three points. One is that all kinds of consciousness, including access consciousness, presuppose the presence of phenomenal consciousness. Phenomenal consciousness is the “what it is like” aspect of certain mental states and events. The claim of presupposition is that no individual can be conscious, in any way, unless it has mental states some of which are phenomenally conscious. The second point is that the notion of access consciousness, as understood by Ned Block, for example, needs refinement. As Block understands access consciousness, it concerns mental states that are poised for use in rational activity (1997). Burge argues that this dispositional characterization runs afoul of the general principle that consciousness, of whatever sort, is constitutively an occurrent phenomenon. Burge’s refinement of the notion of access consciousness is called “rational-access consciousness”. The third point is that we should make at least conceptual space for the idea of phenomenal qualities that are not conscious throughout their instantiation in an individual.
Burge’s second paper on consciousness: (a) notes mounting evidence that a person could have phenomenal qualities without the qualities being rationally accessible; (b) explores ways in which a state could be rationally-access conscious despite not being phenomenally conscious; (c) distinguishes phenomenal consciousness from other phenomena, such as attention, thought, and perception; and (d) sets out a unified framework for understanding all aspects of phenomenal consciousness, as a type of phenomenal presentation of qualities to subjects (2007e).
6. Justification and Entitlement
Burge draws a crucial distinction between two forms of epistemic warrant. One is justification. A justified belief is one that is warranted by reason or reasons. By contrast, an epistemic entitlement is an epistemic warrant that does not consist in the possession of reasons. Entitlement is usually defined by Burge negatively, because there is no simple way to express what entitlement consists in that abstracts from the nature of the representational competence in question.
The distinction was first articulated in “Content Preservation” (1993a). Burge there explained that:
(t)he distinction between justification and entitlement is this: Although both have positive force in rationally supporting a propositional attitude or cognitive practice, and in constituting an epistemic right to it, entitlements are epistemic rights or warrants that need not be understood by or even accessible to the subject. We are entitled to rely, other things equal, on perception, memory, deductive and inductive reasoning, and on … the word of others. (230)
What entitlement consists in with respect to each of these cases is different. What they do have in common are the negative characteristics listed. Burge continues:
The unsophisticated are entitled to rely on their perceptual beliefs. Philosophers may articulate these entitlements. But being entitled does not require being able to justify reliance on these resources, or even to conceive such a justification. Justifications … involve reasons that people have and have access to. (1993a, 230)
Throughout his career, Burge has provided explanations of our entitlement to rely upon interlocution, certain types of self-knowledge and self-understanding, memory, reasoning, and perception. The last of these is briefly sketched below, before some common misunderstandings of the distinction between justification and entitlement are addressed. The case of perceptual entitlement provides one of the best illustrations of the nature of entitlement in general.
People are entitled to rely upon their perceptual beliefs just in case the beliefs in question: (a) are the product of a natural perceptual competence that is functioning properly; (b) are of types that are reliable, where the requirement of reliability is restricted to a certain type of environment; and (c) have contents that are normally transduced from perceptual states that themselves are reliably veridical (Burge 2003c, sections VI and VIII; 2020, section I). These points are part of a much larger and more complex discussion, of course. The point for now is that each of (a)-(c) is an example of an element of an entitlement. As is the case with all entitlements, individuals who are perceptually entitled to their beliefs do not have to know anything concerning (a)-(c); and indeed need not even be able to understand the explanation of the entitlement, or the concept “entitlement”. A final key point is that while all entitlements, like all epistemic warrants generally for Burge, must be the product of reliable belief-forming competences, no entitlement consists purely in reliability. In the case of perception, the sort of reliability that is necessary for entitlement is reliability in the kind of environment that contributed to making the individual’s perceptual states and beliefs what they are (2003c, section VI; 2020, section I).
Numerous critics of Burge have misunderstood the nature of entitlement, and/or the distinction between justification and entitlement. Rather than exhaustively cataloging these misinterpretations, the remainder of the section is devoted to articulating the four main sources of misunderstanding. Keeping these in mind should help to prevent further interpretive mistakes. In increasing levels of subtlety, the mistakes are the following. The first error is simply to disregard Burge’s insistence that entitlements need not be appealed to by, or even be within the ken of, the entitled individual. The fact that an individual has no knowledge of any warranting conditions, in a given case, is not a reason for doubting that she is entitled to the relevant range of beliefs.
The second error is insisting that entitlement be understood in terms of “epistemic grounds”, or “evidence”. Each of these notions suggests the idea of epistemic materials in some way made use of by the believer. But entitlement is never something that accrues to a belief, or by extension to a believer, because of something that he or she does, or even recognizes. The example of perceptual entitlement, which accrues in virtue of conditions (a)-(c) above, illustrates these points. The individuation conditions of perceptual states or beliefs are in no sense epistemic grounds. The notion of evidence is even less appropriate for describing entitlement. While evidence can be made up of many different sorts of entities, or states of affairs, evidence must be possessed or appreciated by a subject in order for it to provide an epistemic warrant. But in that case, on Burge’s view, the warrant would be a justification rather than an entitlement.
A variant on this second source of misunderstanding is to assume that since justification is an epistemic warrant by reason, and reasons are propositional, all propositional elements of epistemic warrants are justifications (or parts of justifications). Several types of entitlements involve propositionality—examples of which are interlocution, authoritative self-knowledge, and even perception (in the sense that perceptual beliefs to which we are entitled must have a propositional structure appropriately derived from the content of relevant perceptual states). But none is a justification or an element in a justification. Being propositional is necessary, but not sufficient, for an element of an epistemic warrant to be, or to be involved in, a justification (as opposed to an entitlement). Another way to put the point is to explain that being propositional in structure is necessary, but not sufficient, for being a reason.
The third tendency that leads to misunderstandings of Burge’s two notions of epistemic warrant is the assumption that they are mutually exclusive. On this view, a belief warranted by justification (entitlement) cannot also be warranted by entitlement (justification). Not only is this not the case, but in fact all beliefs that are justified are also beliefs to which the relevant believer is entitled. Every belief that a thinker is justified in holding is also a belief that is produced by a relevantly reliable, natural competence. (Though the converse obviously does not hold.) Entitlement is the result of a well-functioning, natural, reliable belief-forming competence. There are two species of justification for Burge. In the first case, one is justified in believing a self-evident content such as “I am thinking”, or “2+2=4”. In effect, these contents are reasons for themselves—believing them is enough, other things equal, for the beliefs to be epistemically warranted and indeed to be knowledge. The second kind of justification involves inference. If a sound inference is made by a subject, the premises support the conclusion, and the believer understands why the inference is truth-preserving (or truth-tending), then the belief is justified. Notice that each of these kinds of justified belief is, for Burge, also the product of a well-functioning, natural, reliable belief-forming competence. The competence in the case of contents that are reasons for themselves is understanding; and the competence in the second case is a complex of understanding the contents, understanding the pattern of reasoning, and actually reasoning from the content of the premises to the content of the conclusion. So all cases of justification are also cases in which the justified believer is entitled to his or her beliefs.
The subtlest mistake often made by commenters concerning Burge’s notions of justification and entitlement is to assume that what Burge says is not true of entitlement is true of his notion of justification. After all, in “Content Preservation”, Burge states that entitlement “need not be understood by or even accessible to the subject” (1993a, 230). And later, in “Perceptual Entitlement”, Burge makes a number of additional negative claims about entitlement. He writes that entitlement “does not require the warranted individual to be capable of understanding the warrant”, and that entitlement is a “warrant that need not be fully conceptually accessible, even on reflection, to the warranted individual” (2003c, 503). Finally, Burge argues that children, for example, are entitled to their perceptual beliefs, rather than being justified, because they lack sophisticated concepts such as epistemic, entails, perceptual state, and so forth (2003c, 521). So we have the following negative specifications concerning entitlement:
(I) It does not require understanding the warrant;
(II) It does not require being able to access the warrant;
and
(III) It does not require the use of sophisticated concepts such as those mentioned above.
The mistake, of course, is to assume that these things that are not required by entitlement are required by justification, as Burge understands justification. This difficulty is a reflection of the fact that Burge, in these passages and others like them, is doing two things at once. He is not only explaining how he thinks of entitlement and justification, but also distinguishing entitlement from extant conceptions of justification. Since his conception of justification differs from most other conceptions, it is a fallacy to infer from (I)-(III), together with relevant context, that the abilities or capacities they mention must be ones that justification does require.
This is not to say that (I)-(III) are wholly irrelevant to Burge’s notion of justification. For his conception is not completely unlike others’ conceptions. For example, one who believes that 2+2=4 based on his or her understanding of the content does understand the warrant—for the warrant is the content itself. So what (I) denies of entitlement is sometimes true of Burge’s notion of justification. Similarly, a relative neophyte who understands at least basic logic, and makes a sound inference in which the premises support the conclusion, is in one perfectly good sense able to access his or her warrant for believing the conclusion; so what (II) denies of entitlement can likewise be true of justification. The notion of access in question, when Burge invokes the notion in characterizations of epistemic warrant, is conscious access. (See section 5 above.)
But the other two claims are more problematic. Burge’s conception of justification is not as demanding as one which holds that the denials of (II) and (III) correctly characterize what is necessary for justification. Thus, while perceptual entitlement is the primary form of epistemic warrant for those with empirical beliefs, it is not impossible for children or adults to have justifications for their perceptual beliefs. It is only that these will almost always be partial. They will usually be able to access and understand the warrant (the entitlement) only partially. In effect, they are justified only to the extent that they have an understanding of the nature of perceptual entitlement. Fully understanding the warrant, the entitlement, would require concepts such as those mentioned in (III). But even children and many adults, as noted, are capable of approximating understanding of the warrant. Burge gives the example of a person averring a perceptual belief and providing in support of the belief the claim that things look that way to him or her. This is a kind of justification. But there is no (full) understanding of the warrant, and likely not even possession of all the concepts employed in a discursive representation of the complete warrant. Finally, Burge’s notion of justification, or epistemic support by reason, is even weaker than these remarks suggest. For he holds that some nonhuman animals probably have reasons for some of their perceptual beliefs (and therefore have justifications for them)—but these animals can in no sense at all access or understand the warrant. As Burge writes, “My notion of having a reason or justification does not require reflection or understanding. That is a further matter” (2003c, 505 fn. 1). This passage brings out how different Burge’s notion of justification is from many others’ conceptions; and it helps to explain why it is an error to assume that what Burge says is not true of entitlement is true of (his notion of) justification.
7. Interlocution
Burge’s early work on interlocution (or testimony) defended two principal theses. One is the “Acceptance Principle”—the view, roughly speaking, that one is prima facie epistemically warranted in relying upon the word of another. The argument for this principle draws upon three a priori theses: (a) speech and the written word are indications of propositional thought; (b) propositional thought is an indication of a rational source; and (c) rational sources can be relied upon to present truth. The other thesis Burge defended was that it is possible to be purely a priori warranted in believing a proposition on the basis of interlocution (1993a). Burge came to regard this second thesis as a large mistake (2013b, section III), and has since held that the required initial perceptual uptake of the words in question—which is utilized in (a)—makes all interlocutionary knowledge and warranted belief at least minimally empirical in epistemic support. It should be noted, however, that Burge’s view on our most basic interlocutionary warrant remains distinctive in that he regards it as fundamentally non-inferential in character. It is an entitlement—whose nature is structured and supported by the Acceptance Principle, and the argument for it—rather than a justification. Furthermore, none of the critics of Burge’s early view on interlocutionary entitlement identified the specific problem that eventually convinced him that the early view had to be given up.
The specific problem in question was that Burge had initially held that since interlocutionary warrant could persist in certain cases, even as perceptual identification of an utterance failed, the warrant could not be based, even partly, on perception. Burge came to believe that this persistence was possible only because of a massive presumption of reliability where perception was concerned. So the fact that interlocutionary warrant could obtain even where perception failed does not show that the warrant is epistemically independent of perception (2013b, section III).
8. Self-Knowledge
Burge’s views on self-knowledge developed over three periods. The first of these consisted largely in a demonstration that anti-individualism is not, contrary to a common view at the time, inconsistent with or in any tension with our possession of some authoritative self-knowledge (1986d; compare 2013, 8). Burge pointed to certain “basic cases” of self-knowledge—such as those involving the content of “I am now entertaining the thought that water is wet”—which are infallible despite consisting partly in concepts that are anti-individualistically individuated. Using the terms that Burge introduced later, this content is a pure cogito case. It is infallible in the sense that thinking the content makes it true. It is also self-verifying in the sense that thinking the content provides an epistemic warrant, and indeed knowledge, that it is the case. There are also impure cogito cases, an example of which is “I am hereby thinking [in the sense of committing myself to the view] that writing requires concentration”. This self-ascription is not infallible. One can think the content, even taking oneself to endorse the first-order content in question, and yet fail actually to commit oneself to it. But impure cogito cases are still self-verifying. The intentional content in such cases “is such that its normal use requires a performative, reflexive, self-verifying thought” (2003e, 417-418). What Burge calls “basic self-knowledge” in his early work on self-knowledge comprises cogito cases, pure and impure. He is explicit, however, that not all authoritative self-knowledge, much less all self-knowledge in general, has these features.
To reiterate, the central point of this early work was simply to demonstrate that there is no incompatibility between our possession of authoritative self-knowledge and anti-individualism. Basic cases of self-knowledge illustrate this. One further way to explain why there is no incompatibility is to note that the conditions that, in accordance with anti-individualism, must be in place for the first-order contents to be thought are necessarily also in place when one self-ascribes such an attitude to oneself (2013, 8).
The second period of Burge’s work on self-knowledge centered around a more complete discussion of the different forms of authoritative self-knowledge, as well as a defense of the thesis that a significant part of our warrant for non-basic cases of such self-knowledge derives from its indispensable role in critical reasoning (1996). Critical reasoning is meta-representational reasoning that conceptualizes attitudes and reasons as such. The role of (non-basic) authoritative self-knowledge in critical reasoning is part of our entitlement to relevant self-ascriptions of attitudes in general. This second period thus extended Burge’s account of authoritative self-knowledge to non-cogito instances of self-knowledge. It also began the project of explaining wherein we are entitled to authoritative self-knowledge in instances where the self-ascriptions are not self-verifying. Since cogito cases provide reasons for themselves, as it were, basic cases of self-knowledge involve justification. By contrast, non-basic cases of authoritative self-knowledge are warranted by entitlement rather than justification. (See section 6.)
The third period of Burge’s work on self-knowledge consisted in a full discussion of the nature and foundations of authoritative self-knowledge (2011a). Burge argues that authoritative self-knowledge, including a certain sort of self-understanding, is necessary for our role in making attributions concerning, and being subject to, norms of critical reasoning and morality. A key to authoritative self-knowledge, as stressed by Burge from the beginning of his work on the topic, is the absence of the possibility of brute error. Brute error is an error that is not in any way due to malfunctioning or misuse of a representational competence. In perception, for example, one can be led into error despite the fact that one’s perceptual system is working fully reliably, if, say, light is manipulated in certain ways. By contrast, while error is possible in most cases of authoritative self-knowledge, it is possible only when there is misuse or malfunction. Since misuse and malfunction undermine the epistemic warrant, it can be said that instances of authoritative self-knowledge for Burge are “warrant factive”—warrant entails, in such cases, true self-ascriptions of mental states.
The full, unified account of self-knowledge in Burge (2011a) explains each element in our entitlement to self-knowledge and self-understanding. The account is extended to cover, not only basic cases of self-knowledge, but also knowledge of standing mental states; of perceptual states; and of phenomenal states such as pain. The unified treatment explains why its indispensable role in critical reasoning is not all there is to our entitlement to (non-basic cases of) self-knowledge and self-understanding. Burge’s explanation of the impossibility of brute error with respect to authoritative self-knowledge makes essential use of the notion of “preservational psychological powers”, such as purely preservative memory and betokening understanding. Betokening understanding is understanding of particular instances of propositional representational content. The unification culminates in an argument that shows how immunity to brute error follows from the nature of certain representational competencies, along with the nature of epistemic entitlement (2011a, 213f). In yet later work, Burge explained in detail the relation between authoritative self-knowledge and critical reasoning (2013, 23-24).
9. Memory and Reasoning
Two of Burge’s most important philosophical contributions are his identification and elucidation of the notion of purely preservative memory, on one hand, and his discussion of critical reasoning, particularly its relation to self-knowledge and the first-person concept, on the other.
Burge’s discussion of memory and persons distinguishes three different forms of memory: experiential memory; substantive content memory; and purely preservative memory (2003b, 407-408). Experiential memory is memory of something one did, or that happened to one, from one’s own perspective. Substantive content memory is closer to our ordinary notion of simply recalling a fact, or something that happened, without having experienced it personally. Purely preservative memory, by contrast, simply holds a remembered (or seemingly remembered) content, along with the content’s warrant and the associated attitude or state, in place for later use. When I remember blowing out the candles at my 14th birthday party, this is normally experiential memory. Remembering that the United States tried at least a dozen times to assassinate Fidel Castro, in most cases, is an example of substantive content memory. When one conducts inference over time, by contrast, memory functions simply to hold earlier steps along with their respective warrants in place for later use in the reasoning. This sort of memory is purely preservative. Burge argues that no successful reasoning over time is possible without purely preservative memory. Purely preservative memory also plays an important role in Burge’s earlier account of the epistemology of interlocution (1993a; 2013b); and in his most developed account of the epistemology of self-knowledge and self-understanding (2011a).
In “Memory and Persons”, Burge discusses the role of memory in psychological representation as well as the issue of personal identity. He argues that memory is “integral to being a person, indeed to having a representational mind” (2003b, 407). He does this by arguing that three common sorts of mental acts, states, and events—those involving intentional agency, perception, and inference—presuppose the retention of de se representational elements in memory. De se states have two functions. First, they mark an origin of representation. In the case of a perceptual state, this might be the point between an animal’s eyes. Second, they are constitutively associated with an animal’s perspectives, needs, and goals. Thus, a dog might represent in perceptual memory not simply the location of a bone, but the location of his or her bone. De se markers are also called by Burge “ego-centric indexes” (2003c; 2019).
Intentional agency requires retention in memory of de se representational elements because intention formation and fulfillment frequently take place over time. If someone else executes the sort of action that one intends for oneself, this would not count as fulfillment of the veridicality condition of one’s intention. Marking one’s own fulfillment (or the lack of it) requires retention in memory of one’s own de se representational elements. Another example is perception. It requires the use of perceptual contents. This use always and constitutively involves possession or acquisition of repeatable perceptual abilities. “Such repeatable abilities include a systematic ability to connect, from moment to moment, successive perceptions to one another and to the standpoint from which they represent” (2003b, 415). The activity necessarily involved in perception, too, involves retention of de se contents in purely preservative memory. Inference, finally, requires this same sort of retention for reasons alluded to above. If a content used earlier in a piece of reasoning is not ego-centrically indexed to the reasoner, then simple reliance on that content cannot epistemically support one’s conclusion. The warrant would have to be re-acquired whenever use was made of a given step in the process of reasoning—making reasoning over time impossible.
It follows from these arguments that attempts to reduce personal identity to memory-involving stretches of consciousness cannot be successful. Locke is commonly read as attempting to carry out such a reduction. Butler pointed out a definitional circularity—memory cannot be used in defining personal identity because genuine memories presuppose such identity. Philosophers such as Derek Parfit and Sydney Shoemaker utilized a notion of “quasi-memory”—a mental state just like memory but which does not presuppose personal identity—in an attempt to explain personal identity in more fundamental terms. Burge’s argumentation shows that this strategy involves an explanatory circularity. Only a creature with a representational mind could have quasi-memories. However, for reasons set out in the previous two paragraphs, having a representational mind requires de se representational elements that themselves presuppose personal identity over time. Hence, quasi-memory presupposes genuine memory, and cannot therefore be used to define or explain it (2003b, sections VI-XI).
As noted in the previous section, critical reasoning is meta-representational reasoning that characterizes propositional attitudes and reasons as such. One of Burge’s most important discussions of critical reasoning explains how fully understanding such reasoning requires use and understanding of the full, first-person singular concept “I” (1998).
Descartes famously inferred his existence from the fact that he was thinking. He believed that this reasoning was immune to serious skeptical challenges. Some philosophers, most notably Lichtenberg, questioned this. They reasoned that while it might be the case that one can know one is thinking, simply by reflecting on the matter, the ontological move from thinking to a thinker seems dubious at worst, and unsupported at best. Burge argues, using only premises that Lichtenberg was himself doubtless committed to—such as that it is a worthwhile philosophical project to understand reason and reasoning—that the first-person singular concept is not dispensable in the way that Lichtenberg and others have thought. Among other things, Burge’s argument provides a vindication of Descartes’s reasoning about the cogito. The argument shows that Descartes’s inference to his existence as a thinker from the cogito is not rationally unsupported, as Lichtenberg and others had suggested.
All reasons that thinkers have are, in Burge’s terminology, “reasons-to”. That is, they are not merely recognitions of (for example) logical entailments among propositions—they enjoin one to change or maintain one’s system of beliefs or actions. This requires not merely recognizing the relevance of a rational review, but also acting upon it. “In other words, fully understanding the concept of reason involves not merely mastering an evaluative system for appraising attitudes … [but also] mastering and conceptualizing the application of reasons in actual reasoning” (1998, 389). Furthermore, reasons must sometimes exert their force immediately. Their implementational relevance, that is to say, is sometimes not subject to further possible rational considerations. Instead, the reasons carry “a rationally immediate incumbency to shape [attitudes] in accordance with the evaluation” of which the reasons are part (1998, 396). Burge argues that full understanding of reasoning in general, and this rational immediacy in particular, requires understanding and employing the full “I”-concept. If correct, this refutes Lichtenberg’s contention that the “I”-concept is only practically necessary; and it supports Descartes’s view that understanding and thought alone are sufficient to establish one’s existence as a thinker. Only by adverting to the “I”-concept can we fully explain the immediate rational relevance that reasons sometimes enjoy in a rational activity.
10. Reflection
Burge has also discussed the epistemology of intellection (that is, reason and understanding) and reflection. He argues that classical rationalists maintained three principles concerning reflection. One is that reflection in an individual is always, at least in principle, sufficient to bring to conscious articulation steps or conclusions of the reflection. Another is that reflection is capable of yielding a priori warranted belief and knowledge of objective subject matters. The final classical principle about reflection is that success in reflection requires skillful reasoning and is frequently difficult—it is not a matter simply of attaining immediate understanding or knowledge from a “flash” of insight (2013a, 535-537).
Burge accepts the second and third principles about reflection but rejects the first. He argues that anti-individualism together with advances in psychology show the first principle to be untenable. Anti-individualism shows that “the representational states one is in are less a matter of cognitive control and internal mastery, even ‘implicit’ cognitive control and mastery, than classical views assumed” (2013a, 538). Advances in psychology cast doubt on the first thesis primarily because it seems that many nonhuman animals, as well as human infants, think thoughts (and thus have concepts) despite lacking the ability to reflect on them; and because it has become increasingly clear that much cognition is modular and therefore inaccessible to conscious reflection, even in normal, mature human beings.
Burge has also carried out extensive work on how reflection can (and sometimes, unaided, cannot) “yield fuller understanding of our own concepts and conceptual abilities” (2007d, 165); on the emergence of logical truth and logical consequence as the key notions in understanding logic and deductive reasoning (which discussion includes an argument that fully understanding reasoning commits one ontologically to an infinite number of mathematical entities) (2003a); and on the nature and different forms of incomplete understanding (2012, section III). Finally, a substantial portion of Burge’s other work makes extensive use of a priori reflection—an excellent example being “Memory and Persons” (see section 9).
11. Perception
Burge’s writing on perception is voluminous. Most historically important is Origins of Objectivity (2010). This book is not most centrally about perception, as some commentators have suggested, but about what its title indicates: the conditions necessary and sufficient for objective psychological reference. A much more complete treatment of perception is to be found in the successor volume to Origins, Perception: First Form of Mind (2021). The first part of the present section deals with Burge’s work on the structure and content of perception. The second part briefly describes his 2020 article on perceptual warrant.
Origins is divided into three parts. Part I provides an introduction, a detailed discussion of terminology, and consideration of the bearing of anti-individualism on the rest of the volume’s contents. Part II is a wide-ranging discussion of conceptions of the resources necessary for empirical reference and representation, covering both the analytic and the continental traditions, and spanning the entire 20th century. Part III develops in some detail Burge’s conception of perceptual representation: including biological and methodological backgrounds; the nature of perception as constitutively associated with perceptual constancies; discussion of some of the most basic perceptual representational categories; and a few “glimpses forward”, one of which is mentioned below.
Part I characterizes a view that Burge calls “Compensatory Individual Representationalism” (CIR). With respect to perception, this is the view that the operation of the perceptual system, even when taken in tandem with ordinary relevant causal relations, is insufficient for objective reference to and representation of the empirical world. The individual perceiver must herself compensate for this insufficiency in some way if objective reference is to be possible. This view is then contrasted with Burge’s own view of the origins of objective reference and representation, which is partly grounded in anti-individualism as well as the sciences of perceptual psychology, developmental psychology, and ethology.
Part II of Origins critically discusses all the major versions of CIR. The discussion is comprehensive, including analyses of several highly influential 20th-century philosophers (and some prominent psychologists) who reflected upon the matter in print. There are two families of CIR. The first family holds that a more primitive level of representation is needed, underlying ordinary empirical representation, without which representation of prosaic entities in the environment is not possible. Bertrand Russell is an example of one who held a first-family version of CIR. Representation of the physical world, on his view, was parasitic upon acquaintance with—representation of—sense data (2010, 119). Second-family forms of CIR did not require a more primitive level of representation. They did require, however, that certain advanced competencies be in place if objective reference and empirical representation are to be possible. Peter Strawson, for example, held that objective representation requires the use of a comprehensive spatial framework, as well as the use of one’s position in this represented allocentric space (2010, 160).
Both families of CIR share a negative and a positive claim. The negative claim is that the normal functioning of a perceptual system, together with regular causal relations, is insufficient for objective empirical representation. The positive claim is that such representation requires that an individual in some way herself represent necessary conditions upon objective representation. Burge argues that all versions of CIR are without serious argumentative or empirical support. This includes even versions of CIR that are compatible with anti-individualism. Burge extracted the detailed discussion of Quine’s version of the syndrome into a separate article (2009b).
The central chapter of Part III of Origins, chapter 9, discusses Burge’s conception of the nature of perceptual representation, including what distinguishes perception from other sensory systems. It argues that perception is paradigmatically attributable to individuals; is sensory; is representational; is a form of objectification; and involves perceptual constancies. All perception must occur in the psychology of an individual with perceptual capacities, and in normal cases some individual perceptions must be attributable to the individual (as opposed to its subsystems). Perception is a special sort of sensory system—a system that functions to represent through the sort of objectification that perceptual constancies consist in. Perception is constitutively a representational competence, for Burge. Objectification involves, inter alia, marking an important divide between mere sensory responses, on one hand, and representational capacities that include such responses, but which cannot be explained solely in terms of them, on the other (2010, 396). Finally, perceptual constancies “are capacities to represent environmental attributes, or environmental particulars, as the same, despite radically different proximal stimulations” (2010, 114).
Burge argues that genuine objective perception begins, for human beings, nearly at birth, and is achieved in dozens or hundreds of other animal species, including some arthropods. The final chapter of the book includes “glimpses beyond”. It points, perhaps most importantly, toward Burge’s work—thus far unpublished—explaining the origins of propositional thought, including what constitutively distinguishes propositional representation from perceptual and other forms of representation. (Burge has published, in addition to the discussion in Origins of Objectivity, some preparatory work in this direction (2010a).)
The remainder of this section briefly discusses Burge’s 2020 work on perceptual warrant. This lengthy article is divided into five substantial sections. The first consists in a largely or wholly a priori discussion of the nature of epistemic warrant, including discussion of the distinction between justification and entitlement; and the nature of representational and epistemic functions and goods. Two of the most important theses defended in the first section are the following: (i) the thesis that, setting aside certain probabilistic cases and beliefs about the future, epistemic warrant certifies beliefs as knowledge—that is, if a perceptual belief (say) is warranted, true, and does not suffer from Gettier-like problems, then the belief counts as knowledge; and (ii) the thesis that epistemic warrant cannot “block” knowledge. That is to say, whatever epistemic warrant is, it cannot be such that it prevents a relevantly warranted belief from becoming knowledge. Burge uses these theses to argue for the inadequacy of various attempts at describing the nature of epistemic warrant.
The second section uses the a priori connections between warrant, knowledge, and reliability to argue against certain (internalist) conceptions of empirical warrant. The central move in the argument against epistemic internalism about empirical warrant is the thesis that warrant and knowledge require reliability in normal circumstances, but that nothing in perceptual states or beliefs taken in themselves ensures such reliability. Burge argues for the reliability requirement on epistemic warrant by an appeal to the “no-blockage” thesis—any unreliable way of forming beliefs would block those beliefs from counting as knowledge. So the argument against epistemic internalism has two central steps. First, the “no-blockage” thesis shows that reliability, at least in certain circumstances, is required for an epistemic warrant. And second, nothing that is purely “internal” to a perceiver ensures that her perceptual state-types are reliably veridical; or, therefore, that her perceptual belief-types are reliably true. Hence, internalism cannot be a correct conception of perceptual warrant.
The third section discusses differences between refuting skeptical theses, on one hand, and providing a non-question-begging response to a skeptical challenge, on the other. (In section VI of “Perceptual Entitlement” (2003c), for example, Burge explains perceptual warrant but does not purport to answer skepticism.) Burge argues that many epistemologists have conflated these two projects, with the result (inter alia) that the nature of epistemic warrant has been obscured. The fourth section argues that a common line of reasoning concerning “bootstrapping” is misconceived. Some have held that if, as on Burge’s view, empirical warrants do not require justifying reasons, then there is the unwelcome consequence that we can infer inductively from the most mundane pieces of empirical knowledge, or warranted empirical beliefs, that our perceptual belief-forming processes are reliable. Burge argues that it is not the nature of epistemic warrant that yields this unacceptable conclusion but instead a misunderstanding concerning the nature of adequate inductive inference. Finally, the fifth section argues at length against the view that conceptions of warrant like Burge’s imply unintuitive results in Bayesian confirmation theory (2020).
12. History of Philosophy
Finally, Burge has done sustained and systematic work on Frege. The work tends to be resolutely historical in focus. All but two of his articles on Frege are collected in Truth, Thought, Reason (2005). The others are Burge (2012) and (2013c). The latter article contains Burge’s fullest discussion of the relation between philosophy and history of philosophy.
The substantial introduction to Burge (2005) is by far the best overview of Burge’s work on Frege. The introduction contains not only a discussion of Frege’s views and how his collected essays relate to them, but also Burge’s most complete explanation of wherein his own views differ from Frege’s. The first essay provides a valuable, quite brief introduction to Frege and his work (2005a). The remaining essays are divided into three broad categories. The first discusses Frege’s views on truth, representational structure, and Frege’s philosophical methodology. The second category deals with Frege’s views on sense and cognitive value. Included in this category is the article that Burge believes is his philosophically most important article on Frege (1990). Finally, the third section of Burge’s collection of essays on Frege treats aspects of Frege’s rationalist epistemology. One of the articles on Frege that do not appear in Burge (2005) critically discusses an interpretation of Frege’s notion of sense advanced by Kripke; it also provides an extended discussion of the nature of incomplete understanding (2012). The other paper discusses respects in which Frege has influenced subsequent philosophers and philosophy (2013c).
Burge has also done historical work on Descartes, Leibniz, and Kant. Much of this work remains unpublished, save three articles. One traces the development and use of the notion of apriority through Leibniz, Kant, and Frege (2000). The other two discuss Descartes’s notion of mental representation, especially including evidence for and against the view that Descartes was an anti-individualist about representational states and events (2003d; 2007c).
13. Psychology
Much of Burge’s work on perception is also a contribution to the philosophy of psychology or even to the science of psychology itself (for example, 1991a; 2010; 2014a; 2014b). He was the first to introduce into philosophical discussion David Marr’s groundbreaking work on perception (Burge, 1986c). Burge himself has also published a couple of shorter pieces in psychology (2007g; 2011b).
In addition to this, Burge published a long article in Psychological Review (2018) that is not focused on perception. This article criticizes in detail the view, common among psychologists and some philosophers, that infants and nonhuman animals attribute mental states to others. The key to Burge’s argument is recognizing and developing a non-mentalistic and non-behavioristic explanatory scheme that centers on explaining action and action targets, but which does not commit itself to the view that the relevant subjects represent psychological subject matters. The availability of this teleological, conative explanatory scheme shows that it does not follow, other things being equal, from the fact that some infants and nonhuman animals represent actions and actors that they attribute mental states to those actors.
14. References and Further Reading
a. Primary Literature
i. Books
(2005). Truth, Thought, Reason: Essays on Gottlob Frege: Philosophical Essays, Volume 1 (Oxford: Oxford University Press).
(2007). Foundations of Mind: Philosophical Essays, Volume 2 (Oxford: Oxford University Press).
(2010). Origins of Objectivity (Oxford: Oxford University Press).
(2013). Cognition Through Understanding: Self-Knowledge, Interlocution, Reasoning, Reflection: Philosophical Essays, Volume 3 (Oxford: Oxford University Press).
(2021). Origins—Perception: First Form of Mind (Oxford: Oxford University Press).
ii. Articles
(1972). ‘Truth and Mass Terms’, The Journal of Philosophy 69, 263-282.
(1973). ‘Reference and Proper Names’, The Journal of Philosophy 70, 425-439.
(1974a). ‘Demonstrative Constructions, Reference, and Truth’, The Journal of Philosophy 71, 205-223.
(1974b). ‘Truth and Singular Terms’, Noûs 8, 309-325.
(1975). ‘On Knowledge and Convention’, The Philosophical Review 84, 249-255.
(1977). ‘Belief De Re’, The Journal of Philosophy 74, 338-362. Reprinted in Foundations of Mind.
(1979a). ‘Individualism and the Mental’, Midwest Studies in Philosophy 4, 73-121. Reprinted in Foundations of Mind.
(1979b). ‘Semantical Paradox’, The Journal of Philosophy 76, 169-198.
(1982). ‘Other Bodies’, in A. Woodfield (ed.) Thought and Object (Oxford: Oxford University Press, 1982). Reprinted in Foundations of Mind.
(1984). ‘Epistemic Paradox’, The Journal of Philosophy 81, 5-29.
(1986a). ‘Intellectual Norms and Foundations of Mind’, The Journal of Philosophy 83, 697-720. Reprinted in Foundations of Mind.
(1986b). ‘Cartesian Error and the Objectivity of Perception’, in P. Pettit and J. McDowell (eds.) Subject, Thought, and Context (Oxford: Oxford University Press). Reprinted in Foundations of Mind.
(1986c). ‘Individualism and Psychology’, The Philosophical Review 95, 3-45. Reprinted in Foundations of Mind.
(1986d). ‘Individualism and Self-Knowledge’, The Journal of Philosophy 85, 649-663. Reprinted in Cognition Through Understanding.
(1990). ‘Frege on Sense and Linguistic Meaning’, in D. Bell and N. Cooper (eds.) The Analytic Tradition (Oxford: Blackwell). Reprinted in Truth, Thought, Reason.
(1991a). ‘Vision and Intentional Content’, in E. LePore and R. Van Gulick (eds.) John Searle and His Critics (Oxford: Blackwell).
(1991b). ‘Frege’, in H. Burkhardt and B. Smith (eds.) Handbook of Ontology and Metaphysics (Munich: Philosophia Verlag). Reprinted in Truth, Thought, Reason.
(1992). ‘Philosophy of Language and Mind: 1950-1990’, The Philosophical Review 101, 3-51. Expanded version of the portion on mind in Foundations of Mind.
(1993a). ‘Content Preservation’, The Philosophical Review 102, 457-488. Reprinted in Cognition Through Understanding.
(1993b). ‘Mind-Body Causation and Explanatory Practice’, in J. Heil and A. Mele (eds.) Mental Causation (Oxford: Oxford University Press, 1993). Reprinted in Foundations of Mind.
(1996). ‘Our Entitlement to Self-Knowledge’, Proceedings of the Aristotelian Society 96, 91-116. Reprinted in Cognition Through Understanding.
(1997a). ‘Interlocution, Perception, and Memory’, Philosophical Studies 86, 21-47. Reprinted in Cognition Through Understanding.
(1997b). ‘Two Kinds of Consciousness’, in N. Block, O. Flanagan, and G. Güzeldere (eds.) The Nature of Consciousness (Cambridge, MA: MIT Press). Reprinted in Foundations of Mind.
(1998). ‘Reason and the First Person’, in C. Wright, B. Smith, and C. Macdonald (eds.) Knowing Our Own Minds (Oxford: Clarendon Press). Reprinted in Cognition Through Understanding.
(1999). ‘Comprehension and Interpretation’, in L. Hahn (ed.) The Philosophy of Donald Davidson (Chicago, IL: Open Court Press). Reprinted in Cognition Through Understanding.
(2000). ‘Frege on Apriority’, in P. Boghossian and C. Peacocke (eds.) New Essays on the A Priori (Oxford: Oxford University Press). Reprinted in Truth, Thought, Reason.
(2003a). ‘Logic and Analyticity’, Grazer Philosophische Studien 66, 199-249.
(2003b). ‘Memory and Persons’, The Philosophical Review 112, 289-337. Reprinted in Cognition Through Understanding.
(2003c). ‘Perceptual Entitlement’, Philosophy and Phenomenological Research 67, 503-548.
(2003d). ‘Descartes, Bare Concepts, and Anti-individualism’, in M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press).
(2003e). ‘Mental Agency in Authoritative Self-Knowledge’, in M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press).
(2005a). ‘Frege’, in Truth, Thought, Reason.
(2007a). ‘Disjunctivism and Perceptual Psychology’, Philosophical Topics 33, 1-78.
(2007b). ‘Predication and Truth’, The Journal of Philosophy 104, 580-608.
(2007c). ‘Descartes on Anti-individualism’, in Foundations of Mind.
(2007d). ‘Postscript: “Individualism and the mental”’, in Foundations of Mind.
(2007e). ‘Reflections on Two Kinds of Consciousness’, in Foundations of Mind.
(2007f). ‘Postscript: “Belief De Re”’, in Foundations of Mind.
(2007g). ‘Psychology Supports Independence of Phenomenal Consciousness: Commentary on Ned Block’, Behavioral and Brain Sciences, 30, 500-501.
(2009a). ‘Five Theses on De Re States and Attitudes’, in J. Almog and P. Leonardi (eds.) The Philosophy of David Kaplan (New York: Oxford University Press).
(2009b). ‘Perceptual Objectivity’, The Philosophical Review 118, 285-324.
(2010a). ‘Steps toward Origins of Propositional Thought’, Disputatio 4, 39-67.
(2010b). ‘Modest Dualism’, in R. Koons and G. Bealer (eds.) The Waning of Materialism (New York: Oxford University Press). Reprinted in Cognition Through Understanding.
(2011a). ‘Self and Self-Understanding’, the Dewey Lectures, presented in 2007, The Journal of Philosophy 108, 287-383. Reprinted in Cognition Through Understanding.
(2011b). ‘Border-Crossings: Perceptual and Post-Perceptual Object Representation’, Behavioral and Brain Sciences 34, 125.
(2012). ‘Living Wages of Sinn’, The Journal of Philosophy 109, 40-84. Reprinted in Cognition Through Understanding.
(2013a). ‘Reflection’, in Cognition Through Understanding.
(2013b). ‘Postscript: Content Preservation’, in Cognition Through Understanding.
(2013c). ‘Frege: Some Forms of Influence’, in M. Beaney (ed.) The Oxford Handbook of the History of Analytic Philosophy. Oxford: Oxford University Press.
(2014a). ‘Adaptation and the Upper Border of Perception: Reply to Block’, Philosophy and Phenomenological Research 89, 573-583.
(2014b). ‘Perceptual Content in Light of Perceptual Consciousness and Biological Constraints: Reply to Rescorla and Peacocke’, Philosophy and Phenomenological Research 88, 485-501.
(2018). ‘Do Infants and Nonhuman Animals Attribute Mental States?’, Psychological Review 125, 409-434.
(2019). ‘Psychological Content and Ego-Centric Indexes’, in A. Pautz and D. Stoljar (eds.) Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness (Cambridge, MA: MIT Press).
(2020). ‘Entitlement: The Basis for Empirical Warrant’, in N. Pedersen and P. Graham (eds.) New Essays on Entitlement (Oxford: Oxford University Press).
b. Secondary Literature
Two volumes of essays have been published on Burge’s work: M. Frápolli and E. Romero (eds.) Meaning, Basic Self-Knowledge, and Mind: Essays on Tyler Burge (Stanford, CA: CSLI Publications, 2003); and M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press, 2003). The second volume is nearly unique, among Festschriften, in that Burge’s responses make up nearly half of the book’s 470 pages. Further pieces include the following:
An article on Burge in The Oxford Companion to Philosophy, Ted Honderich (ed.) Oxford: Oxford University Press, 1995.
An article on Burge, in Danish Philosophical Encyclopedia. Politikens Forlag, 2010.
Interview with Burge. Conducted by James Garvey, The Philosophers’ Magazine, 2013—a relatively wide-ranging yet short discussion of Burge’s views.
Interview with Burge. Conducted by Carlos Muñoz-Suárez, Europe’s Journal of Psychology, 2014—a discussion focused on anti-individualism and perception.
Article on Burge, in the Cambridge Dictionary of Philosophy, Peter Graham, 2015.
Article on Burge, in the Routledge Encyclopedia of Philosophy, Mikkel Gerken and Katherine Dunlop, 2018—provides a quick overview of some of Burge’s philosophical contributions.
Article on Burge, in Oxford Bibliographies in Philosophy, Brad Majors, 2018—contains brief summaries of most of Burge’s work, together with descriptions of a small portion of the secondary literature.
No person ever steps into the same river twice—or so goes the Heraclitean maxim. Obscure as it is, the maxim is often taken to express two ideas. The first is that everything always changes, and nothing remains perfectly similar to how it was just one instant before. The second is that nothing survives this constant flux of change. Where there appears to be a single river, a single person or, more generally, a single thing, there in fact is a series of different instantaneous objects succeeding one another. No person ever steps into the same river twice, for it is not the same river, and not the same person.
Is the Heraclitean maxim correct? Is it true that nothing survives change, and that nothing persists through time? These ancient questions are still at the center of contemporary metaphysics. This article surveys the main contemporary theories of persistence through time, such as three-dimensionalism, four-dimensionalism and the stage view (§ 1), and reviews the main objections proposed against them (§ 2, 3, 4).
Theories of persistence are an integral part of the more general field of the metaphysics of time. Familiarity with other debates in the metaphysics of time, universals, and mereology is here presupposed and can be acquired by studying the articles ‘Time’, ‘Universals’, ‘Properties’, and ‘Material Constitution’ in this encyclopedia.
1. Theories of Persistence
This section presents contemporary theories of persistence from their most basic (§ 1a) to their most advanced forms (§ 1b and § 1c). It then discusses some ways of making sense of temporal parts (§ 1d), the relation between theories of persistence and theories of time (§ 1e), and the topic of the persistence of events (§ 1f).
a. The Basics
While the Heraclitean maxim denies that anything survives change and persists through time, we normally assume that some things do survive change and do persist through time. This bottle of sparkling water, for example, was here 5 minutes ago, and still is, despite its being now half empty. This notepad, for another example, will still exist tonight, even if I will have torn off some of its pages. In other words, we normally assume some things to persist through time. But before wondering whether our assumptions are right or wrong, we should wonder: what is it for something to persist? Here is an influential definition, first introduced by David Lewis (1986, 202):
Persistence
Something persists through time if and only if it exists at various times.
So, the bottle persists through time, if it does at all, because it exists at various times—such as now as well as five minutes ago, and the notepad persists through time because it exists at various times—such as now as well as later tonight.
Lewis’ definition makes use of the notion of existence at a time. The notion is technical, but its intended meaning should be clear enough. The following intuitive gloss might help clarify it. Something exists at, and only at, those times at which it is, in some sense, present, or to be found. So, Socrates existed in 400 B.C.E. but not in 1905, while I exist in 2019, at all instants that make up 2019, but at no time before the date of my birth (on temporal existence: Sider 2001: 58-59).
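Combining Lewis’ definition with this gloss, persistence can be rendered semi-formally, writing E(x, t) for ‘x exists at t’ (the symbolization, though not the content, is ours):

Persistence (semi-formal)
Persists(x) ↔ ∃t ∃t′ (t ≠ t′ ∧ E(x, t) ∧ E(x, t′))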
Persistence through time is sometimes also alternatively called ‘diachronic identity’—literally, ‘identity across time’. The reason for this name is simple enough. If this notepad exists now and will also exist afterwards, then there is a sense in which the notepad which exists now and the notepad that will exist later on are the same and identical. In which sense are they identical? What is the kind of identity here involved?
It is useful to introduce here a fundamental distinction between numerical and qualitative identity. On the one hand, numerical identity is the binary relation that anything bears to itself, and to itself alone (Noonan and Curtis 2018). For example, I, like everything else, am numerically identical to myself and to nothing else. Superman, for another example, is numerically identical to Clark Kent, and Augustus is numerically identical with the first Roman emperor. This relation is called ‘numerical identity’, for it is related in an important way to the number of entities that exist. If Superman is numerically identical to Clark Kent, then they are one entity, and not two. And if Superman is numerically different from Batman, then they are two entities, and not one. On the other hand, qualitative identity is nothing other than perfect similarity (Noonan and Curtis 2018). If two water molecules could have exactly the same mass, electrical charge, spatial configuration, and so on, so as to be perfectly similar, then they would be qualitatively identical. (It is controversial whether two entities can ever be perfectly similar—more on this later. Still, it is not difficult to find cases of perfect similarity. For example, an entity at a time is perfectly similar to itself at the same time.)
Having distinguished qualitative and numerical identity, what is, again, the sense of identity that is involved in diachronic identity? It is numerical identity. For recall: the question was whether, say, a river is a single—thus one—entity existing at different times, or rather a series of—thus many—instantaneous entities existing one after another.
Here is a second outstanding question that concerns persistence. Suppose that the Heraclitean maxim is wrong, and things persist through time. Do all things that persist through time persist in the same way? Or are there different ways of persisting through time? The consensus is that there are in fact several ways of persisting through time. In order to appreciate this fact, it is useful to contrast two kinds of entities that are supposed to persist, in one sense or another, through time: events and material objects. On the one hand, consider events. An event is here taken to be anything that is said to occur, happen, or take place (Cresswell 1986, Hacker 1982). Examples include a football match, a war, the spinning of a sphere, the collision of two electrons, the life of a person. Changes, processes, and prolonged states, if any, are notable examples of events. On the other hand, a material object can be thought of as the subject of those events, such as the football players, the soldiers, the sphere, the electrons and the person who lives. (For more on events see: What is an Event?)
Both material objects and events, or at least some of them, seem to persist through time. We have already discussed some examples involving objects, and it is equally easy to find examples of persisting events—basically, any temporally extended event would do. However, even if both objects and events seem to persist through time, they seem to do that in two different ways. An event persists through time by having different parts at different times. For example, a football match has two halves. These halves are parts of the match. But clearly enough they are not spatial parts of the match: they are not spread across different places, but across different times. That is why such parts are called ‘temporal parts’. The way of persisting of an event, by having different temporal parts at different times, is called ‘perdurance’ (Lewis 1986: 202).
Perdurance
Something perdures if and only if it persists by having different temporal parts at different times.
Throughout this article, ‘part’ means ‘proper part’, unless otherwise specified.
On the other hand, an object seems to persist in a different way. If an object persists through time, what is present of an object at different times is not a part of it, but rather the object itself, in its wholeness or entirety. This way of persisting, whereby something persists by being wholly present at different times, is called ‘endurance’ (Lewis 1986: 202). (‘Wholly present’ here clearly contrasts with the ‘partial’ presence of an event at different times—more on this later.)
Endurance
Something endures if and only if it persists by being wholly present at different times.
That being said, the contemporary debate on persistence focuses on material objects. In which way do they persist, if at all? A first theory, which takes the intuitions presented so far at face value, says that objects do indeed persist by being wholly present at different times, and so endure. (Endurantists include Baker (1997, 2000); Burke (1992, 1994); Chisholm (1976); Doepke (1982); Gallois (1998); Geach (1972a); Haslanger (1989); Hinchliff (1996); Johnston (1987); Lombard (1994); Lowe (1987, 1988, 1995); Mellor (1981, 1998); Merricks (1994, 1995); Oderberg (1993); Rea (1995, 1997, 1998); Simons (1987); Thomson (1983, 1998); van Inwagen (1981, 1990a, 1990b); Wiggins (1968, 1980); Zimmerman (1996).)
Endurantism
Ordinary material objects persist by being wholly present at different times; they are three-dimensional entities.
Endurantism is usually taken to be closer to common sense and favored by our intuitions. However, as we see later, endurantism does not come without problems. Due to those problems, and inspired by the spatiotemporal worldview suggested by modern physics, contemporary philosophers have also taken seriously the idea that objects are four-dimensional entities, spread out in both space and time, which divide into parts just as their spatiotemporal location does, and which thus persist through time by having different temporal parts at different times, just as events do. This view is called perdurantism. (Perdurantists include Armstrong (1980); Balashov (2000); Broad (1923); Carnap (1967); Goodman (1951); Hawley (1999); Heller (1984, 1990); Le Poidevin (1991); Lewis (1986, 1988); McTaggart (1921, 1927); Quine (1953, 1960, 1970, 1981); Russell (1914, 1927); Smart (1963, 1972); Whitehead (1920).)
Perdurantism
Ordinary material objects persist by having different temporal parts at different times; they are four-dimensional entities.
Perdurantism is also known as ‘four-dimensionalism’, for perdurantism has it that objects are extended in four dimensions. (This contrasts with endurantism, according to which objects are extended at most in the three spatial dimensions, and which is hence also called ‘three-dimensionalism’.)
Under perdurantism, what exists of me at each moment of my persistence is, strictly speaking, a temporal part of me. And each of my temporal parts is numerically different from all others.
One might be tempted to think that, as a consequence, perdurantism denies that I persist through time. This would be a mistake. While my instantaneous temporal parts do not persist—they exist at one time only—I am not any of those parts. I, as a whole person, am the temporally extended collection, or mereological sum, of all those parts. Hence, I, as a whole person, exist at different times, and thus persist. Compare this with the spatial case. I occupy an extended region of space by having different spatial parts at different places. But I am not numerically identical to those parts. I, as a whole, exist at different places in the sense that in those different places there is a part of me. That is why perdurance implies persistence through time.
We started this article with the question of whether objects persist through time. We have so far presented two theories, and both of them affirm that objects do persist through time. It is now time to introduce a third theory of persistence, the one that consists in the denial of this claim, and that has it that, in place of seemingly persisting objects, there really is a series of instantaneous stages. This theory is called the ‘stage view’, or also ‘exdurantism’. (Stage viewers include Hawley (2001), Sider (1996, 2001), Varzi (2003).)
Stage view
Ordinary material objects do not persist through time; in place of a single persisting object there really is a series of instantaneous stages, each numerically different from the others.
The stage view is often confused with perdurantism. The reason is that many contemporary stage viewers believe in a mereological doctrine called ‘universalism’, or also ‘unrestricted fusion’. According to mereological universalism, given a series of entities, no matter how scattered and unrelated, there is an object composed of those entities (see Compositional Universalism). If we combine the stage view with universalism, we get to an ontology in which the stages compose four-dimensional objects which are just like the four-dimensional objects of the perdurantist.
However, the two views are clearly distinct. Here are a few crucial differences. (i) There is, first, a semantic difference: under perdurantism, singular terms referring to ordinary objects, such as “Socrates”, usually refer to persisting, four-dimensional objects, whereas under the stage view, such terms refer to one instantaneous stage (which particular stage is referred to is determined by the context). So, while under the stage view there might be four-dimensional objects, so-called ordinary objects (such as Socrates) are not identified with them, but rather with the stages (Sider 2001, Varzi 2003). (It should be pointed out that significant work is here done by the somewhat elusive notion of ‘ordinary object’; see Brewer and Cumpa 2019.) (ii) A second crucial difference has to do with the metaphysical commitment to four-dimensional entities. While perdurantism is by definition committed to four-dimensional entities, the stage view is by definition only committed to the existence of instantaneous stages. If the stage viewer eventually believes in four-dimensional collections of those stages—and she might well not—such a commitment is not an essential part of her theory of persistence. (iii) A third interesting difference has to do with the metaphysical commitment to the instantaneous stages. While this commitment is built into the stage view, it is not built into four-dimensionalism (Varzi 2003). A four-dimensionalist might believe her temporal parts to be always temporally extended and deny the existence of instantaneous temporal parts (for example, because she believes that time is gunky). Incidentally, it is worth noting that, from a historical point of view, the guiding intuition of the stage view—namely, that objects do not persist through time or change—emerged much earlier than the guiding intuition of four-dimensionalism. While the former can be traced back to, if not Heraclitus, at least the Academic skeptics (Sedley 1982), the latter, as far as we know, emerged no earlier than the end of the nineteenth century (Sider 2001).
b. Locative Theories of Persistence
Here are, again, the definitions of endurantism and perdurantism that we introduced above:
Endurantism
Ordinary material objects persist by being wholly present at different times; they are three-dimensional entities.
Perdurantism
Ordinary material objects persist by having different temporal parts at different times; they are four-dimensional entities.
One can appreciate the fact that these definitions seem to mix together two aspects of persisting objects (Gilmore 2008). First, there is the mereological aspect. There, the question is whether persisting objects have temporal parts or not. Second, there is an aspect that concerns the shape and size of persisting objects. There, the question is whether persisting objects have a four-dimensional shape, and are temporally extended, or have a three-dimensional shape, and are not extended in time. How can we make sense of these two aspects? What is it for something to be three- or four-dimensional? And how can we make sense of what a temporal part really is? While the latter question is tackled in § 1d, we shall now focus on the former question concerning shape and extension.
So, what is it for something to be three- or four-dimensional? An illuminating approach to this question—an approach that everyone who wants to work on persistence must be familiar with—comes from location theory (Casati and Varzi 1999, Parsons 2007). We shall thus focus on location first, and then come back to persistence.
Location is here taken to be a binary relation between an entity and a region of a dimension—be it space, time, or spacetime—where the entity is in some sense to be found (Casati and Varzi 1999). Location is ambiguous. There is a weak sense, in which you are located at any region that is not completely free of you. In that sense, for example, reaching an arm inside a room would be enough to be weakly located in that room. But there is also a more exact sense, in which you are located at that region of space that has your shape and size, and that stands to everything else in the same distance relations as you do—roughly, the region that is determined by your boundaries (Gilmore 2006, Parsons 2007). We shall here follow standard practice and call these modes of location ‘weak location’ and ‘exact location’, respectively.
The intuitive gloss related to exact location suggests that it is interestingly linked to shape, and thus offers us a way of making more precise sense of what it is for something to be three- or four-dimensional. To be four-dimensional simply is to be exactly located at a four-dimensional spacetime region, while to be three-dimensional is to be exactly located at spacetime regions that are at most three-dimensional. The same gloss helps us make sense of what it is for something to be extended or unextended in time. To be extended in time is for something to be exactly located at a temporally extended spacetime region, while for something to be temporally unextended is for it to be exactly located at temporally unextended spacetime regions only (Gilmore 2006).
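These glosses can be put semi-formally, writing L(x, r) for ‘x is exactly located at region r’ (the symbolization is ours):

Four-dimensionality (semi-formal)
x is four-dimensional ↔ ∃r (L(x, r) ∧ r is a four-dimensional spacetime region)

Three-dimensionality (semi-formal)
x is three-dimensional ↔ ∀r (L(x, r) → r is at most three-dimensional)

Analogous biconditionals, with ‘temporally extended’ and ‘temporally unextended’ in place of the dimensionality clauses, capture temporal extension and unextension.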
At this point, it might be useful to sum up the two aspects mixed together in the definitions of endurantism and perdurantism offered above. We should distinguish: (i) the mereological question of whether persisting objects have temporal parts, and (ii) the locative question of whether objects are exactly located at temporally extended, four-dimensional spacetime regions or rather at temporally unextended, three-dimensional regions only.
Mereological endurantism
Ordinary persisting objects do not have temporal parts.
Mereological perdurantism
Ordinary persisting objects have temporal parts.
Locative three-dimensionalism
Ordinary persisting objects are exactly located at temporally unextended regions only.
Locative four-dimensionalism
Ordinary persisting objects are exactly located at the temporally extended region of their persistence only.
Let us explore locative three-dimensionalism further. In particular, we explore here two consequences of the view. First, locative three-dimensionalism has it that objects persist, thus covering a temporally extended region. But they persist by being exactly located at temporally unextended regions. This requires the persisting object to be located at more than one unextended region; more precisely, at all those unextended regions that collectively make up the spacetime region covered during their persistence. Hence, locative three-dimensionalism implies multi-location, that is, the fact that a single entity has more than one exact location (Gilmore 2007). This contrasts with the unique, four-dimensional, spatiotemporal location of an object under locative four-dimensionalism. (Two remarks are in order. First, there is logical space for other locative views as well, but we shall not consider them here. Second, these definitions make use of the notion of persistence, which can now be defined in locative terms as well. Here is a simple way of doing this. Let us define the path of an entity as the mereological sum of its exact locations (Gilmore 2006). An entity persists if its path is temporally extended.)
A second interesting consequence of the view is that, under plausible assumptions, persisting objects will not have temporal parts, for what exists of an entity at a time is the entity itself, exactly located at that time, and not a temporal part thereof. So, under plausible assumptions, locative three-dimensionalism implies mereological endurantism: if something is three-dimensional it does not have temporal parts.
Interestingly, however, being multi-located at instants is not the only way to persist without temporal parts. In principle, something might be exactly located at a four-dimensional, temporally extended spacetime region without dividing into temporal parts. This is the case if the persisting, four-dimensional object is also an extended simple, that is, an entity that is exactly located at an extended region but is mereologically simple, in that it lacks any parts (for more on the definition, possibility, and actuality of extended simples, see Hudson 2006, Markosian 1998, McDaniel 2003, 2007a, 2007b, Simons 2004). Lacking any parts at all, the persisting object will also lack any temporal parts, thus being mereologically enduring. We shall call this combination of mereological endurantism and locative four-dimensionalism ‘simplism’ (Costa 2017, Parsons 2000, 2007).
Simplism
Ordinary persisting objects are mereologically simple and exactly located at the temporally extended region of their persistence only.
To sum up, making use of some conceptual tools borrowed from location theory allowed us to make better sense of perdurantism and its claim that persisting objects are four-dimensional, temporally extended entities. Moreover, it allowed us to distinguish two forms of endurantism, namely locative three-dimensionalism according to which persisting objects are exactly located at instantaneous, three-dimensional regions of spacetime, and thus lack temporal parts, and simplism, according to which persisting objects are four-dimensional, temporally extended, mereological simples, and thus lack temporal parts.
c. Non-Locative Theories of Persistence
The previous section described two radically different ways of capturing endurantism. Interestingly enough, both of them seem to be committed to controversial claims, such as the actuality of multi-location or of extended simples. Of course, any objection against the actuality of multi-location and of extended simples counts de facto also as an objection against either form of endurantism. We cover some of these objections below. For the time being, suffice it to say that both forms of endurantism are controversial.
Some scholars have taken this result as evidence that endurantism is hopeless (Hofweber and Velleman 2011). But others have taken it as a reason to look for other ways of making sense of endurantism (Fine 2006, Hawthorne 2008, Hofweber and Velleman 2011, Costa 2017, Simons 2000a). So far, we have worked under the standard assumption that it is useful and correct to try to make sense of endurantism in locative terms, that is, under the assumption that the relation between objects and times is the one described in location theory. Some scholars take this assumption to be fundamentally misguided.
Why do they believe this assumption to be fundamentally misguided? One reason might come from intuitions embedded in natural language. Fine (2006), for instance, provides linguistic data in support of the idea that objects and events are in time in fundamentally different ways, which he calls ‘existence’ and ‘extension/location’, respectively (he also offers linguistic data in support of the idea that objects and events are in space in the same way in which events are in time). Moreover, he suggests that these two radically different forms of presence might come with different mereological requirements: if something is extended/located at a region, it divides into parts throughout that region, while if something merely exists at an extended region, it need not so divide. Since objects are taken to exist at times instead of being extended/located at times, they will not divide into temporal parts.
Another source of evidence from natural language comes from the attribution of temporal relations (van Fraassen 1970). The intuitive gloss for exact location required any temporally located entity to enter into temporal relations. However, it is awkward to attribute temporal relations to objects (consider “Alexander is 15 years after Socrates”), and we would naturally lean towards reinterpreting such attributions as attributions of temporal relations to events (“Alexander’s birth is 15 years after Socrates’ death”). These linguistic data might suggest two intuitions. The first is that the relation between objects and times should not be the location of location theory. The second is that the way in which objects are in time is derivative with respect to the events in which they participate: for an object to exist at a time is for it to be the subject of an event located at that time. Under such a view, the possibility of endurantism coincides with the possibility for a single object to participate in numerically different events (Costa 2017, Simons 2000a).
A different non-locative approach consists in trying to make sense of the endurantism/perdurantism distinction in terms of what is intrinsic to a time (Hawthorne 2006, Hofweber and Velleman 2011). According to this approach, something is wholly present at a time if it is intrinsic to how things are that that very object exists at it (Hawthorne 2006), or if the identity of that object is intrinsic to that time (Hofweber and Velleman 2011). These definitions of ‘wholly present’ are then plugged into the classic definition of endurance: something endures if it is wholly present at each time of its persistence.
Apart from being grounded in natural language and intuition, such views have been motivated by the controversial character of their alternatives. Since both locative forms of endurantism are controversial, these non-locative views should be taken seriously.
d. What is a Temporal Part?
A notion that plays a fundamental role in the definition of perdurantism is the notion of a temporal part. Endurantists have sometimes complained that the notion is substantially unintelligible (van Inwagen 1981, Lowe 1987, Simons 1987). Hence, it is in the interest of perdurantists to try to clarify it (as it is in the interest of those endurantists who believe that events perdure).
What is a temporal part, such as my present temporal part, supposed to be? First of all, it should be clear that a temporal part is not simply a part that is in time. A spatial part of me, such as my left hand, is certainly not outside time, but it is not a temporal part of mine. It is not, because it is not, in a sense, big enough: a temporal part of mine at a given time must be as big as I am at that time. So, one might be tempted to define a temporal part as a part that is as big as the whole is at the time at which the part is supposed to exist. Moreover, the notion of ‘being as big as’ might be spelled out in terms of spatial location. However, this definition would not do if there are perduring entities that are not in space (such as, for example, a Cartesian mind, or a mental state conceived of as a non-spatial event) or if there are parts of objects that are as big as the object is at a time without being temporal parts of it, such as, for example, the shape trope of my body conceived of as something spatially located and as a part of me (Sider 2001). (For tropes and for located properties, see: The Ontological Basis of Properties.)
Sider (2001) offers a standard definition of a temporal part. It reads:
Temporal part
x is a temporal part of y at t if and only if (i) x is a part of y at t; (ii) x exists at, and only at, t; and (iii) x overlaps at t everything that is a part of y at t.
Let us have a look at each clause in turn. The first one simply says that temporal parts must be parts. The second one ensures that the temporal part exists at the relevant time only. The third one ensures that it includes all of y that exists at that time. (The reader might have noticed that Sider is here using the temporary, three-place notion of parthood—x is a part of y at t—instead of the familiar, binary, timeless notion—x is a part of y. Here, by ‘timeless’ we simply mean that the notion is not relativized to a time, not that what exemplifies the notion is in any sense timeless, or outside time. The use of the temporary notion is conceived as a friendly gesture towards the endurantist, who usually relativizes the exemplification of properties to times—more on this in § 2a. However, temporal parts might be defined by means of the binary, timeless notion as well. One just needs to replace in the previous definition every instance of the temporary notion with the binary one, and to replace the third clause with (iii*): x overlaps every part of y that exists at t. A second note concerns the fact that Sider’s definition is supposed to work for instantaneous temporal parts. A crucial question, then, is how, and whether, this definition could be adapted to a metaphysics in which time is gunky (see Kleinschmidt 2017).)
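For readers who prefer symbols, Sider’s definition may be rendered as follows, writing P(x, y, t) for ‘x is a part of y at t’, O(x, y, t) for ‘x overlaps y at t’, and E(x, t) for ‘x exists at t’ (the symbolization is ours):

Temporal part (semi-formal)
TP(x, y, t) ↔ P(x, y, t) ∧ E(x, t) ∧ ∀t′ (E(x, t′) → t′ = t) ∧ ∀z (P(z, y, t) → O(x, z, t))

The first conjunct captures clause (i), the next two capture clause (ii), and the last captures clause (iii).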
e. Theories of Persistence and Theories of Time
One of the central debates of contemporary metaphysics is the debate as to whether only the present exists, or rather past, present and future all equally exist (Sider 2001). The former view is called ‘presentism’, whereas the latter is called ‘eternalism’ (for more on presentism and eternalism, as well as further alternatives, see: Presentism, the Growing-Past, Eternalism, and the Block-Universe). What are the logical relations between endurantism/perdurantism and presentism/eternalism?
While the combinations of endurantism with presentism, and of perdurantism with eternalism, have usually been accepted as possible (for example, Tallant 2018), it was long supposed that endurantism and eternalism were incompatible with each other. The reasons for this supposed incompatibility are difficult to track down. In brief, here are two possible reasons. In part, the supposed incompatibility has to do with the so-called problem of temporary intrinsics. In part, it has to do with the idea that eternalism, when combined with spacetime unitism (roughly, the view that space and time are not two separate entities, but aspects of a single entity, spacetime), yields a view in which persisting objects cover a four-dimensional region of spacetime, and thus are four-dimensional and divide into temporal parts (Quine 1960, Russell 1927). Such reasons are now usually discarded. We focus on temporary intrinsics later, in § 2a. And we have already explained that there are at least two ways in which an object might cover a four-dimensional region of spacetime without having temporal parts: by being four-dimensional yet mereologically simple (simplism), or without even being four-dimensional itself (locative three-dimensionalism). Apart from these locative options, we have also remarked that there are non-locative theories of persistence, and that such theories require the rejection of spacetime unitism. If unitism is successfully rejected, then the problem, if there is one at all, does not seem to present itself in the first place.
Can one be a perdurantist and also a presentist? A few publications have been devoted to this question, though no conclusive answer has been reached (Benovsky 2009, Brogaard 2000, Lombard 1999, Merricks 1995). On the one hand, one might believe that nothing can be composed of temporal parts if all but one of those parts (namely, the past and future ones) do not exist. On the other hand, it has been suggested that one might solve the problem by means of an accurate use of tense operators: while past temporal parts are not presently part of our ontological catalogue, they once were, and maybe their past existence is enough to entitle them to be parts of a perduring whole.
f. The Persistence of Events
Although contemporary metaphysicians focus mainly on the persistence of objects, there are also parallel debates concerning the persistence of other kinds of entities, such as tropes, facts, dimensions, and, in particular, events (Galton 2006, Stout 2016). Events are traditionally taken to perdure, for it is intuitively the case that events—such as a football match—divide into temporal parts, such as its two halves. This claim is also accepted by several endurantists, who believe that while objects endure, events perdure. Such a view traces back at least to medieval scholasticism (Costa 2017a). But, once again, the traditional view does not come without dissenters. Contemporary scholars have defended the idea that events or, more precisely, processes endure (Galton 2006, Galton and Mizoguchi 2009, Stout 2016). One reason to believe that at least some entities that are said to be happening endure comes from the fact that we attribute change to them, and that, allegedly, genuine change requires endurance of its subject (Galton and Mizoguchi 2009: 78-81). For example, the very same process of walking might have different speeds at different times. But for change to occur, the numerically same subject, and not temporal parts thereof, must have incompatible properties at different times (heterogeneity of parts is not enough for change to occur). Hence, changing processes must endure. Defenders of enduring processes usually tend to believe that alongside enduring processes there are also perduring events, and sometimes claim that enduring processes are picked out by descriptions that make use of imperfective verbs (such as the walking that is/was/will be happening), while perduring events are picked out by descriptions that make use of perfective verbs (such as the walking that happened/will happen) (Stout 1997: 19). To learn more about the question of whether change requires the endurance of its subject, see the No Change objection against perdurantism, discussed below in § 3b. To learn more about the alleged distinction between processes and events and the related use of (im)perfective verbs, see Steward 2013 and Stout 1997.
2. Arguments against Endurantism
Endurantism has it that objects persist by being wholly present at each instant of their persistence. Thus conceived, objects persist without having temporal parts. Endurantism is usually recognized as the theory of persistence that is closest to common sense and intuition, and thus has sometimes been described as the default view, that is, the view to be held until or unless it is convincingly shown to be hopelessly problematic. So, is endurantism hopelessly problematic?
a. The Argument from Change, a.k.a. from Temporary Intrinsics
A first serious objection against endurantism, which traces back to ancient philosophy (Sedley 1982), comes from change. In its simplest form, the objection runs as follows. Change seems to require difference: if something has changed, it is different from how it was. But if it is different, it cannot be identical, on pain of contradiction. Now, endurantism requires a changing thing to be identical across change; hence, the objection goes, endurantism is false. In this simple form, the objection has a simple answer, which relies on the distinction between qualitative and numerical identity outlined in § 1a. The kind of difference required by change is qualitative difference (not being perfectly similar), and not numerical difference (being two instead of one). Hence, in a change, you might be the same as before (numerical identity) as well as different from how you were before (qualitative difference) without this being contradictory.
This basic argument from change can evolve into two slightly more sophisticated forms. The first form aims to show that even if this analysis of change as numerical identity plus qualitative difference is offered, change still results in a contradiction. For change requires a single object—Socrates, say—to have incompatible properties, such as being healthy and being sick. But of course, the exemplification of incompatible properties leads to a contradiction. For whoever is sick is not healthy, and hence the numerically same Socrates must be both healthy and not healthy (Sider 2001).
The second slightly more sophisticated form aims to show that change is incompatible with Leibniz’s law, also called the Indiscernibility of Identicals. Leibniz’s law says that numerically identical entities must share all properties. But change thus described is incompatible with Leibniz’s law, for it requires the numerically same entity—such as Socrates at one time and Socrates at another time—not to share all properties: while Socrates at one time is sick, at a later time he is not (Merricks 1994, Sider 2001).
One way to block these two more sophisticated forms consists in rejecting the two guiding principles they rely on. But while this could perhaps more lightheartedly be done with Leibniz’s law, rejecting the Law of Non-contradiction, though not impossible (see Paraconsistent Logic), is certainly not an obviously promising move.
A second way to block these two more sophisticated forms consists in bringing time into the picture. A veritable contradiction, and a veritable violation of Leibniz’s law, would only result from the possession of incompatible properties at the same time. But the incompatible properties involved in a change are had at the two ends of the change, and hence at two different times.
While this move certainly sounds promising, it is not obvious how time really comes into the picture. Here are two outstanding questions. The first one has to do with the Law of Non-contradiction and Leibniz’s law. When we first introduced them, we did not mention time at all. And in contemporary logic and metaphysics, the two laws are expressed in formulas in which time seems to play no role:
Law of Non-Contradiction (LNC)
¬ (p ∧ ¬p)
Leibniz’s law (LL)
x = y → ∀P (Px ↔ Py)
Do such principles require a modification in light of the claim that incompatible properties are had at different times?
The second outstanding question has to do with the claim that a changing object has incompatible properties at different times. This seems to require objects to exemplify properties at times. But how is this temporary, or temporally relative, notion of exemplification to be understood (for example, Socrates is sick at time t), especially as opposed to the timeless notion of exemplification (for example, Socrates is sick) (Lewis 1986)? (Once again, here, by “timeless” we simply mean that the notion is not relativized to a time, and not that what exemplifies the notion is in any sense timeless, or outside time.)
Let us begin with the latter question. What is it for an object to have a property at a time—what is it, say, for Socrates to be sick at time t? To glance first at the other side of the barricade, perdurantism and the stage view seem to have very simple answers to this question. Under the stage view, temporary exemplification is to be analyzed as timeless exemplification by an instantaneous stage: Socrates is sick at t if and only if the instantaneous stage we call Socrates that exists at t is sick (Hawley 2001, Sider 1996, Varzi 2003). Under perdurantism, temporary exemplification is to be analyzed as timeless exemplification by a temporal part: Socrates is sick at t if and only if the temporal part of Socrates that exists at t is sick (Lewis 1986: 203-204, Russell 1914, Sider 2001: 56). So, under perdurantism and the stage view, temporary exemplification is analyzed as timeless exemplification, and therefore there is no need to adapt LNC or LL in any way: the original timeless reading will do.
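Schematically, and with symbolization that is ours, the perdurantist analysis can be displayed as follows, where Sick(x, t) expresses temporary exemplification, Sick(x) timeless exemplification, and TP is the temporal-part predicate rendered in § 1d:

Perdurantist analysis (schematic)
Sick(x, t) ↔ ∃z (TP(z, x, t) ∧ Sick(z))

The stage-view analysis is parallel, except that the right-hand side concerns timeless exemplification by the contextually picked-out stage that exists at t.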
How would an endurantist make sense of temporary exemplification—of, say, Socrates being sick at time t? We shall here consider a few options. First, notice that if presentism is true, the endurantist too might analyze it in terms of timeless exemplification (Merricks 1995). If t were present, then “Socrates is sick at t” would simply reduce to “Socrates is sick”, full stop. If t were past or future, then “Socrates is healthy at t” would reduce to “Socrates is healthy” under the scope of an appropriate tense operator, such as: “it was the case 5 years ago that: Socrates is healthy” (for tense operators, see: The Syntax of Tempo-Modal Logic). Moreover, since we cannot infer from “it was the case 5 years ago that: Socrates is healthy” that “Socrates is healthy”, no contradiction or violation of LL follows. However, this solution requires the endurantist to buy presentism.
Second, an endurantist might interpret “Socrates is sick at t” as involving a binary relation—the relation of “being sick at”—linking Socrates and time t (van Inwagen 1990a, Mellor 1981). This solution does not require us to make any change to the timeless formulations of LNC and LL (it just follows that the relevant instances of LNC and LL will involve relations rather than properties). And, of course, no violation of LNC or LL would follow, insofar as Socrates’ being sick and his being healthy would be two incompatible relations involving different relata (compare: no contradiction follows from the fact that I love Sam and do not love Maria). However, this requires a certain amount of metaphysical revisionism. To put it in Lewis’ words, if we know what health is, we know it is a monadic property and not a relation, and we know it is intrinsic and not extrinsic (Lewis 1986: 204) (for intrinsic properties, see: Intrinsic and Extrinsic Properties).
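Again schematically (the symbolization is ours), the relational reading blocks the contradiction because the two predications involve different pairs of relata:

Monadic reading
Sick(Socrates) ∧ Healthy(Socrates): a contradiction, given that whatever is sick is not healthy

Relational reading
Sick-at(Socrates, t) ∧ Healthy-at(Socrates, t′), with t ≠ t′: no contradiction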
Third, an endurantist might interpret “at t” as an adverbial modifier: when Socrates is sick at t, he exemplifies the property in a certain way, namely t-ly (Johnston 1987, Haslanger 1989, Lowe 1988). If this view of temporary exemplification is accepted, we should also consider more carefully how the original formulations of LNC and LL should be adapted, for the exemplification they involve is temporally unmodified. The task might be more complicated than one might expect (Hawley 2001, 21f). In any case, under certain assumptions, this adverbialist solution makes it the case that change implies no violation of LNC or LL: Socrates is sick and healthy, but in two different ways—t-ly and t’-ly (compare: the fact that I am actually sitting and possibly standing does not imply a contradiction). But, once again, this involves a certain amount of revisionism. For while adverbial modifiers correspond to different ways of exemplifying an attribute, temporal modifiers seem not to correspond to different ways of exemplifying an attribute: for example, standing on Monday and standing on Tuesday seem not to be two different ways of standing.
There are other strategies that the endurantist might use to make sense of temporary exemplification. This is not the place to go through all of them. However, it is worth noting that even if all of them require a bit of revisionism, the endurantist might argue that the kind of revisionism they involve is less pernicious than the revisionism required to reject endurantism itself (Sider 2001, 98).
b. The Argument from Coincidence
A second objection against endurantism comes from cases in which material objects seem to mereologically coincide—that is, share all parts—and locatively coincide—that is, share the same location—without being numerically identical. If there are such cases, the objection goes, endurantists have a hard time making sense of them, while their alleged problematicity simply disappears if perdurantism or the stage view is assumed (Sider 2001).
What is so bad about mereological and locative coincidence? To start with locative coincidence, it just seems wrong that two numerically different material objects could fit exactly into a single region of space: instead of occupying the same place, they would just bump into each other. It might be the case that some particular kinds of microphysical particles, such as bosons, allow for this kind of co-location (Hawthorne and Uzquiano, 2011). It might also be the case that in some other possible world, with a different set of laws of nature, objects would not bump into each other, but rather pass through each other unaffected, and thus allow for co-location (Sider 2001). However, the ordinary middle-sized objects that populate our everyday life simply do not: they cannot share the same exact location.
Let us now turn to mereological coincidence. What is so bad about it? Suppose x and y share all parts at the same time. If they do, they will surely also happen to be spatially co-located. But if that is the case and they are numerically different, what could account for their numerical difference? What makes them different from one another, if they have the same parts and the same location? Moreover, contemporary standard mereology—that is, classical extensional mereology—implies that no two objects can share all parts, a principle called 'extensionality' (Simons 1987; Varzi 2016).
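For reference, one standard way of stating extensionality for proper parthood (written "<"), to the effect that composite objects with the same proper parts are identical, runs as follows (compare Varzi 2016):

\[
\big(\exists z\,(z < x) \lor \exists z\,(z < y)\big) \rightarrow \Big(\forall z\,\big(z < x \leftrightarrow z < y\big) \rightarrow x = y\Big)
\]

The antecedent restricts the principle to things that have proper parts at all, so that mereological atoms are not automatically identified with one another.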
Let us now consider two possible examples of mereological and locative coincidence. The first one is the case of a statue of Socrates and the lump of clay it is made of. As long as the statue exists, the statue and the lump of clay coincide both mereologically and locatively: they are exactly located at the same spatial region and they share all parts. And yet, there are reasons to believe they are numerically different. For instance, they have different properties. They have different temporal properties: the clay, but not the statue, has the property of existing at times before the statue was created. And they seem to have different modal properties as well: only the clay, and not the statue, can continue to exist even if the clay gets substantially reshaped into, say, a statue of Plato. Since the statue and the lump of clay have different properties, we must conclude, in virtue of Leibniz's Law, that they are numerically different.
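The structure of this argument can be made explicit. Writing "E" for the property of existing before the statue's creation, the reasoning is an instance of the contrapositive of Leibniz's Law (the schematic rendering is ours):

\[
E(\mathit{clay}), \quad \lnot E(\mathit{statue}), \quad \forall x\,\forall y\,\big(x = y \rightarrow (E(x) \leftrightarrow E(y))\big) \;\;\vdash\;\; \mathit{clay} \neq \mathit{statue}
\]

The modal version of the argument runs exactly in parallel, with E replaced by the property of possibly surviving substantial reshaping.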
A second case of coincidence without identity involves Tibbles the cat. Like any other cat, Tibbles has a long furry tail. The tail is part of Tibbles, just as the rest of Tibbles—call it Tib—is. Tib is a proper part of Tibbles, and hence they are numerically different. But suppose that Tibbles loses her tail. It seems that both Tibbles and Tib would survive the accident. After all, cats do not die when losing their tails; and nothing actually happened to Tib when Tibbles lost her tail, so there is no reason to believe that Tib stopped existing. However, after the accident, Tibbles and Tib end up sharing the same exact location and end up sharing all parts. Hence, the case of Tibbles and Tib is yet another case of coincidence without identity.
Is it really the case that the statue is not the lump of clay, and Tibbles is not Tib? These claims might be resisted. For example, if identity is temporary—if x might be identical with y at one time and different at another—then one might say that even if before the accident Tibbles and Tib were different, after the accident they are identical (Gallois 1998, Geach 1980, Griffin 1977). However, this move does not come for free. Serious arguments have been offered to the effect that identity is not a temporary relation (Sider 2001: 165ff, Varzi 2003: 395).
A different option consists in saying that the statue is nothing other than the lump of clay insofar as it possesses the property of being arranged statue-of-Socrates-wise (just as Socrates the philosopher is nothing other than Socrates possessing the property of being a philosopher, and certainly not a second person on top of Socrates). In that case, the statue and the lump of clay would not be numerically different (Heller 1990). However, unlike in the case of Socrates becoming a philosopher, it seems that when we create a statue, we have not merely changed something that existed before. Rather, it seems that we have created something that did not exist before.
How do perdurantism and the stage view solve the problem of coincidence? Let us start with perdurantism. According to perdurantism, the statue and the piece of clay are four-dimensional objects composed of temporal parts. During the existence of the statue, they might well mereologically and locatively coincide. But since the lump of clay existed before, and will exist after, the statue, the lump has some temporal parts that the statue does not have. Hence, mereologically speaking, they do not coincide overall (in fact, from the perdurantist, four-dimensional perspective, the 4D statue is a part of the 4D lump of clay). Moreover, from a locative point of view, since the lump exists at times at which the statue does not, their spatiotemporal location is not the same. For sure, their spatial location might sometimes be the same; but this is as it should be: if you consider the exact spatial location of your hand, at that location, you and your hand coincide locatively. The same holds for Tibbles and Tib, for they do not mereologically coincide: Tibbles' tail is a four-dimensional object that only Tibbles, and not Tib, contains as a part (Varzi 2003: 398). The stage viewer, on the other hand, who identifies ordinary objects with stages, will claim that after the creation of the statue, the statue and the piece of clay are numerically identical. She will then benefit from the flexibility of the temporal counterpart relation to make sense of the allegedly different properties of the statue and the clay. The present clay will outlast the statue not because it will persist for a longer time—being an instantaneous stage, it does not persist at all—but because it has clay-counterparts at times later than the times at which it has its last statue-counterpart. The stage viewer will probably adopt a similar answer in the modal case as well. To illustrate, the claim that the clay, and not the statue, can survive reshaping translates into the claim that in a possible world in which the clay is reshaped, the actual clay has a clay-counterpart but not a statue-counterpart (Sider 2001: 194).
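A rough sketch of the stage-theoretic truth conditions, in our own notation rather than a canonical formulation, where s is the present stage that both "the clay" and "the statue" pick out:

\[
\text{``The clay will outlast the statue''} \iff s \text{ has clay-counterparts at times later than any time at which } s \text{ has a statue-counterpart}
\]
\[
\text{``The clay can survive reshaping''} \iff \text{in some world } w,\ s \text{ has a reshaped clay-counterpart, though no statue-counterpart}
\]

The flexibility lies in the fact that a single stage can stand in the clay-counterpart relation to stages to which it does not stand in the statue-counterpart relation.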
What can an endurantist say about cases of coincidence without identity? A first option could be to just bite the bullet: the statue and the piece of clay are indeed numerically different and indeed mereologically and locatively coincident. However, the endurantist will not want to accept without qualification that different objects can thus coincide. Of course, she will agree, in normal circumstances different objects cannot thus coincide. She will then try to tell apart, in a principled way, the special cases that allow for coincidence from the normal cases that do not. One popular attempt to trace this difference in a principled way has to do with the notion of constitution. There is a sense, the idea goes, in which the clay constitutes the statue, and in which after the accident Tib constitutes Tibbles. These selected cases in which constitution is in play warrant the possibility—if not the necessity—of mereological and locative coincidence. This endurantist solution to the problem of coincidence is sometimes called the 'standard account' (Burke 1992, Lowe 1995). Of course, the standard account does not come for free. It requires one to adopt a theory of mereology different from classical extensional mereology, and a theory of location that allows for co-location, and this might seem to be a drawback in itself. Moreover, a proponent of such a view still has to tell a story about what she takes constitution to be. A much-discussed option is to make sense of constitution in terms of mutual parthood: the statue is part of the clay, and the clay is part of the statue (we are here using the technical notion of proper or improper part, which has numerical identity as a limit case; see Mereological Technicalities). Apart from requiring a substantial revision of even the most endurantist-friendly theories of mereology, appealing to mutual parthood is not yet enough to make sense of constitution: mutual parthood is symmetrical, while friends of constitution take constitution to be asymmetrical; the statue is constituted by the clay, but not vice versa (Sider 2001: 155-156). Contemporary neo-Aristotelianism might come to the rescue in answering this question (Fine 1999; Koslicki 2008): constitution might be defined in terms of grounding (for example, one might say that the existence or nature of the clay grounds the existence or nature of the statue) or in hylomorphic terms (the statue is a compound of matter and form, and the clay is its matter).
Further endurantist solutions, to mention a few, include taking identity to be temporary (Gallois 1998, Geach 1980), embracing mereological essentialism (namely, the view that objects have their parts essentially, so that a change of parts puts an end to persistence; this would help with the case of Tibbles, but not with the case of the clay, which does not necessarily change its parts when arranged into a statue; see Burke 1994, Chisholm 1973, 1975, van Cleve 1986, Sider 2001, Wiggins 1979), or embracing mereological nihilism (namely, the view that there are only mereologically atomic—that is, partless—objects, so that most if not all of the entities involved in the cases are not part of one's ontological catalogue; see van Inwagen 1981, 1990a, Rosen and Dorr 2002, Sider 2013).
Apart from trying to respond to the objection, an endurantist could also send the ball back into the opposite camp and argue that the solution proposed by the perdurantist does not apply in all cases. In the original cases, coincidence was only temporary: there were times at which the two objects did not coincide, either because one did not yet exist (the statue) or because one had a part that the other did not have (Tibbles and her tail). But what about cases in which coincidence is permanent? Consider for example the case in which an artist creates both the statue and the lump of clay at the same time and later destroys them at the same time. In such a case, the perdurantist's solution seems to be precluded, for the statue and the piece of clay will share all their temporal parts, and so will end up mereologically and spatiotemporally coinciding (Gibbard 1975, Hawley 2001, Mackie 2008, Noonan 1999). When confronted with such a case, a perdurantist might be forced to accept one of the endurantist's solutions, and thus will no longer be in a position to declare her view better off than endurantism. Notice, though, that the perdurantist might actually reply that permanent coincidence does indeed result in numerical identity. After all, if coincidence is permanent, we have lost one of the two reasons to believe that the statue and the piece of clay are numerically different—namely, that they existed at different times. Moreover, as regards the difference in modal properties, the perdurantist might just accept the aforementioned solution: the claim that the clay, and not the statue, can survive reshaping translates into the claim that in a possible world in which the clay is reshaped, the actual clay, numerically identical to the statue, has a clay-counterpart but not a statue-counterpart (Hawley 2001). Finally, notice that the problem of permanent coincidence is no problem at all for the stage viewer, who did not appeal to a difference in temporal parts between the statue and the piece of clay to explain coincidence away (Sider 2001).
c. The Argument from Vagueness
A third objection against endurantism comes from the phenomenon of temporal vagueness. Suppose a table is gradually mereologically decomposed: slowly, from top to bottom, one by one, each of the atoms composing it is taken away until, finally, nothing of the table remains. At the end of the process, the table does not exist anymore. So, it must have ceased to exist at some time. But which time? Even if we might have a rough idea of when it happened, it is much more difficult to tell the precise moment in which the table ceased to exist. Recall that we are removing from the table one atom after the other. The removal of which atom is responsible for the disappearance of the table? And how far away must the atom be to count as removed? It seems really hard to give a precise answer to these questions. The case of the disappearance of the table seems somehow to be vague or indeterminate.
How should we make sense of these ubiquitous cases of temporal vagueness or indeterminacy? One option could be to say that the kind of indeterminacy here involved is merely epistemic. This amounts to saying that there is a clear-cut instant at which the table stops existing, and that our inability to determine that instant is due to our ignorance of the facts. There is a definite atom which, once removed, is responsible for the disappearance of the table; our puzzlement comes simply from the fact that we do not know which one it is. Though some scholars are happy to defend this epistemic option, others find it odd to insist that there must be a precise atom the removal of which results in the disappearance of the table, and a precise distance of the atom from the rest of the table that makes it count as removed. Why that atom, as opposed to, say, the immediately previous one? What is so special about that atom that makes the table stop existing? And what is so special about the given distance that makes the atom count as removed? After all, if you look at what remains of the table after the removal of that atom, you would probably be unable to tell any significant difference from what was there before the removal.
A second option could be to say that the kind of indeterminacy here involved does not have to do with our epistemic profile, but rather with the world itself. The reason why it is so difficult to identify a sharp cut-off point at which the table stops existing is that there is no fact of the matter about what this point is. While at some earlier and later times the table definitely does or does not exist, there are some times at which it simply is indeterminate whether the table still exists. Philosophers have always had a hard time trying to understand ontic or worldly indeterminacy. For a long time, the standard option has been simply to reject it as impossible (Dummett 1974; Russell 1923; Sider 2001).
However, if the indeterminacy here involved is neither epistemic nor ontic, what is it? Interestingly enough, perdurantism offers a clear way out of this dilemma. The perdurantist will believe that there is a series of four-dimensional entities involved in the case of the disappearing table. A first four-dimensional entity includes temporal parts up to the point at which the first atom is removed, a second four-dimensional entity includes temporal parts up to the point at which the second atom is removed, and so on, until we get to a four-dimensional entity that includes temporal parts up to the point at which only one atom of the table remains. Given this metaphysical picture, the question of the instant at which the table stops existing translates into the question of which of those four-dimensional entities is picked out by the term "table". While a perdurantist might still say that the kind of indeterminacy here involved is epistemic or ontic, she could also say that it has to do neither with our epistemic limitations nor with the world itself. Rather, she could say that the problem arises because the term "table" is vague. Although the term works well enough in everyday circumstances, we simply have not made a decision as to how it should apply in special circumstances such as the one under discussion. That is where our puzzlement comes from. This kind of indeterminacy results from a mismatch between our language and the world and is therefore semantic in nature.
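Schematically, if T_i is the four-dimensional entity whose temporal parts run up to the removal of the i-th atom, the perdurantist's picture involves a nested series (the notation is ours, with "⊂" read as proper parthood rather than set inclusion):

\[
T_1 \subset T_2 \subset \dots \subset T_n
\]

The question "when does the table stop existing?" then becomes the question of which T_i the word "table" denotes, and, on the semantic view of the indeterminacy, our linguistic conventions simply fail to select a unique member of the series.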
The endurantist might accept the alleged oddity that comes with interpreting these cases of indeterminacy as either epistemic or ontic and try to live with it. While endurantists have traditionally had a preference for the epistemic option, renewed interest in ontic indeterminacy—due for example to attempts to take canonical interpretations of quantum mechanics at face value—might make the second option a live one as well (Barnes and Williams 2011, Calosi and Wilson 2018). It has also been remarked that the endurantist might in principle mimic the perdurantist solution, along the following lines. The endurantist might posit, in place of a single enduring table, a series of coinciding enduring objects, each of which ceases to exist slightly later than the previous one. Such objects will have temporal boundaries that coincide with those of the nested temporal parts of the perdurantist solution, but unlike them they will endure instead of perduring. With this series of enduring objects in place, the question of the instant at which the table stops existing might translate into the question of which of those enduring entities is picked out by the term "table". Thus, for the endurantist too, this kind of indeterminacy will turn out to be semantic (Haslanger 1994). What can be said about this mimicking strategy? At first, one might be baffled by the sheer number of enduring, coinciding, table-like entities that the solution requires. However, an endurantist might respond that the number of entities is no greater than the number required by the perdurantist solution. In any case, while in the case of the perdurantist the positing of this series of entities is part of the view itself, in the case of the endurantist it seems to be a mere strategy to solve the problem of vagueness, and thus it would not be surprising if perdurantists considered it ad hoc.
d. The Unintelligibility Objection
Endurantism has it that persisting objects are wholly present at each time of their persistence. But what is it for something to be wholly present at a time? If no account of this crucial notion is given, endurantism itself remains not properly defined. Moreover, if no account of the notion is possible at all—that is, if we cannot make sense of whole presence—then endurantism itself will turn out to be an unintelligible doctrine. And admittedly, endurantists have no easy time spelling out what whole presence really amounts to (Sider 2001).
Hence, again, what is it for x to be wholly present at time t? It might mean that:
(1) at time t, x has all of its parts.
But what does it mean to say that x has all of its parts? Are we talking about all the parts that x has at t? Or rather about all the parts that x had, has, and will ever have? In both cases, the endurantist is in trouble. In the former case, (1) becomes
(2) at time t, x has all the parts that it has at t.
However, this hardly distinguishes the endurantist position. The perdurantist too will believe that at any given time, a four-dimensional entity has all the parts it has at that time. Given that the endurantist intended her view to be different from the perdurantist one, this cannot be what she had in mind when saying that persisting objects are wholly present at different times. In the latter case, (1) becomes:
(3) at time t, x has all the parts that it had, has, or will ever have.
However, recall that according to endurantism persisting objects are supposed to be wholly present at each time of their persistence. If whole presence is defined as in (3), this will imply that objects can never gain or lose parts—which again mischaracterizes endurantism, which was supposed to be compatible with mereological change.
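The two readings can be rendered with a time-indexed parthood predicate, writing "y ≤_t x" for "y is part of x at t" (the switch to time-indexed parthood is commented on below):

\[
(2')\qquad \mathrm{WP}(x, t) \iff \forall y\,\big(y \leq_t x \rightarrow y \leq_t x\big)
\]
\[
(3')\qquad \mathrm{WP}(x, t) \iff \forall y\,\big(\exists t'\,(y \leq_{t'} x) \rightarrow y \leq_t x\big)
\]

(2′) is a logical truth, satisfied by anything whatsoever, which is why it fails to single out endurantism; (3′) entails that whatever is ever part of x is part of it at every time at which it is wholly present, which is why it rules out mereological change.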
We should point out that in interpreting (1) as (2) or (3) we have switched from an apparently timeless notion of parthood (x is part of y) to a temporary one (x is part of y at time t). The move is a straightforward one for an endurantist to make. Usually, endurantists want their properties or relations—or at least the contingent ones—to be exemplified temporarily. However, at least some endurantists, those who are also presentists, might resist this switch and stick to the timeless notion of parthood. They might simply say that x is wholly present just in case it has all the parts it has, full stop (Merricks 1999). Whether or not this solution works in a presentist setting, it can hardly be applied in a non-presentist one.
Another option might be to argue that to be wholly present simply means to lack any proper temporal parts. This move sounds promising. However, it is not totally uncontroversial, for it has been argued that in special cases an endurantist might want her enduring objects to have proper temporal parts. Suppose, for instance, that an artist creates a bronze statue of Socrates by mixing copper and tin in the mold and then, unsatisfied with the result, destroys the statue by separating tin and copper again, so that the statue and the bronze begin and cease to exist at the same times. Suppose, further, that the bronze and the statue are numerically different from each other (for reasons why they should be, see § 2b). The bronze might be taken to be a part of the statue (a proper part, insofar as it is different from the whole), but it will mereologically coincide with it during its existence. In this somewhat tortuous scenario, even if the bronze and the statue might be conceived as enduring, the bronze will count as a temporal part of the statue at the interval of their persistence. To see why, look back at the definition of temporal parts given before:
Temporal part
x is a temporal part of y at t if (i) x is a part of y at t; (ii) x exists at, and only at, t; (iii) x overlaps at t everything that is part of y at t.
Indeed, (i) the piece of bronze is a part of the statue, (ii) it exists at, and only at, the interval of the statue's persistence, and (iii) it overlaps at that interval everything that is part of the statue.
What lesson should we learn from this particular case? According to Sider (2001), a defender of the unintelligibility charge against endurantism, the conclusion to be drawn is that an endurantist might want her enduring objects to have, at least sometimes, proper temporal parts, and that, consequently, endurantism cannot simply be the doctrine that objects persist without having proper temporal parts. In principle, one might be tempted to draw a different lesson, namely that Sider's definition of temporal parts is unsuccessful and that the notion of a temporal part should be defined in a different way.
In any case, it should be noted that so far we have tried to characterize the notion of whole presence in mereological terms. However, the reader will recall that in § 1b we distinguished two aspects which are mixed together in the canonical definition of endurantism offered above. Once again, we should distinguish (i) the mereological question of whether persisting objects have temporal parts, and (ii) the locative question of whether objects are exactly located at temporally extended, four-dimensional spacetime regions or rather at temporally unextended, three-dimensional regions only. So far, in trying to define whole presence in mereological terms, we have assumed that the notion pertained to the mereological question rather than the locative one. If, on the other hand, whole presence is to be characterized in locative terms, the task does not seem to be too difficult (Gilmore 2008, Parsons 2007, Sattig 2006). For example, under the view that we called locative three-dimensionalism, whole presence simply translates as exact location: a persisting object is wholly present at each instant of its persistence in the sense that it is exactly located at each instantaneous time or spacetime region of its persistence.
e. Arguments against Specific Versions of Endurantism
In § 1b and § 1c, we characterized several different versions of locative and non-locative endurantism. Each of them helped characterize better what the endurantist might have had in mind. However, each of them is subject to specific objections, which we here review summarily.
First, we have defined locative three-dimensionalism, according to which persisting objects are exactly located at temporally unextended regions only. This form of endurantism is committed to the possibility of multi-location, that is, to the possibility of a single entity having more than one exact location. Multi-location has been put to work in several contexts, helping to make sense not only of endurantism, but also of Aristotelian universals and property exemplification, to mention only a few cases. Still, several scholars take multi-location to be problematic, either because it implies contradictions (Ehring 1997a), or because it is at odds with the very notion of an exact location (Parsons 2007), or because it creates specific problems when applied to the case of persistence (Barker and Dowe 2005). Moreover, locative three-dimensionalism is prima facie committed to the existence of instants of time, which there cannot be if time is gunky (see Leonard 2018).
Second, we have defined simplism, according to which persisting objects are mereologically simple and exactly located at the temporally extended region of their persistence. Simplism is committed to the possibility of extended simples, that is, the possibility that something without any proper parts be located at an extended region. Extended simples have enjoyed a fair share of popularity and have been argued to be a possibility that flows from recombinatorial considerations (McDaniel 2007b, Saucedo 2011, Sider 2007), from quantum mechanics (Barnes and Williams 2011), and from string theory (McDaniel 2007a). Still, some scholars look at extended simples with distrust, because they think that dividing into parts is part of the nature of extension (Hofweber and Velleman 2011), because extended simples are excluded by our best theories of location (Casati and Varzi 1999), or because the specific reasons given in favor of the possibility of extended simples are unsuccessful.
Third, we have introduced non-locative versions of endurantism. These versions usually assume that there are two radically different ways of being in a dimension—that objects are in space in a radically different way from the way in which they are in time—and that these two different ways explain why objects divide into spatial but not into temporal parts. Such views are immune to the specific problems of locative three-dimensionalism and of simplism. Still, they have been argued to come with specific drawbacks of their own. In particular, they seem to be at odds with spacetime unitism (see § 1e). Indeed, under spacetime unitism, regions of time and regions of space are simply spatiotemporal regions of some sort. So, it seems that if anything holds a relation to a region of space, it cannot fail to hold the same relation to some region of time as well (Hofweber and Lange 2017).
3. Arguments against Perdurantism
Perdurantism has become a popular option. However, it does not come without drawbacks of its own. This section briefly reviews arguments to the effect that it offends against our intuitions (§ 3a), it makes change impossible (§ 3b), it is committed to mysterious and yet systematic cases of coming into existence ex nihilo (§ 3c), it is ontologically inflationary (§ 3d), it involves a category mistake (§ 3e), it does not make sense (§ 3f), and it has a problem with counting (§ 3g).
a. The Argument from Intuition
Endurantists and their foes alike often agree that endurantism is closer to common sense beliefs, or more intuitive, than perdurantism. Moreover, some philosophers believe that common sense beliefs or intuitions should be taken seriously when doing philosophy. This often translates into the idea that such intuitions or beliefs should be preserved as much as possible, that is, until proven false or at least significantly problematic (Sider 2001). Presumably, this is also why endurantism is sometimes considered the champion view, and why the burden of proof in the persistence debate is taken to lie on the perdurantist side (Rea 1998). Now, has endurantism been proven false or significantly problematic? The previous section reviewed several arguments to this effect and registered that several endurantists remain unconvinced. They would therefore conclude that perdurantism is unmotivated and, since it is the challenger view, should be rejected.
We shall not here tackle the question of whether endurantism has been proven false (see § 2 for this). Rather, we focus on other possible ways in which the perdurantist might respond to this specific challenge.
First of all, though, we should ask: why is endurantism supposed to be more intuitive than perdurantism? What aspects of perdurantism are supposed to be so counter-intuitive? Perdurantism implies that when seeing a tree or talking with a friend, what you have in front of you is not a whole tree or a whole person, but rather only a part of them. It also implies that objects are extended in time just as they are extended in space, rather as events are supposed to be. These mereological and locative consequences of perdurantism are supposed to be counter-intuitive: intuitively, we would say that what we have in front of us in the cases described is a whole tree and a whole person, and that we are not extended in time as we are in space, or as events are supposed to be.
Clearly enough, one option for the perdurantist is simply to reject the idea that in philosophy intuitions or common sense should have the weight the endurantist is here proposing. What an endurantist calls “intuitions” a perdurantist might insist are nothing more than unwarranted biases. However, we do not discuss this option here. Tackling the general question of the role of intuition in philosophy goes beyond the scope of this article (for an introduction to the topic, see Intuition).
A second option consists in pointing out that while perdurantism does indeed have counter-intuitive consequences, endurantism is not immune to counter-intuitiveness either. For example, we have already mentioned that several popular versions of endurantism are committed to claims—such as the claim that things can have more than one exact location or that extended simples are possible (see § 2e)—which might arguably be taken to be counter-intuitive.
A third option consists in pointing out that even if intuition should play a role in philosophy, the kind of evidence that it offers might be biased, for it might be based on our misleading vantage point on reality. In particular, it might be argued that our endurantist intuitions are based on the fact that human beings commonly experience reality one time after another. However, if spacetime unitism and eternalism are true, a more veridical perspective would be one that allowed us to perceive the whole of spacetime in a single bird's-eye view. From such a perspective, our intuitions might be different: we might rather be led to believe persisting objects to be spatiotemporally extended, and to see the instantaneous "sections" of them with which human beings are usually acquainted as parts of them. In that case, our usual condition would be reminiscent of that of the inhabitants of Flatland, who perceive the passage of a three-dimensional sphere through their plane of perception as the sudden expansion and contraction of a two-dimensional circle. Once again, we shall not here tackle the question of whether eternalism and spacetime unitism are true (for an introduction to the topic, see Gilmore, Costa, Calosi 2016).
b. The No-Change Objection
A second objection traditionally marshalled against perdurantism is that it makes change impossible (Geach 1972, Lombard 1986, Mellor 1998, Oderberg 2004, Sider 2001, Simons 1987; 2000a). But change quite obviously occurs everywhere and everywhen. Hence, perdurantism is false.
Why would perdurantism make change impossible? Change requires difference and identity. In order for a change to occur, the argument goes, something must be different, that is, must have incompatible properties, but must also be identical, that is, must be one and the same thing. The identity condition is important, for we would not normally call a situation in which two numerically different things have incompatible properties a change. For example, we would not call a situation in which an apple is red and a chair is blue a change. However, the perdurantist account of change (§ 2a) seems committed to invariably violating the identity condition. Under perdurantism, when a change occurs, it is not the numerically same thing which has the incompatible properties. Rather, the incompatible properties are had by numerically different temporal parts of said thing. For example, if a hot poker becomes cold, it is not the persisting poker itself which is hot and cold. Rather, two numerically different temporal parts of it are hot and cold.
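The two conditions that the objection relies on can be stated schematically as follows (a reconstruction in our notation, not a formula drawn from the cited authors):

\[
\text{Difference:}\qquad a \text{ is } F \text{ at } t \;\land\; a \text{ is not-}F \text{ at } t'
\]
\[
\text{Identity:}\qquad \text{the bearer of } F \text{ at } t = \text{the bearer of not-}F \text{ at } t'
\]

On the perdurantist account, hotness and coldness are had simpliciter by tp(poker, t) and tp(poker, t′), and tp(poker, t) ≠ tp(poker, t′); the identity condition, read strictly, thus fails.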
Is it really the case that perdurantism violates the identity condition? For sure, under perdurantism, the incompatible properties are had by numerically different temporal parts: an earlier part of the poker is hot, a later one is cold. However, can we not say that the persisting thing has them too: that the perduring poker itself is hot and cold? After all, we call a thing red even if not all, but only some, of its parts are red. It is crucial here to stop and wonder what we might mean in saying that the perduring poker itself is hot and cold. One straightforward option would be to say that the poker itself literally is hot and cold, just as its different temporal parts are. However, this is implausible. After all, one of the main motivations for being a perdurantist consists in saying that it is impossible for the numerically same poker to be hot and cold, for this would violate Leibniz's Law or even the Law of Non-contradiction (§ 2a). Hence, when a perdurantist says that the perduring poker itself is hot and cold, she must mean something different. Presumably, she means that the poker is hot insofar as it has hot parts and is cold insofar as it has cold parts. However, if this is what the perdurantist really means, she would presumably be violating the difference condition, for change requires the same subject to have incompatible properties, and having hot parts and having cold parts are not incompatible properties.
A second and popular move consists in rejecting the identity condition. Change does not require one and the same thing to have incompatible properties. At least in some cases, different things would do too (Sider 2001). However, foes of perdurantism would insist that it is not possible to give up the identity condition so lightly. They would insist, for example, that having parts with incompatible properties is insufficient for change. For example, a single poker would not change for the simple fact of having hot parts and cold parts: mereological heterogeneity is not change. Perdurantists might concede that mereological heterogeneity is not always change, but specify that under certain circumstances, it is. In particular, mereological heterogeneity is change in cases where incompatible properties are had by different temporal parts of a single thing.
Some endurantists remain unconvinced by this proposed amendment to the identity condition. They would say, for example, that since temporal parts are numerically different from each other, under perdurantism there is no change, but only replacement. At this point, perdurantists have at least two options. The first one is simply to disagree: change is a particular kind of replacement. The second one consists in giving up on change: if change really requires the original identity condition, then so be it: philosophy has taught us that where we believed there to be change, there really only is replacement (Simons 2000b; Lombard 1994).
c. The Crazy Metaphysic Objection
A third objection against perdurantism is that it is a “crazy metaphysic”, for it involves systematic and yet mysterious cases of coming into existence. The objection refers here to the fact that, under perdurantism, new temporal parts of a single thing come into (and go out of) existence continuously. As Thomson famously puts it:
[perdurantism] seems to me a crazy metaphysic (…). [It] yields that if I have had exactly one bit of chalk in my hand for the last hour, then there is something in my hand which is white, roughly cylindrical in shape, and dusty, something which also has a weight, something which is chalk, which was not in my hand three minutes ago, and indeed, such that no part of it was in my hand three minutes ago. As I hold the bit of chalk in my hand, new stuff, new chalk keeps constantly coming into existence ex nihilo. That strikes me as obviously false (Thomson 1983, 213).
Under perdurantism, these cases of coming into being really are systematic. But what does it mean to say that they are crazy or mysterious? It might mean that they do not make sense (for this option, see the unintelligibility objection in § 3f). But there is another option which is worth exploring. According to this option, the mystery has to do with the absence of an indispensable explanation. If perdurantism is true, the objection goes, there are systematic cases of coming into existence. These cases cry out for an explanation: how is it that these new things come into existence? Where do they come from? However, perdurantism seems to be unable to offer an explanation for these cases. Under perdurantism, the systematic coming into being of ever new temporal parts is a brute fact, of which there is no explanation.
First, we should ask: is perdurantism really unable to offer an explanation for these cases of coming into existence? Thomson seems to be persuaded that it is. If perdurantism is true, these temporal parts do not come from a source which might explain their appearance. In her words, they come into existence ex nihilo. But is this really the case? What does it mean that a new temporal part of a thing comes into existence ex nihilo, from nothing? Does it mean that nothing existed before the temporal part? Certainly not: other temporal parts of the thing existed before the appearance of that particular temporal part. Does it mean that the coming into existence of the temporal part is an event which has no cause? Again, this seems to be implausible. If perdurantists take causation seriously (and if they do not, the objection does not apply in the first place; see Russell 1913), some of them would say that there is a causal connection between temporal parts of a single thing: the later ones are caused to exist by the previous ones (Heller 1990, Oderberg 1993, Sider 2001). Endurantists might disagree here. For example, they might believe that later temporal parts cannot be caused to exist by the previous ones, for (immediate) causation requires simultaneity of cause and effect (see Huemer and Kovitz 2003, Kant 1965).
We have discussed how a perdurantist might try to offer an explanation for the continuous coming into existence of new temporal parts. But does the coming into existence of new temporal parts really require an explanation? In this connection, perdurantists usually follow two lines of reasoning. First, they argue that the succession of new temporal parts as we move through time is analogous to the succession of new spatial parts as we move through space. And since we do not think there is anything mysterious in the latter case, we should have the same attitude in the former case as well (Heller 1990, Varzi 2003). This argument from analogy gains plausibility especially under a unitist view of spacetime. However, one might argue that the analogy fails, for example because causation unfolds diachronically over time and not synchronically through space, so that we have a reason not to expect there to be an explanation in the spatial case while requiring one in the temporal case. The second line of reasoning takes the form of a tu quoque. Thomson believes that the continuous coming into existence of new temporal parts requires an explanation. But is the continuous existence of an enduring object not equally mysterious? How is it that an enduring object continues to exist instead of ceasing to exist? If the enduring object's continuous existence is no mystery, then neither is the continuous coming into existence of the new temporal parts proposed by the perdurantist (Sider 2001, Varzi 2003).
d. The Objection from Ontological Commitment
One criterion that has sometimes been employed in order to evaluate metaphysical doctrines is Ockham's razor, according to which a theory should refrain from making commitments if such commitments are not necessary to its theoretical success. One particular kind of commitment is ontological commitment, that is, the commitment of a theory to the existence of entities or kinds thereof. According to Ockham's razor, this commitment is to be avoided if possible, and any theory which is less ontologically committed is, ceteris paribus, preferable to one which is more ontologically committed (see The Razor).
Now, it might be noted that perdurantism is committed to the existence of a greater number of entities than both endurantism and the stage view. Perdurantism is more ontologically committed than endurantism, for on top of a single persisting thing, it is committed to the existence of a series of numerically different temporal parts thereof. Perdurantism is also more ontologically committed than the stage view. Indeed, unlike perdurantism, the stage view is not necessarily committed to the existence of the perduring mereological sums of the instantaneous stages. If perdurantism is indeed more ontologically committed than endurantism and the stage view, the question is whether this commitment is really necessary. This question is of course discussed in § 2 and § 4. More generally, however, the perdurantist might wish to reject Ockham's razor—for what reasons do we have to believe that the world is not more complex than our simplest theories?—or to ride the wave of those contemporary metaphysicians who simply downplay the importance of ontological commitment and suggest that the fundamental question of metaphysics is not what there is, but rather what is fundamental, or what grounds what (Schaffer 2009). Yet another response on behalf of the perdurantist is based on the distinction between quantitative and qualitative parsimony (Lewis 1973; 1986). A metaphysical system is more quantitatively parsimonious the fewer entities it acknowledges, while it is more qualitatively parsimonious the fewer ontological categories it introduces. Offending against quantitative parsimony is often considered to be less problematic, if problematic at all, than offending against qualitative parsimony. And indeed, one might say, perdurantism offends against quantitative, but not qualitative, parsimony, for each temporal part of a material object is itself a material object.
e. The Category Mistake Argument
Perdurantism has it that persisting objects have temporal parts. This makes objects similar to events, for events too are usually taken to have temporal parts. Because of this similarity, perdurantists have sometimes presented it as a consequence of their view that objects and events are entities of the same kind, the difference between them being, at best, one of degree of stability (Broad 1923, Quine 1970). In the words of Nelson Goodman (1951, 357): "a thing is a monotonous event, an event is an unstable thing".
Are events and objects entities of the same kind? Critics of perdurantism have sometimes argued that they are not, and that conflating objects and events would result in a serious category mistake. Perdurantism, which is committed to this mistake, would therefore need to be rejected. This is the category mistake argument against four-dimensionalism (Hacker 1982, Mellor 1981, Strawson 1959, Wiggins 1980).
What reasons are there to believe that events and objects belong to different ontological categories? For example, it has been pointed out that while objects are said to exist, events are said to happen, or take place (Cresswell 1986, Hacker 1982). This linguistic difference is sometimes said to be a reflection of an ontological one, that is, that objects and events enjoy different modes of being. Moreover, while objects are said to exist at times and to be in places, events are said to be at places and times. Once again, this linguistic difference is supposed to reflect an ontological one, that is, that objects and events relate to space and time in radically different ways (Fine 2006). Furthermore, objects do not usually allow for co-location, at least not to the extent to which events do (Casati and Varzi 1999, Hacker 1982). Finally, it is sometimes said that the spatial boundaries of events are usually vaguer than those of objects (what are the spatial boundaries of a football match?), whereas the temporal boundaries of events are usually less vague than those of objects (Varzi 2014).
A first way to resist this argument is to insist that conflating objects and events is no category mistake. Putative differences between objects and events will then either be considered irrelevant when it comes to metaphysics—for example because they are merely linguistic differences which do not reflect any underlying significant difference in reality—or in any case not enough to imply that objects and events belong to different ontological categories. After all, presumably, not all differences between kinds of entities are supposed to make them entities of a different kind (Sider 2001).
On the other hand, if a perdurantist is persuaded that conflating objects and events would be a category mistake, she could simply reject the claim that perdurantism implies that objects are events or vice versa. Perdurantism is the claim that objects have one feature that is usually—but not universally—attributed to events, namely, having temporal parts. And sharing some features is not a sufficient condition for belonging to the same ontological category. After all, entities of other kinds, such as time intervals or spacetime regions, are usually taken to have temporal parts without being events.
f. The Unintelligibility Objection
Some endurantists believe that perdurantism is not (only) false, but utterly unintelligible. According to this possible objection, perdurantism is a “mirage based on confusion” (Sider 2001, 54), a doctrine which makes “no sense” (Simons 1987, 175) or which is, at best, “scarcely intelligible” (Lowe 1987, 152). In the trenchant words of Peter van Inwagen:
I simply do not understand what [temporal parts of ordinary objects] are supposed to be, and I do not think this is my fault. I think that no one understands what they are supposed to be, though of course plenty of philosophers think they do. (van Inwagen 1981, 133)
In response to this objection, David Lewis (1986) famously stated that if one is unable to understand a view, one should not debate about it. Colorful as it is, Lewis' stance misfires. The point of the objection is not that the objector has not understood perdurantism, but rather that perdurantism itself is unintelligible. Lewis' point would apply if the objector were simply admitting her epistemic limitations. But the objector is not making a point about herself. Rather, she is making a point about the view itself, saying that it does not make sense. (Is it possible for something to be false and also not to make sense? Several scholars have indeed endorsed the view that some claims, such as contradictions or category mistakes, are false and do not make sense. But this view might be attacked.)
What is it, precisely, that is supposed not to make sense in perdurantism? Is it the notion of a temporal part itself? This is hardly the crux of the problem, since many endurantists claim that the notion itself, when applied to events, makes perfect sense (Lowe 1987). The unintelligibility of the view should rather come from some other aspect of it. But if so, from which? One option consists in saying that the unintelligibility comes from the fact that perdurantism is committed to a category mistake, and category mistakes, or at least some of them, are unintelligible (for a discussion, see § 3e). A second option might have to do with mereology. Indeed, Sider (2001), who takes the objection seriously, considers that the problem might lie in the fact that the notion of a temporal part is usually defined in terms of the timeless notion of parthood—x is part of y—whereas endurantists tend to use the temporary notion of parthood—x is part of y at t. Sider suggests that the sense of unintelligibility might come from the fact that perdurantists tend to use a mereological notion that endurantists take to be unintelligible—or to yield unintelligible claims when applied to everyday material objects. If Sider's diagnosis is correct, then his definition of temporal parts in terms of temporary parthood discussed before (§ 1d) seems to take care of it.
g. The Objection from Counting
The objection from counting is traditionally presented as an objection against perdurantism and in favor of the stage view. The semantic difference between the two views is of particular importance here. Recall that the two views disagree about the reference of expressions referring to ordinary objects. Under perdurantism, expressions referring to ordinary objects, such as “Socrates”, refer to persisting, four-dimensional objects, whereas under the stage view, expressions referring to ordinary objects refer to one instantaneous stage (which particular stage is referred to is determined by the context).
Let us consider again the case of the statue and the piece of clay (§ 2b). Under perdurantism, both of them are four-dimensional entities, and their apparent coincidence boils down to their sharing some temporal parts. In particular, at any time in which the statue exists, there is an instantaneous statue-shaped entity that is both a temporal part of the statue and a temporal part of the piece of clay. Now suppose that at that particular time someone asks the question: how many statue-shaped objects are there? Intuitively, we would like to answer that there is only one. And this is the answer given by the stage view. For the stage view takes ordinary expressions such as “statue-shaped object” to refer to instantaneous stages, and there is only one of them that exists at that time. On the other hand, perdurantism counts by four-dimensional entities. And since that particular instantaneous stage is a temporal part of two ordinary objects, the statue and the piece of clay, perdurantism implies that there are in fact two statue-shaped objects there at that time. Hence, perdurantism, unlike the stage view, yields unwelcome results as regards the number of entities involved in such cases. This is the argument from counting against perdurantism (Sider 2001).
A possible answer consists in saying that in that particular context the predicate "statue-shaped object" does indeed refer to two four-dimensional entities, the statue and the piece of clay, but that we count them as one because they are, in a sense, identical at the time of the counting (Lewis 1976). In saying so, we are using an apparently time-relative notion of identity—x is identical to y at t—instead of the usual timeless one—x is identical to y. What does that mean? A four-dimensionalist would define the time-relative notion in terms of the timeless one: x is identical to y at t if the temporal part of x at t is identical to the temporal part of y at t. Stage theorists will probably remain unconvinced by this move, for, they would insist, counting can only be done by identity. In Sider's words: "I doubt that this procedure of associating numbers with objects is really counting. Part of the meaning of 'counting' is that counting is by identity; 'how many objects' means 'how many numerically distinct objects' (…). Moreover, the intuition that [there is just one statue-shaped object at that time] arguably remains even if one stipulates that counting is to be identity" (Sider 2001, 189).
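In schematic form, the four-dimensionalist definition of time-relative identity just given is (again with tp(x, t) as our label for the temporal part of x at t):

\[
x =_t y \;\;=_{df}\;\; \mathrm{tp}(x, t) = \mathrm{tp}(y, t)
\]

Counting "by identity-at-t" then yields one statue-shaped object at the time in question (the statue and the clay share their temporal part there), which is precisely the counting procedure Sider resists in the quoted passage.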
4. Arguments against the Stage View
This section reviews arguments against the stage view, to the effect that it goes against our intuitions (§ 4a), it makes change impossible (§ 4b), it is committed to mysterious and yet systematic cases of coming into existence ex nihilo (§ 4c), it is ontologically inflationary (§ 4d), it is incompatible with temporal gunk (§ 4e), it is incompatible with our mental life (§ 4f), and it has problems with counting (§ 4g).
a. The Argument from Intuition
In § 3a we discussed the argument from intuition against perdurantism. A similar argument has been proposed against the stage view as well. While the details of the present argument are somewhat different from the previous one, its general structure remains the same. The general idea is that closeness to intuitions or common sense constitutes a theoretical advantage that a view might have. And, the objector says, both endurantism and perdurantism are closer to intuitions than the stage view.
Why is the stage view supposed to be especially counter-intuitive? Presumably, the aspect of the stage view which offends the most against our intuitions is the fact that it denies persistence. Indeed, while endurantism and perdurantism agree that some ordinary objects persist, either by enduring or by perduring, the stage view denies that ordinary objects persist. In place of a single persisting object, the stage view posits a series of numerically different instantaneous stages.
In order to tackle this objection, the stage viewer might decide to deploy some of the generic strategies outlined in § 3a. First, the stage viewer might insist that intuitions are no more than biases and thus deny that intuitions place any disadvantage on the stage view. Second, the stage viewer might believe that the disadvantage exists but is outweighed by the fact that the other views are counter-intuitive too (see again § 3a). Third, she might hold that the disadvantage is outweighed by the fact that the other views have been proven false or at least significantly problematic.
Here, however, we focus on a fourth and more specific strategy available to the stage viewer. The strategy consists in arguing that the intuition that is supposed to disfavor the stage view does not really disfavor it. It is indeed true, the stage viewer would say, that we commonly have beliefs such as "I was once a child". The critic of the stage view takes them to imply the persistence of the self, for how could I have been a child without existing in the past? But this, the stage viewer says, is a mistake. In fact, we could make sense of beliefs such as "I was once a child" just as well by means of the counterpart relation: "I was once a child" is true if a past counterpart of mine is a child. In other words, those beliefs do not settle the question of whether things exist at more than one time (Sider 2001).
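A minimal sketch of the proposed truth conditions, assuming a temporal counterpart relation C between stages (the notation is ours):

\[
\text{``I was once a child'' is true of the present stage } s \iff \exists s'\,\big(C(s', s) \land s' \text{ exists earlier than } s \land \mathrm{Child}(s')\big)
\]

Since the right-hand side quantifies over a past stage distinct from s, the truth of the belief does not require that s itself exist at any past time.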
A possible reply is that the strategy might not be applied to all putative cases of commonsensical beliefs involving the past. Consider for example a tenseless statement of cross-time identity such as “I am identical to a young child”, in which I affirm my identity with my previous self. This statement cannot be taken care of in terms of counterparts. The stage viewer’s rejoinder might here be that these beliefs are perhaps too technical to be common sense or that, in any case, what really matters is that the stage viewer is able to make sense of and to validate cognate statements that are framed in terms which are much more mundane, such as “I was once a child” (Sider 2001).
b. The No-Change Objection
In § 3b we discussed the no-change objection against perdurantism. The objection was that change requires the numerical identity of the subject of change before and after the process of change. There, we discussed the option of amending this identity requirement: change does not require that the subject before and the subject after the change be identical; they just need to be temporal parts of a single thing. The stage viewer might adapt this strategy to suit her needs: the subject before and the subject after the change need only be related by the counterpart relation. Some endurantists remain unconvinced by the perdurantist amendment, and we might reasonably expect them to be unconvinced by the stage viewer’s amendment too. Since the relevant stages are numerically different from each other, under the stage view there is no change, but only replacement. The stage viewer’s rejoinder might be either to insist that change is a particular kind of replacement or to give up on change and insist that there is nothing bad in saying that where we believed there to be change, there really is replacement.
c. The Crazy Metaphysic Objection
Section 3c reviewed an argument against perdurantism to the effect that it involves systematic and yet mysterious cases of coming into existence. The stage view is subject to a similar objection. Just as perdurantism requires the systematic coming into existence of new temporal parts, so the stage view requires the systematic coming into existence of new instantaneous stages. And if perdurantism has no plausible explanation for this systematic coming into existence, neither does the stage view.
However, it should also be noted that the stage viewer can apply the same strategies proposed there on behalf of the perdurantist. The stage viewer might insist that there is indeed an explanation for the coming into existence of new stages: their coming into existence is caused by the previous stages (Varzi 2003). Or she might argue that the systematic coming into existence is not mysterious after all, for it is no more mysterious than the succession of spatial parts through space, or than the continuous existence of an enduring object through time (Sider 2001; Varzi 2003).
d. The Objection from Ontological Commitment
Section 3d reviewed an argument from ontological commitment against perdurantism. Its guiding principle was that unnecessary ontological commitments should be avoided and that, therefore, a theory with fewer ontological commitments is, ceteris paribus, preferable to one with more.
This kind of argument seems to disfavor the stage view with respect to endurantism: instead of a single enduring thing, the stage view posits a myriad of numerically different instantaneous stages. It does not, however, disfavor the stage view with respect to perdurantism, for the ontological commitments of the two views are often perfectly aligned. Because of their commitment to mereological universalism, many stage viewers believe in the existence of four-dimensional aggregates on top of instantaneous stages (see § 1a). But the stage view is not committed to four-dimensional aggregates by definition. So, depending on further metaphysical parameters, a stage viewer’s overall metaphysical view may end up being less ontologically committed than perdurantism.
In order to block this kind of argument, the stage viewer might adopt the usual strategies already described on behalf of the perdurantist. In particular, she might argue that the further ontological commitments of the stage view are fully justified by the failures of endurantism (§ 2), or she might argue that a philosopher should not be scared to make all the ontological commitments that she sees fit, for what reasons do we have to believe that the world is not more complex than our simplest theories? Finally, she could ride the wave of contemporary metaphysicians who simply downplay the importance of ontological commitment and suggest that the fundamental question of metaphysics is not what there is, but rather what is fundamental, or what grounds what (Schaffer 2009).
e. The Objection from Temporal Gunk
When introducing the stage view, we pointed out that, unlike perdurantism and endurantism, its very definition commits it to the existence of instantaneous entities. This might be a drawback of the stage view in case time turns out to be gunky, that is, if it turns out that every temporal region can be divided into smaller temporal regions, so that temporal instants do not exist (Arntzenius 2011; Whitehead 1927; Zimmerman 2006).
We do not focus here on the question of whether temporal instants exist at all. Instead, we briefly remark that, as it stands, the argument implicitly assumes that if there are no instants, there cannot be instantaneous entities. This assumption might be taken to follow from a series of principles of location, most notably the principle of exactness, according to which anything that is in some sense in a dimension must also have an exact location in that dimension. Since located entities share shape and size with their exact locations, if exactness is true, instantaneous entities do indeed require the existence of instants in order to exist. However, exactness has come under attack on different grounds, one of which concerns precisely the possibility of pointy entities in gunky dimensions (Gilmore 2018; Parsons 2001). Hence, it seems in principle possible for a stage viewer to uphold her view even if she takes time to be gunky.
f. The Objection from Mental Events
There is an objection often proposed against the stage view which concerns in particular the persistence of subjects of mental states. The stage view implies that ordinary objects, persons included, do not persist through time. However, some mental processes and states seem incapable of being performed or possessed by instantaneous entities. For example, we say that people reflect on ideas, make decisions, ponder means, act, fall in love, and change their minds. All these mental events take time, and thus cannot possibly be possessed by instantaneous stages (Brink 1997; Hawley 2001; Sider 2001; Varzi 2003).
The stage viewer will typically reply that acting, reflecting, pondering, making decisions, and so on do not require a persisting subject. For example, she might insist that acting is something a stage can possess in virtue of its instantaneous state as well as in virtue of its relations to its previous stages, provided that those previous stages possess the appropriate mental properties (Hawley 2001; Sider 2001; Varzi 2003). Alternatively, the stage viewer might insist that there are indeed extended mental events such as acting or pondering, but that such events do not have one single subject; rather, they have a series of subjects which succeed one another. For each of them, to be acting is to be the subject of an instantaneous temporal part of a temporally extended event of acting. In any case, the stage viewer will concede that her view, unlike endurantism and perdurantism, is incompatible with the idea that such mental events are temporally extended and possessed by a single subject.
g. The Objection from Counting
Section 3g discussed an objection against perdurantism to the effect that it delivers the wrong counting results in cases of mereological coincidence. To the question “how many statue-shaped objects are there?”, asked at a time at which the piece of clay and the statue mereologically coincide, the perdurantist has to answer that there are two, whereas the stage viewer can give the intuitive answer that there is only one. However, while considerations about counting in cases of coincidence seem to favor the stage view, similar considerations in far more common cases seem to disfavor the stage view relative to its rivals, and to endurantism in particular. Suppose Sam remains alone in a room for an hour. How many people have been in that room during that hour? Intuitively, we would like to answer that there has been only one, and this is the answer given by endurantism, which takes Sam to be one single persisting object that exists through the hour. The stage view, on the other hand, takes ordinary expressions such as “person” to refer to instantaneous stages, and there is an instantaneous stage of Sam for each instant making up the hour. Hence, the stage view, unlike endurantism, yields unwelcome results as regards the number of entities involved in such cases (Sider 2001). (How does perdurantism fare with this objection? It depends on whether it counts temporal parts of persons as persons. If it does, and it usually does (see § 2b), perdurantism is subject to the same objection.)
Suppose that the stage viewer is concerned with this problem and takes intuitions about counting seriously (as she arguably should, if she endorses the argument from counting in favor of her view presented in § 3g). In that case, the stage viewer has at least three options. The first consists in saying that in that particular context the predicate “person” does indeed apply to several instantaneous stages, but that we count them as one because they are counterparts of each other. This option is subject to a rejoinder already employed in § 3g against the Lewisian solution to the problem of counting: it implies that we sometimes count by counterparthood rather than by identity, which offends against the view that “part of the meaning of ‘counting’ is that counting is by identity” (Sider 2001, 189). A second option is available to the stage viewer who believes in the existence of four-dimensional sums of instantaneous stages. This stage viewer might claim that in the present context the predicate “person” applies to one single four-dimensional object instead of the instantaneous stages. In so doing, the stage viewer adopts an unorthodox view which mixes the stage view and perdurantism, on which the reference of ordinary terms such as “person” is flexible: sometimes they pick out instantaneous stages (as in the stage view), sometimes four-dimensional sums thereof (as in perdurantism). A third and final option consists in taking domains of counting to be restricted to entities existing at the time of utterance, or restricted in some other suitable way (Viebahn 2013).
5. What Is Not Covered in this Article
This section lists several aspects and issues concerning the metaphysics of persistence that are not covered in this article. Each is accompanied by some references to guide readers in their further exploration.
When it comes to the characterization of the views and of the debate, it is worth noting that some philosophers have tried to characterize the endurantism/perdurantism dispute in terms of explanation (Donnelly 2016; Wasserman 2016), while others have argued that the dispute is not substantial but merely verbal (Benovsky 2016; Hirsch 2007; McCall and Lowe 2006; Miller 2005). It is also worth noting that, apart from a few introductory words, not much has been said about the history of the metaphysics of persistence (Carter 2011; Costa 2017; 2019; Cross 1999; Helm 1979).
When it comes to arguments for and against the various metaphysics of persistence, a couple of traditional arguments against perdurantism have not been covered in § 3, namely the modal argument (Heller 1990; Jubien 1993; van Inwagen 1990a; Shoemaker 1988; Sider 2001) and the rotating disk argument (Sider 2001). Moreover, several arguments for and against the theories of persistence presented in this article have been drawn from physics, among which figure several arguments against endurantism, namely the shrinking chair argument (Balashov 2014; Gibson and Pooley 2006; Gilmore 2006; Sattig 2006), the explanatory argument (Balashov 1999; Gibson and Pooley 2006; Gilmore 2008; Miller 2004; Sattig 2006), the location argument (Gibson and Pooley 2006; Gilmore 2006; Rea 1998; Smart 1972), the superluminal objects argument (Balashov 2003; Gilmore 2006; Hudson 2005; Torre 2015), and the invariance argument (Balashov 2014; Calosi 2015; Davidson 2014), as well as an argument from quantum mechanics against perdurantism (Pashby 2013; 2016).
6. References and Further Reading
Armstrong, D. M., 1980, “Identity Through Time”, in Peter van Inwagen (ed.), Time and Cause. Dordrecht: D. Reidel, 67–78.
Arntzenius, F., 2011, “Gunk, Topology, and Measure”, The Western Ontario Series in Philosophy of Science, 75: 327–343.
Arntzenius, F., 2011, “The CPT Theorem”, in The Oxford Handbook of Philosophy of Time, ed. Craig Callender, Oxford: Oxford University Press, 634-646.
Baker, L. R., 1997, “Why Constitution is not Identity”, Journal of Philosophy, 94: 599–621.
Baker, L. R., 2000, Persons and Bodies, Cambridge: Cambridge University Press.
Balashov, Y., 1999, “Relativistic Objects”, Noûs, 33(4): 644-662.
Balashov, Y., 2000, “Enduring and Perduring Objects in Minkowski Space-Time”, Philosophical Studies, 99: 129–166.
Balashov, Y., 2003, “Temporal Parts and Superluminal Motion”, Philosophical Papers, 32: 1-13.
Balashov, Y., 2014, “Relativistic Parts and Places: A Note on Corner Slices and Shrinking Chairs”, in Calosi, C. and Graziani, P. (eds.), Mereology and the Sciences, Springer, 35-51.
Barker, S. and P. Dowe, 2003, “Paradoxes of Multi-Location”, Analysis, 63: 106–114.
Barker, S. and P. Dowe, 2005, “Endurance is Paradoxical”, Analysis, 65: 69–74.
Barnes, E. J. and J. R. G. Williams, 2011, “A Theory of Metaphysical Indeterminacy”, Oxford Studies in Metaphysics, vol. 6.
Burke, M., 1992, “Copper statues and pieces of copper: a challenge to the standard account”, Analysis, 52: 12-17.
Burke, M., 1994, “Preserving the Principle of One Object to a Place: A Novel Account of the Relations among Objects, Sorts, Sortals and Persistence Conditions”, Philosophy and Phenomenological Research, 54: 591–624.
Calosi, C., 2015, “The Relativistic Invariance of 4D shapes”, Studies in History and Philosophy of Science 50, 1-4.
Carnap, R., 1967, The Logical Structure of the World, (trans.) George, R. A., Berkeley: University of California Press.
Carter, J., 2011, “St. Augustine on Time, Time Numbers, and Enduring Objects” Vivarium 49: 301–323.
Casati, R. and Varzi, A., 1999, Parts and Places, Cambridge, MA: MIT Press.
Casati, R. and Varzi, A., 2014, “Events”, The Stanford Encyclopedia of Philosophy (Winter 2015 Edition), Edward N. Zalta (ed.).
Chisholm, R. M., 1973, “Parts as Essential to their Wholes”, Review of Metaphysics, 26: 581–603.
Chisholm, R. M., 1975, “Mereological Essentialism: Some Further Considerations”, The Review of Metaphysics, 28 (3):477-484.
Chisholm, R. M., 1976, Person and Object, La Salle (IL): Open Court.
Van Cleve, J., 1986, “Mereological Essentialism, Mereological Conjunctivism, and Identity Through Time”, Midwest Studies in Philosophy, 11 (1): 141-156.
Costa, D., 2017, “The Transcendentist Theory of Persistence”, Journal of Philosophy, 114 (2):57-75.
Costa, D., 2017a, “The Limit Decision Problem and Four-dimensionalism”, Vivarium, 55: 199-216.
Costa, D., 2019, “Was Bonaventure a Four-dimensionalist?”, British Journal for the History of Philosophy, 28(2): 393-404.
Cresswell, M. J., 1986, “Why Objects Exist but Events Occur”, Studia Logica, 45: 371–375; reprinted in Events, pp. 449–453.
Crisp, T. M., and D. P. Smith, 2005, “’Wholly Present’ defined”, Philosophy and Phenomenological Research 71(2): 318-344.
Cross, R., 1999, “Four-dimensionalism and Identity Across Time: Henry of Ghent vs. Bonaventure”, Journal of the History of Philosophy 37: 393–414.
Davidson, M., 2014, “Special Relativity and the Intrinsicality of Shape”, Analysis, 74, 57-58.
Donnelly, M., 2016, “Three-Dimensionalism”, in Davis, M. (ed.), Oxford Handbook of Philosophy Online, Oxford University Press.
Dummett, M., 1975, “Wang’s Paradox”, Synthese, 30: 301–24.
Ehring, D., 1997, Causation and Persistence, New York: Oxford University Press.
Ehring, D., 2002, “Spatial Relations between Universals” Australasian Journal of Philosophy, 80(1): 17–23.
Fine, K., 1999, “Things and their Parts”, Midwest Studies in Philosophy 23 (1), 61-74.
Fine, K., 2006, “In Defense of Three-Dimensionalism”, The Journal of Philosophy, 103 (12): 699–714.
Gallois, A., 1998, Occasions of Identity, Oxford: Clarendon Press.
Galton, A. P., 2006, “Processes as continuants”. In J. Pustejovsky & P. Revesz (eds), 13th International Symposium on Temporal Representation and Reasoning (TIME 2006: 187). Los Alamitos, CA: IEEE Computer Society.
Galton, A. P., and Mizoguchi, R., 2009, “The water falls but the waterfall does not fall: New perspectives on objects, processes and events”, Applied Ontology, 4 (2): 71-107.
Geach, P. T., 1972, Logic Matters, Oxford: Blackwell.
Geach, P. T., 1980, Reference and Generality, Ithaca, NY: Cornell University Press.
Gibbard, A., 1975, “Contingent Identity”, Journal of Philosophical Logic, 4(2):187-221.
Gibson, I. and Pooley, O., 2006, “Relativistic Persistence”, Philosophical Perspectives, 20 (1): 157-198.
Gilmore, C., 2006, “Where in the Relativistic World Are We?”, Philosophical Perspectives, 20: 199–236.
Gilmore, C., 2007, “Time Travel, Coinciding Objects, and Persistence,” in Dean Zimmerman, ed., Oxford Studies in Metaphysics, vol. 3, New York: Oxford University Press, pp. 177–98.
Gilmore, C., 2008, “Persistence and Location in Relativistic Spacetime”, Philosophy Compass, 3.6: 1224–1254
Gilmore, C., Costa, D., and Calosi, C., 2016, “Relativity and Three Four-Dimensionalisms”, Philosophy Compass, 11 (2): 102–120.
Goodman, N., 1951, The Structure of Appearance, Cambridge (MA): Harvard University Press.
Griffin, N., 1977, Relative Identity, New York: Oxford University Press.
Hacker, P. M. S., 1982, “Events, Ontology and Grammar”, Philosophy, 57:477–486; reprinted in Events, pp. 79–88.
Haslanger, S., 1989, “Endurance and Temporary Intrinsics”, Analysis, 49: 119–25.
Haslanger, S., 1994, “Humean Supervenience and Enduring Things”, Australasian Journal of Philosophy 72, 339-59.
Haslanger, S., 2003, “Persistence Through Time”, in Loux, M. and Zimmerman, D. (eds.) The Oxford Handbook of Metaphysics, Oxford: Oxford University Press
Hawley, K., 1999, “Persistence and Non-Supervenient Relations”, Mind, 108: 53–67.
Hawley, K., 2001, How Things Persist, Oxford: Oxford University Press.
Hawthorne, J. and G. Uzquiano, 2011, “How Many Angels Can Dance on the Point of a Needle? Transcendental Theology Meets Modal Metaphysics”, Mind, 120: 53–81.
Heller, M., 1984, “Temporal Parts of Four Dimensional Objects”, Philosophical Studies, 46: 323-34.
Heller, M., 1990, The Ontology of Physical Objects, Cambridge: Cambridge University Press.
Helm, P., 1979, “John Edwards and the Doctrine of Temporal Parts” Archiv für Geschichte der Philosophie, 61: 37–51.
Hinchliff, M., 1996, “The Puzzle of Change”, Philosophical Perspectives, 10: 119-136.
Hirsch, E., 2007, “Physical-object ontology, verbal disputes, and common sense”, Philosophy and Phenomenological Research 70(1), 67-97.
Hofweber, T. and D. Velleman, 2011, “How to Endure”, Philosophical Quarterly, 61: 37–57.
Hofweber, T., & Lange, M., 2017, “Fine’s fragmentalist interpretation of special relativity” Nous, 51(4), 871–883.
Hudson, H., 2000, “Universalism, Four-Dimensionalism and Vagueness”, Philosophy and Phenomenological Research, 60: 547–60.
Hudson, H., 2005, The Metaphysics of Hyperspace, Oxford: Oxford University Press.
Kant, I., 1965, Critique of Pure Reason, orig. 1781, trans. N. Kemp Smith. New York: Macmillan Press.
Koslicki, K., 2008, The Structure of Objects, Oxford: Oxford University Press.
Leonard, M., 2018, “Enduring Through Gunk”, Erkenntnis, 83: 753-771.
Le Poidevin, R., 1991, Change, Cause and Contradiction, Basingstoke: Macmillan.
Lewis, D. K., 1986, On the Plurality of Worlds, Oxford: Blackwell.
Lewis, D. K., 1988, “Re-arrangement of Particles: Reply to Lowe”, Analysis, 48: 65–72.
Lewis, D. K., 1976, “Survival and Identity”, in Amelie Rorty (ed.), The Identities of Persons, Berkeley, CA: University of California Press, 117–40. Reprinted with significant postscripts in Lewis’s Philosophical Papers, vol. I, Oxford: Oxford University Press.
Lombard, L. B., 1999, “On the alleged incompatibility of presentism and temporal parts”, Philosophia, 27 (1-2): 253-260.
Lombard, L. B., 1986, Events: A Metaphysical Study, London: Routledge.
Lombard, L. B., 1994, “The Doctrine of Temporal Parts and the ‘No-Change’ Objection”, Philosophy and Phenomenological Research, 54.2: 365–72.
Lowe, E. J., 1987, “Lewis on Perdurance versus Endurance”, Analysis, 47: 152–4.
Lowe, E. J., 1988, “The Problems of Intrinsic Change: Rejoinder to Lewis”, Analysis, 48: 72-7.
Lowe, E. J., 1995, “Coinciding objects: in defence of the ‘standard account’”, Analysis, 55(3), 171–178.
Mackie, P., 2008, “Coincidence and Identity”, Royal Institute of Philosophy Supplement, 62: 151-176.
Markosian, N., 2004, “Simples, Stuff and Simple People”, The Monist, 87: 405-428.
McCall, S. and Lowe, E. J., 2006, “The 3D/4D controversy: a storm in a teacup”, Noûs, 40(3): 570-578.
McDaniel, K., 2003, “Against MaxCon Simples”, Australasian Journal of Philosophy, 81: 265-275.
McDaniel, K., 2007a, “Brutal Simples”, in D. Zimmerman (ed.), Oxford Studies in Metaphysics, 3: 233–265.
McDaniel, K., 2007b, “Extended Simples”, Philosophical Studies, 133: 131–141.
McTaggart, J. M. E., 1921, The Nature of Existence, I, Cambridge: Cambridge University Press.
McTaggart, J. M. E., 1927, The Nature of Existence, II, Cambridge: Cambridge University Press.
Mellor, D. H., 1981, Real Time, Cambridge: Cambridge University Press.
Mellor, D. H., 1998, Real Time II, London: Routledge.
Merricks, T., 1994, “Endurance and Indiscernibility”. Journal of Philosophy, 91: 165–84.
Merricks, T., 1995, “On the incompatibility of enduring and perduring entities”, Mind, 104 (415): 521-531.
Merricks, T., 1999, “Persistence, Parts, and Presentism”, Noûs 33, 421-438.
Miller, K., 2004, “Enduring Special Relativity”, Southern Journal of Philosophy 42, 349-70.
Miller, K., 2005, “The Metaphysical Equivalence of Three and Four Dimensionalism”, Erkenntnis 62 (1), 91-117.
Noonan, H. and Curtis, B., 2018, “Identity”, The Stanford Encyclopedia of Philosophy (Summer 2018 Edition), Edward N. Zalta (ed.).
Noonan, H., 1999, “Identity, Constitution and Microphysical Supervenience”, Proceedings of the Aristotelian Society, 99: 273-288.
Oderberg, D., 1993, The Metaphysics of Identity over Time. London/New York: Macmillan/St Martin’s Press.
Oderberg, D., 2004, “Temporal Parts and the Possibility of Change”, Philosophy and Phenomenological Research, 69.3: 686–703.
Parsons, J., 2000, “Must a Four-Dimensionalist Believe in Temporal Parts?” The Monist, 83(3): 399–418.
Parsons, J., 2007, “Theories of Location”, in D. Zimmerman (ed.), Oxford Studies in Metaphysics, pp. 201-32.
Pashby, T., 2013, “Do Quantum Objects Have Temporal Parts?”, Philosophy of Science 80(5), 1137-47.
Pashby, T., 2016, “How Do Things Persist? Location in Physics and the Metaphysics of Persistence”, Dialectica 70(3), 269-309.
Quine, W. V. O., 1953, “Identity, Ostension and Hypostasis”, in his From a Logical Point of View, Cambridge, MA: Harvard University Press, 65–79.
Quine, W. V. O., 1960, Word and Object, Cambridge, Mass.: MIT Press.
Quine, W. V. O., 1970, Philosophy of Logic, Englewood Cliffs (NJ): Prentice-Hall.
Quine, W. V. O., 1981, Theories and Things, Cambridge, MA: Harvard University Press.
Rea, M. (ed.), 1997, Material Constitution, Lanham, MD: Rowman &amp; Littlefield.
Rea, M., 1995, “The Problem of Material Constitution”, Philosophical Review, 104: 525–52.
Rea, M., 1998, “Temporal Parts Unmotivated”, Philosophical Review, 107: 225–60.
Rosen, G. and Dorr, C., 2002, “Composition as Fiction”, in Gale, R., (ed.), The Blackwell Guide to Metaphysics, Oxford: Blackwell, pp. 151-174.
Russell, B., 1914, Our Knowledge of the External World, London: Allen &amp; Unwin Ltd.
Russell, B., 1923, “Vagueness”, Australasian Journal of Philosophy and Psychology, 1: 84–92.
Russell, B., 1927, The Analysis of Matter, New York: Harcourt, Brace & Company.
Sattig, T., 2006, The Language and Reality of Time, Oxford: Oxford University Press.
Saucedo, R., 2011, “Parthood and Location”, in K. Bennett and D. Zimmerman (eds.), Oxford Studies in Metaphysics, 6: 223–284.
Schaffer, J., 2009, “On What Grounds What”, in Chalmers, D., D. Manley, and R. Wasserman (eds.), Metametaphysics, pp. 347–383.
Sedley, D., 1982, “The Stoic Criterion of Identity”, Phronesis, 27 (3): 255-275.
Shoemaker, S., 1988, “On What There Are”, Philosophical Topics, 26: 201-23.
Sider, T., 1996, ‘All the World’s a Stage’, Australasian Journal of Philosophy, 74: 433–53.
Sider, T., 2001, Four-Dimensionalism, Oxford: Oxford University Press.
Sider, T., 2007, “Parthood”, The Philosophical Review, 116: 51–91
Sider, T., 2013, “Against Parthood”, in Bennett, K. and Zimmerman, D. W. (eds.), Oxford Studies in Metaphysics, vol. 8, Oxford: Oxford University Press, pp. 237-93.
Simons, P., 1987, Parts: A Study in Ontology, Oxford: Clarendon Press.
Simons, P., 2000a, “How to Exist at a Time When You Have No Temporal Parts,” The Monist, 83 (3): 419–36.
Simons, P., 2000b, “Continuants and Occurrents”, Proceedings of the Aristotelian Society, Supplementary Volume 74: 59–75.
Simons, P., 2004, “Extended Simples: A Third Way Between Atoms and Gunk”, The Monist, 87: 371-84.
Smart, J. J. C., 1963, Philosophy and Scientific Realism, London: Routledge & Kegan Paul.
Smart, J. J. C., 1972, “Space-Time and Individuals”, in Richard Rudner and Israel Scheffler, eds., Logic and Art: Essays in Honor of Nelson Goodman, New York: Macmillan Publishing Company, pp. 3–20.
Steen, I. K., 2010, “Three-Dimensionalist Semantic Solution to the Problem of Vagueness”, Philosophical Studies, 150 (1): 79-96.
Zimmerman, D., 1996, “Could Extended Objects Be Made Out of Simple Parts? An Argument for ‘Atomless Gunk’”, Philosophy and Phenomenological Research, 56 (1): 1–29.
Zimmerman, D. (ed.), 2006, Oxford Studies in Metaphysics, Volume 2, New York: Oxford University Press.
Author Information
Damiano Costa
Email: damiano.costa@usi.ch
University of Italian Switzerland (Università della Svizzera italiana, University of Lugano)
Switzerland
Associationism in the Philosophy of Mind
Association dominated theorizing about the mind in the English-speaking world from the early eighteenth century through the mid-twentieth and remained an important concept into the twenty-first. This endurance across centuries and intellectual traditions means that it has manifested in many different ways in different views of mind. The basic idea, though, has been constant: Some psychological states come together more easily than others, and one factor in explaining this connection is prior pairing.
Authors sometimes trace the idea back to Aristotle’s brief discussion of memory and recollection. Association got its name—“the association of ideas”—in 1700, in John Locke’s Essay Concerning Human Understanding. British empiricists following Locke picked up the concept and built it into a general explanation of thought. In the resulting associationist tradition, association was a relation between imagistic “ideas” in the trains of conscious thought. The rise of behaviorism in the early twentieth century brought with it a reformulation of the concept. Behaviorists treated association as a link between physical stimuli and motor responses, omitting any intervening “mentalistic” processes, but they still gave association just as central a role as the empiricist associationists had. In later twentieth-century and early twenty-first-century work, association is variously treated as a relation between functionally defined representational mental states such as concepts, between “subrepresentational” states (in connectionism), or between activities in parts of the brain such as neurons, neural circuits, or brain regions. As a relation between representational states, association is viewed as one process among many in the mind; as a relation between subrepresentational or neural activities, however, it is again often viewed as a general explanation of thought.
Given this variety of theoretical contexts, associationism is better viewed as an orientation or research program than as a theory or collection of related theories. Nonetheless, there are several shared themes. First, there is a shared interest in sequences of psychological states. Second, though the laws of association vary considerably, association by contiguity has been a constant. The idea of association by contiguity is that each pairing of psychological states strengthens the association between them, increasing the ease with which the second state follows the first. In its simplest form, this can be thought of as akin to a footpath: Each use beats and strengthens the path. Third, this carries with it a more general emphasis on learning and a tendency to posit minimal innate cognitive structure.
The term “association” can refer to the sequences of thoughts themselves, to some underlying connection or disposition to sequence, or to the principle or learning process by which these connections are formed. This article uses the term to refer to underlying connections unless otherwise specified, as this is the most common use and the one that unites the others.
This article traces these themes as they developed over the years by presenting the views of central historical figures in each era, focusing specifically on their conception of the associative relation and how it operates in the mind.
Associationism as a general philosophy of mind arguably reached its pinnacle in the work of the British Empiricists. These authors were explicit in their view of association as the key explanatory principle of the mind. Associationism also had a massive impact across the intellectual landscape of Britain in this era, influencing, for instance, ethics (through Reverend John Gay, Hume, and John Stuart Mill), literature, and poetry (see Richardson 2001).
Association in this tradition was called upon to solve two problems. The first was to explain the sequence of states in the conscious mind. The thought here is that there are some reliable patterns to the sequences which must be explained. These were explained by the “laws of association.” The basic procedure was, first, to identify sequences or patterns in sequence. Hobbes’s discussion of “mental discourse” demonstrates this interest, inspiring later associationist theories of mind and providing a famous example:
For in a discourse of our present civil war, what could seem more impertinent than to ask (as one did) what was the value of a Roman penny? Yet the coherence to me was manifest enough. For the thought of the war introduced the thought of the delivering up the king to his enemies; the thought of that brought in the thought of the delivering up of Christ; and that again the thought of the 30 pence which was the price of treason; and thence easily followed that malicious question; and all this in a moment of time, for thought is quick. (Leviathan, chapter 3)
Once the sequences have been identified, the next step is to classify them by the relations between their elements. For example, two ideas may be related by having been frequently paired, or may be similar in some way. This section and the next use “suggestion” to refer to particular instances of sequence and “association” to refer to the underlying disposition. Secondly, some authors took the same relation to explain the generation of “complex” ideas out of “simple” ideas, the latter often viewed as a kind of psychological atom. The empiricist project requires explaining how all knowledge could be generated from experience. This was perhaps the most common way of doing so, though it was not universal.
Associationist authors then show how associations of the various sorts that they posit can or cannot explain various phenomena. For example, belief may be treated as simply a strong association. Abilities like memory, imagination, or even sometimes reason can be treated as simply different kinds of associative sequence. As empiricists, most eschew innate knowledge and tend to limit innate mental structure relative to competing traditions, though the claim that the mind is truly a blank slate would be an oversimplification. Their opponents in the Scottish school, for example, treat each of these abilities as manifesting distinct, innate faculties, and posit innate beliefs in the form of “principles of common sense.”
a. John Locke (1632-1704)
John Locke laid the groundwork for empiricist associationism and coined the term “association of ideas” in a chapter he added to the fourth edition of his Essay Concerning Human Understanding (1700). He sets up the Cartesian notion of innate ideas as a primary opponent and asserts that experience can be the only source of ideas, through two “fountains” (book 2, chapter 1): “sensation,” or experience of the outside world, and “reflection,” or experience of the internal operations of our mind. He distinguishes between “simple” ideas, such as the idea of a particular color, or of solidity, and “complex” ideas, such as the idea of beauty or of an army. Simple ideas are “received” in experience through sensation or reflection. Complex ideas, on the other hand, are created in the mind by combining two or more simple ideas into a compound.
In his chapter on association of ideas (book 2, chapter 33), Locke emphasizes the ways that different ideas come together. As he puts it:
Some of our ideas have a natural correspondence and connexion one with another: it is the office and excellency of our reason to trace these . . . Besides this, there is another connexion of ideas wholly owing to chance or custom. Ideas that in themselves are not all of kin, come to be so united in some men’s minds, that . . . the one no sooner at any time comes into the understanding, but its associate appears with it.
His discussion in this chapter focuses on the connections based on chance or custom and describes them as the root of madness. Associations as described here are formed by prior pairings and strengthened passively as habitual actions or lines of thought are repeated.
Thus, despite the significance of his work in setting the stage for later associationists, Locke does not treat association as explaining the mind in general. He treats it as a failure to reason properly, and his interest in it is not only explanatory but normative. For these reasons, some have questioned whether one ought to treat Locke as an associationist, on the thinking that associationists viewed association as the central explanatory posit in the mind (for example, see Tabb 2019). Where one lands on this question seems to depend on the use of the term. After all, Locke’s description of the formation of complex ideas by combining simple ideas was counted as a kind of association by many later associationists. The key, for Locke, is that association is a passive process, while the mind is more active in other processes. The passive nature of association will return as a criticism of associationism; see also Hoeldtke (1967) for a discussion of the history of this line of thought in British psychiatry.
b. David Hume (1711-1776)
David Hume presented arguably the first attempt to understand thought generally in associative terms. He first lays out these views in A Treatise of Human Nature (1739) and then reiterates them in An Enquiry Concerning Human Understanding (1748). According to Hume, the trains of thought are made up of ideas, which are basically images in the mind. Simple ideas, such as a specific color, taste, or smell, are copies of sensory impressions. Thoughts in general are built from these simple ideas by association.
He begins his discussion of association in the Enquiry:
It is evident that there is a principle of connexion between the different thoughts or ideas of the mind, and that, in their appearance to the memory or imagination, they introduce each other with a certain degree of method and regularity. (Enquiry, section III)
His use of the term is not limited to irrationality and madness, as Locke’s was, but it is applied to the trains of thought generally. He questions what relations might explain the observed regularities and claims that there are three: resemblance, contiguity in time or place, and cause or effect. He mentions contrast or contrariety as another candidate in a footnote (footnote 4, section III), but rejects it, arguing it is a mixture of causation and resemblance. Association also explains the combination of simple ideas into complex ideas.
Hume’s inclusion of “cause or effect” as one of the primary categories of association might be thought incongruous with his general view on causality. While the best understanding of association by cause or effect has been controversial, Hume treats it as an independent principle of association, and it can be understood as such, rather than, for example, as just a strong association by contiguity. He argues that we gain the impression of a causal power by coming to expect, in the imagination, the effect upon the presentation of the cause. As a general matter, he suggests that we cannot feel the relations between sequential ideas, but we can uncover them with imagination and reasoning, though these relations may be different from the factors responsible for association.
Just how generally Hume applied his conception of association may also be subject to interpretation. On the one hand, his discussions of induction, probability, and miracles in the Enquiries suggest that he views association, or habit, as the sole basis of our reasoning about the world and, as such, a normatively adequate means for doing so. On the other hand, he arguably posits several other principles of mind throughout his work. For example, he often treats the imagination as a separate capacity, and he discusses several moral sentiments that would seem to require separate principles. He also expresses uncertainty about the completeness of his list of laws of association. Moreover, he characteristically avoids claims about the ultimate foundation of human nature. In the Treatise, he says: “as to its [association’s] causes, they are mostly unknown, and must be resolv’d into original qualities of human nature which I pretend not to explain” (pg. 13). It may be that, despite its centrality in his philosophy, Hume did not view association as a bedrock principle or cause of thought, though that view later became common, due in large part to the work of David Hartley.
c. David Hartley (1705-1757)
Hartley’s Observations on Man (1749) was published just after Hume’s Enquiry, though he claimed to have been thinking about the power of association for about 18 years. Hartley’s discussion of association is more focused and sustained than Hume’s because of his explicitly programmatic goals. Following Newton’s axiomatization of physics, Hartley sought to axiomatize psychology on the twin principles of association and vibration. Vibrations, in Hartley’s system, are the physiological counterpart of associations. As association carries the mind from idea to idea, vibrations in the nerves carry sensations to the brain and through it. He references physical vibrations as causing mental associations (pg. 6), but then expresses dissatisfaction with this framing and uncertainty on the exact association-vibration relation (pp. 33-34).
The idea is that external stimuli act on nerves, inciting infinitesimally small vibrations in invisible particles of the nerve. These vibrations travel up the nerves, and upon reaching the brain, cause our experience of sensations. If a particular frequency or pattern of vibration is repeated, the brain gains the ability to incite new vibrations like them. This is, effectively, storing a copy of the idea for later thought. These ‘ideas of sensation’ are the elements from which all others are built. Ideas become associated when they are presented at the same time or in immediate succession, meaning that the first idea will bring the second to mind, and, correspondingly, their vibrations in the brain will follow in sequence.
Hartley, like Hume, viewed association as both the principle by which ideas came to follow one another and by which simple ideas were combined into complex ideas: A complex idea is the end point of the process of strengthening associations between simple ideas. Unlike Hume, though, Hartley only posited association by contiguity and did not allow for any other laws of association.
He was also, as noted, explicit in his goal of capturing psychology with the principle. He argues that supposed faculties like memory, imagination, and dreaming, as well as emotional capacities like sympathy, are merely applications of the associative principle. He also emphasized associations between sensations, ideas, and motor responses. For instance, the tuning of motor responses by association explains how we get better at skilled activities with practice. He recognizes that the resulting picture is a mechanical picture, but he does not see this as incompatible with free will, appropriately conceived.
Hartley’s most important contribution is the very project of describing an entire psychology in associative terms. This animated the associationist tradition for the next hundred years or so. In setting up his picture, he was also the first to connect association to physiological mechanisms. This became important in the work of the later empiricist associationists, and in reformulations of associative views after the cognitive revolution discussed in section 4.
d. The Scottish School: Reid, Stewart, and Hamilton
The Scottish Common Sense School, led by Thomas Reid (1710-1796) and subsequently Dugald Stewart (1753-1828) and William Hamilton (1788-1856), was the main competition to associationism in Britain. Their views are instructive in articulating the role and limits of the concept, as well as in setting up Brown’s associationism, discussed below. The Scottish School differed from the associationists in two main ways. Firstly, they took humans to be born with innate knowledge, which Reid called “principles of common sense.” Secondly, they argued for a faculty psychology: They took the mind to be endowed with a collection of distinct “powers” or capacities such as memory, imagination, conception, and judgment. The associationists, in contrast, usually treated these as different manifestations of the single principle of association. Nevertheless, the Scottish School did provide a role for associations.
Reid takes the train of conscious thoughts to be an aggregate effect of the perhaps numerous faculties active at any given time (Essays on the Intellectual Powers of Man, Essay IV, chapter IV). He does allow that frequently repeated trains might become habitual. He treats habit, then, as another faculty that makes these sequences easier to repeat. Associations, or dispositions for certain trains to repeat, are an effect of the causally prior faculty of habit.
Stewart reverses the causal order between association and habit (see Mortera 2005). For Stewart, association is a distinct operation of the mind, which produces mental habits. Association plays a more important role in his system than in Reid’s. He does retain other mental faculties, though, which are responsible for at least the first appearance of any particular sequence in thought. The mistake the associationists make, on his view, is in thinking that they have traced all mental phenomena to a single principle (1855, pp. 11-12). He admits it is possible that philosophers may someday discover the ultimate principle of psychology but doubts that the associationists have done so. Stewart is responding specifically to Joseph Priestley, who edited a famous abridged edition of Hartley’s work.
William Hamilton’s contributions to the concept of association are less direct. He provides the first history of the concept of association of ideas in his notes on The Works of Thomas Reid (1872, Supplemental Dissertation D). Hamilton’s own views also inspired later work by John Stuart Mill in his Examination of Sir William Hamilton’s Philosophy (1878).
e. Thomas Brown (1778-1820)
Thomas Brown occupies a unique position in the history of associationism. His main work, Lectures on the Philosophy of the Human Mind (1820), was published after his death at the age of 43. On the one hand, he is a student of the Scottish School, having studied under Dugald Stewart. On the other hand, he was an ardent associationist, reducing all of the supposed faculties to association. Brown explicitly casts his project as one of identifying and classifying the sequences of “feelings” in the mind, which was his general term for mental states, including ideas, emotions, and sensations.
Arguably, his philosophy of mind is more Humean than Hume’s, in that he extends Hume’s arguments against necessary connections between cause and effect in the world to the mind. He argues that an association is not a “link” between ideas that explains their sequence; it is the sequence itself. The idea of an associative link is vacuous and explanatorily circular. Brown actually argues for the term “suggestion” over “association,” though he uses the terms interchangeably when he fears no misinterpretation (Lecture 40). He differentiates two kinds of suggestion: simple suggestion, in which feelings simply follow in sequence, and relative suggestion, in which the relationship between sequential ideas is felt as well. Simple suggestion is responsible for capacities like memory and imagination, while relative suggestion allows capacities like reason and judgment.
Brown also differs from the standard associationist picture in that he, like Reid, embraces innate knowledge, which he calls “intuitive beliefs.” His prime example is belief in personal identity over time. Another is that “like follows like,” which can serve as the basis for the associating principle. He expresses an expectation that all associations will eventually be shown to be instances of association by contiguity, but does not think this has been shown yet. He thus finds it best to “avail ourselves of the most obvious categories” of contiguity, similarity, and contrast (Lecture 35).
Brown introduces several “secondary” laws of association, which can help predict which of any particular associations are likely to be followed in any given case (Lecture 37). He lists nine, including liveliness of feelings associated, frequency with which they had paired, recency, and differences arising from emotional context. While members of subsequent lists changed, the introduction of secondary laws of association may have been Brown’s most enduring legacy.
In common with those associationists above, Brown emphasizes a role for association in the formation of complex ideas out of simple ideas. However, he views ideas as states of the mind itself, not objects in the mind—a mistake he attributes primarily to Locke. As a result, he argues that it is metaphysically impossible that complex ideas are literally built of simple ideas, since the mind can only occupy one state at a time. He does argue that it is useful to think of simple ideas as continuing in a “virtual coexistence” in complex ideas, but the focus here is an historical/etiological story of how complex ideas came to be, rather than a literal decomposition.
Despite his idiosyncratic views, Brown identified his position as associationist, and it was accepted as such by the tradition. Though his work has been largely forgotten, it was very influential in the United Kingdom and United States in the years following its publication. Brown’s place in the associationist tradition strains standard interpretations of the tradition and what, if anything, unites it. After all, he denies the central associationist posit, the associative link, and allows innate knowledge.
f. James Mill (1773-1836) and John Stuart Mill (1806-1873)
James Mill’s view rivals Hartley’s as a candidate prototypical associationist picture of mind. Mill presents his views in his Analysis of the Phenomena of the Human Mind (originally published 1829, cited here from 1869; this edition includes comments from John Stuart Mill and Alexander Bain).
Like Hartley, James Mill argues that contiguity is the only law of association. Specifically, James Mill argues that similarity is just a kind of contiguity. The claim is that we are used to seeing similar objects together, as sheep tend to be found in a flock, and trees in a forest. In his editorial comments in the 1869 edition, John Stuart Mill calls this “perhaps the least successful attempt at a generalisation and simplification of the laws of mental phenomena, to be found in the work” (pg. 111). For his part, James Mill does not attribute much significance to the question, saying: “Whether the reader supposes that resemblance is, or is not, an original principle of association, will not affect our future investigations” (pg. 114).
In discussing the associative relation itself, James Mill distinguishes synchronous and successive association. Some stimuli are experienced simultaneously, as in those emanating from a single object, and others successively, as in a sequence of events. The resulting ideas are associated correspondingly. Synchronous ideas arise together and themselves constitute complex ideas. Thus, a complex idea, in James Mill’s system, is a literal composite of simpler ideas. Successively associated ideas will arise successively. Of successive association, James Mill remarks that it is not a causal relation, though he does not elaborate on what he means by this (pg. 81). He describes three different ways that the strength of an association can manifest: “First, when it is more permanent than another: Secondly, when it is performed with more certainty: Thirdly when it is performed with more facility” (pg. 82). Adapting some of Brown’s secondary laws, he argues that strength is caused by the vividness of the associated feelings and frequency of the association.
James Mill reduces the various “active” and “intellectual” powers of the mind to association. He limits his discussion of association to mental phenomena, though he recognizes the significance of physiology for motor movements and reflexes. For instance, conception, consciousness, and reflection simply refer to the train of conscious ideas itself. Memory and imagination are particular segments of the trains. Motives are associations between actions and the positive or negative sensations which they produce. The will is also reduced to an association between various ideas and muscular movements. Thus, even the active powers are mechanistic. Belief is just a strong association. Ratiocination, as in syllogistic reasoning, simply chains associations. Consider the syllogism: “All men are animals: kings are men : therefore kings are animals” (pg. 424). This produces the compound association “kings – men – animals”. For James Mill, this compound association includes an intermediate that remains in place but is simply passed over so quickly that it can be imperceptible, making the sequence appear to be “kings – animals”, much in the same way that complex ideas still include all of the simpler ideas. This sets up a noteworthy disagreement between James and his son, John Stuart Mill.
John Stuart Mill argues, against his father, that complex ideas are new entities, not mere aggregates of simple ideas, and that intermediate ideas can drop out of sequences like that above. In general, John Stuart Mill analogizes the association of ideas to a kind of chemistry, where a new compound has new properties separate from its constituent elements (A System of Logic, chapter IV). In James Mill’s view of association, ideas retain their identity in combination, like bricks in a wall.
John Stuart Mill’s views on association are spread through several texts (see Warren 1928, pp. 95-103, for a summary of his views), and his psychological aspirations are not as imperial or systematic as his father’s. This is evident partly in his lack of a sustained treatment, but also in the phenomena he does not attribute to association. For instance, he does not treat induction as an associative phenomenon, breaking with Hume (see A System of Logic). Similarly, breaking with his father, he does not view belief as simply a strong association, arguing that it must include some other irreducible element (notes in James Mill’s Analysis, pg. 404). When John Stuart Mill does allude to a systematic development of association, he usually defers to our next subject, Alexander Bain.
g. Alexander Bain (1818-1903)
Alexander Bain presents a sophisticated version of empiricist associationism. His main work on the topic comes in The Senses and the Intellect (originally published 1855, cited here from the 3rd ed., 1868). Bain’s early work was developed and published with significant help from his close friend and mentor, J. S. Mill, but it became a standard in its own right.
Bain differs most from previous associationists in the role he grants to instincts. By “instincts,” he means reflex actions, basic coordinated movement patterns such as walking and simple vocalization, and the seeds of volition (the potential for spontaneous action). This discussion is unique, first, in that he separates these out from the domain explained by association. He takes instincts to be “primordial,” inborn, and unlearned. Second, he opens his text with a discussion of basic neuroanatomy and function and explains instincts largely by appeal to the structure of the nervous system and the flow of nervous energy. This discussion was aided in part by recent progress in physiology, but also by an avowed interest in bringing physiology and psychology in contact.
Bain, nonetheless, takes association to be the central explanatory principle for phenomena belonging to the intellect. By “intellect,” he has in mind phenomena one might call thought, such as learning, memory, reasoning, judgment, and imagination. When he switches to his discussion of the intellect, his physiological discussions drop out, and his method is entirely introspective. As Robert Young notes: “his work points two ways: forward to an experimental psychophysiology, and backward to the method of introspection” (1970, pg. 133).
Bain never makes any distinction between simple and complex ideas, and he discusses association in successive terms. He also does not restrict association to ideas and argues that the same principles can combine, sequence, and modify patterns of movement, emotions, sensations, and the instincts generally.
He admits three fundamental principles of association: similarity, contiguity, and contrast. Contiguity is the basic principle of memory and learning, while similarity is the basic principle of reasoning, judgment, and imagination. Nonetheless, the three are interdependent in complex ways. For instance, similarity is required for contiguity to be possible: Similarity is required for us to recognize that this sequence is similar enough to a former sequence for them to both strengthen the same association by contiguity. The principle of contrast has a more complex role. On the one hand, it is fundamental to the stream of consciousness in the first place. We would not recognize changes in consciousness as changes without this principle. As such, we cannot be conscious of anything as something without recognizing that there is something else it is not: If red were the only color, we would simply not be conscious of color. The other principles would be impossible. Nonetheless, it can also drive sequences, but only when properly scaffolded by similarity or contiguity. Similarity is necessary for association by contrast because contrast is always within a kind, and similarity is necessary for recognition of that original kind; he notes, “we oppose a long road to a short road, we do not oppose a long road to a loud sound” (1868, pg. 567). In many particular cases, contrast can be driven by contiguity, as contrasting concepts are frequently paired: up and down, pain and pleasure, true and false, and so on. Experiences of contrast themselves, he notes, often arouse emotional responses, as in works of poetry and literature. In other work, however, Bain does not seem to find the question of whether contrast is a separate principle of association to be all that interesting, since transitions based on contrast are very rare, and many instances of contrast-based associations are in fact based in contiguity (1887).
He discusses two other kinds of association: compound association and constructive association (in his first edition, he lists these as additional principles of association, but drops that categorization by the third). Compound association includes the ways associations can interact. For instance, if several features present all remind us of a friend, those associative strengths can combine to make it more likely that we think of the friend. Under “constructive association,” he groups active processes of combining ideas, as in imagination, creativity, and the formation of novel sentences.
h. Themes and Lessons
Surveying these views uncovers significant diversity, even among the “pure” associationists found in the empiricist tradition. Most abstractly, the authors differed in their metaphysics. Brown was an avowed dualist. Hartley expressed uncertainty about the mind/brain relation but posited a physiological counterpart to association. Hume and Reid refused to speculate on metaphysics. Precursors include George Berkeley, an idealist, and Thomas Hobbes, a materialist.
The topics of debate within associationism itself included, first, the proposed list of laws of association. While all of the authors mentioned took association by contiguity to be among them, Hume included resemblance and cause or effect, Brown and Bain included similarity and contrast, and Hartley and James Mill included no others. It is common to view associationism as defined by its reliance on association by contiguity, and contiguity was indeed generally posited, but this is an oversimplification. It misses not only the diversity in the laws posited, but also the attitudes authors took toward those laws. Many central associationists, including Hume, Brown, James Mill, and Bain, either describe their classifications as provisional or express some willingness to defer on the final list. Overall, Stewart’s discussion of how far one traces the causal/explanatory thread captures the general situation: The starting point is observed sequences of conscious thought, and the question is how far one can go in finding the principles that explain those sequences.
Authors also disagreed on whether the process, force, or principle combining simple ideas into complex ideas (simultaneous association) was the same as that producing the sequences of ideas through the mind (successive association). All of the theorists discussed here accept successive association, while simultaneous association is more controversial. Brown disavows simultaneous association, while Bain simply ignores it. Even proponents of simultaneous association disagree on how it operates, as evidenced in John Stuart Mill’s disagreement with his father on “mental chemistry.” Questions like this, about how more complex ideas are formed, remain at issue (for example, see Fodor and Pylyshyn 1988 and Fodor 1998). The formation of abstract ideas was a particularly difficult version of this problem through much of the tradition; it is much easier to see how ideas formed through sensory impressions can refer to concrete objects. Simultaneous association could provide an answer on which abstract ideas include all of the particulars, while others take abstract ideas to include only a particular feature, or simply a name for a feature, identified, for instance, through a felt similarity between two ideas of particulars.
Finally, there is disagreement in what psychological elements associations are supposed to hold between. Discussion of association often latches onto Locke’s term “association of ideas,” ignoring views that take stimuli and motor movements (most of the authors above, including arguably Locke himself as he describes a visual context improving a dance; Essay, book 2, chapter 33, section 16), reflexes, and instincts (Bain) to be associable in just the same way. Even when discussing association as a relation between ideas, there is disagreement on the nature of ideas and their relationship to mind. For instance, Brown criticizes Locke for treating ideas as independent objects in the mind, rather than states of the mental substance.
The diversity in associationist views suggests that associationism is better viewed as a research program with shared questions and methods, rather than a shared theory or set of theories (Dacey 2015). Such an approach makes better sense of similarities and differences in the views. Hume, Hartley, and James Mill make good prototypes for associationism, but one misses much if one takes any particular author to speak for the tradition as a whole.
2. Fractures in Associationism (1870s-1910s)
In the late nineteenth and early twentieth centuries, the associationist tradition began to fracture. Several factors combined to shape this overall trend. Important changes in the intellectual landscape included the arrival of evolutionary theory, the rise of experimental psychology—bringing with it psychology’s separation from philosophy as a field—and increasing understanding of neurophysiology. At the same time, several criticisms of the pure associationist philosophies became salient. Through this era, the basic conception of association was still largely preserved from the previous one: It is a relation between internal elements of consciousness. By this time, materialism had largely taken over, and most authors here view association as having some neural basis, even if association itself is a psychological relation.
Associationism fractured in this era because the trend was to disavow the general, purely associationist program described in the last section, even if authors still saw association as a central concept. Thus, while associationism lost a shared outlook and purpose, there was still much progress made in testing the possibilities and limits of the concept of association.
a. Herbert Spencer (1820-1903)
Herbert Spencer’s philosophy was framed by a systematic worldview that placed evolutionary progress, as he conceived it, at its core. His psychology was no different. His Principles of Psychology was first published in 1855, four years before On the Origin of Species, but was substantially rewritten by the third edition, which is the focus here (1880, cited here from 1899). By this point, the work had been folded into his System of Synthetic Philosophy, a ten-volume set treating everything from physics to psychology to social policy (Principles of Psychology became volumes 4 and 5). Spencer’s conception of evolution was quite different from later views. Firstly, Spencer believed in the inheritance of acquired traits. Secondly, and partly as a result of this, Spencer viewed evolution as a universal force for progress; species literally get better as they evolve.
The basic units of consciousness for Spencer are nervous shocks, or individual bursts of nervous activity. Thus, the atoms in his picture are much smaller than what we might usually call thoughts or ideas, and all of the psychological activities he describes are assumed to be localizable within the nervous system. Spencer distinguishes between “feelings” proper and relations between feelings. Feelings include what would previously have been called sensations and ideas, as well as emotions. They can exist in the mind independently. Relations are felt, in that they are present in consciousness, but they can only exist between two feelings. For instance, we might feel a relation of relative spatial position between objects in a room as we scan or imagine the scene. Both feelings and relations are associable.
The primary kind of association is that between a particular feeling and members of its same kind. Thus, similarity is the fundamental law of association, both with feelings and relations. A particular experience of red will revive a feeling corresponding to other red feelings. Spencer seems to think that the resulting “assemblages” do not constitute new feelings, effectively siding with James Mill over John Stuart. “Revivability” varies with the vividness of the reviving feeling, the frequency with which feelings have occurred together, and with the general “vigor” of the nervous tissues. This last variable includes the particular claim that a long time spent contemplating one subject will deplete resources in the corresponding bits of brain tissue, making related ideas temporarily less revivable. Relations are generally more revivable, and so more associable, than feelings. Relations can, themselves, aggregate into classes, and revive members of the class. As a result, many relations may arise in mind between two feelings, though some, perhaps most, of these will pass too quickly to be noticed.
Spencer takes the laws of association to simply be manifestations of certain relations between feelings, which are actually associated based on similarity. For instance, he takes association by contiguity to be a relation of “likeness of relation in Time or in Space or in both” (pg. 267), which is just a kind of similarity. He does not seem to see any problem in making this claim, while still asserting frequency of co-occurrence as an independent law of revivability above. Moreover, when two feelings arrive in sequence in the mind, they are always mediated by at least two relations: one of difference, as the feelings must not be identical, and one of coexistence or sequence.
Spencer claims to have squared empiricist and rationalist views of mind using evolution (pg. 465). He combines the law of frequency with his view on the heritability of acquired traits to argue that associations learned by members of one generation can be passed on to the next. The empiricists are right that knowledge comes from learning, but the rationalists are right that we are individually born with certain frameworks of understanding the world. In early animals, simple reflexes were combined to create more flexible instincts. Some relations in the world, like those of space and time, are so reliably encountered that their inner mental correspondents are fixed through evolutionary history. Thus, human beings are born with certain basic ideas, like those of space and time. The resulting view is one in which thought is structured by association, but associations are accrued across generations (see Warren 1928, pg. 132).
b. Early Experimentalists: Galton, Ebbinghaus, and Wundt
Francis Galton (1822-1911), Darwin’s polymath cousin, published the first experiments on association under the title Psychometric Experiments in 1879. He ran his experiments on himself; the method was to work through a list of 75 words, one by one, and record the thoughts each suggested and the time it took to form each associated thought clearly. He did so four different times, in different contexts at least a month apart. He reports 505 associations over 600 seconds total, for a rate of about 50 associations per minute. Of the 505 ideas formed, 289 were unique, with the rest repetitions. He emphasizes that this demonstrates how habitual associations are. He notes that ideas connected to memories from early in his life were more likely to be repeated across the four presentations of the relevant word. This he takes to show that older associations have achieved greater fixity.
Among his pioneering studies on memory, Hermann Ebbinghaus (1850-1909) tested capacity for learning sequences of nonsense syllables, arguably the first test of the learning of associations (1885). He found, using himself as his subject, that the number of repetitions required to learn a sequence increased with the length of the sequence. He also found that rehearsing a sequence 24 hours before the learning session reduced the number of repetitions needed to learn it, a benefit he measured as “savings”; the savings increased with the number of prior rehearsal repetitions.
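Ebbinghaus’s savings measure is simple to state: the proportional reduction in the repetitions needed to learn a sequence, given prior rehearsal. The following minimal sketch uses hypothetical trial counts, not Ebbinghaus’s reported data.

```python
# A minimal sketch of the "savings" measure: the proportion of learning
# effort saved by prior rehearsal. Trial counts are hypothetical.

def savings(reps_without_rehearsal, reps_with_rehearsal):
    return (reps_without_rehearsal - reps_with_rehearsal) / reps_without_rehearsal

print(savings(30, 18))  # 0.4: forty percent of the repetitions were saved
```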
Though the first experimental psychology labs were established in Germany, where the concept of association never reached the significance it had in Britain, association remained a target of early experiments, directly or indirectly (see Warren 1928, chapter 8 for a fuller survey; see also sections on Calkins and Thorndike below). These studies established association as a controllable, measurable target for experiment, even among those who did not subscribe to associationism as a general view. This role arguably sustained association as a central concept of psychology into the twenty-first century.
Wilhelm Wundt (1832-1920) provides perhaps the most complete theoretical treatment of association among the early experimentalists (1896, section 16; 1911, chapter 3). While association plays an important role in his system, he objects that associationists leave no place for the will among the passive processes of association. Thus, he distinguishes the passive process of combination he calls association from an active process of combination he calls “apperception.” These ideas were developed into structuralism in America by Wundt’s student E. B. Titchener.
c. William James (1842-1910)
William James is not generally considered an associationist, and he attacks the associationists at several points in his Principles of Psychology (originally published 1890, cited here from 1950). However, at the close of his chapter on association (chapter XIV), he professes to have maintained the body of the associationist psychology under a different framing. His framing is captured as follows:
Association, so far as the word stands for an effect, is between THINGS THOUGHT OF—it is THINGS, not ideas, which are associated in the mind. We ought to talk of the association of objects, not of the association of ideas. And so far as association stands for a cause, it is between processes in the brain—it is these which, by being associated in certain ways, determine what successive objects shall be thought. (pg. 554)
James notes here an ambiguity in the term “association”; that between association as an observed sequence of states in the conscious mind (an effect) and association as the causal process driving those sequences. His handling of each side of the ambiguity highlights, in turn, his major criticisms of associationist psychologies before him.
His claim that we ought to talk of association of objects rather than association of ideas stems from his criticism of the associationist belief that the stream of consciousness is made up of discrete “ideas.” James shares with the associationists an emphasis on the stream of consciousness: He takes it to be the first phenomenon that psychology’s introspective analysis confronts (chapter IX). However, his introspective analysis of the stream of consciousness reveals it to be too complicated to be broken up into ideas. There are two main reasons for this: First, he notes, ideas are standardly treated as particular entities that are repeatedly revived across time: My idea of “blue” is the same entity now as it was five years ago. In contrast, James notes that the totality of our conscious state is always varied. Some of these differences come from external conditions, such as the current illumination of a blue object, or different sounds present, temperatures, and so on. Other differences come internally, including particular moods, varying emotional significance of a particular object, and previous thoughts fading away. He even suggests that organic differences in the brain, like blood flow, might influence our experience of some thought at different times.
His second concern is that consciousness does not present breaks, as one would expect when transitioning between discrete ideas. Rather, consciousness is continuous. Thoughts arise and fade, but they overlap, sometimes attitudes persist in the background, and he insists there is always a feeling present, even if some are transient and difficult to name. Thus, he prefers the term “streams of consciousness” to “trains of thought.”
The association of ideas presents a false view because conscious states are not discrete, and they are never revived in exactly the same way. Both mistakes share one major cause: the fact that we name and identify representational states by the objects that they represent. It is the common referent in the world that makes us think that the idea itself is the same each time, ignoring the nuance of particular experiences. Similarly, we focus on these ideas, ignoring the feelings that bridge them and persist through them. Thus, these problems are solved by shifting to association of objects. This, however, is just a description of the stream of consciousness, and cannot explain it.
James believes that looking at association as a brain process can explain the streams of thought while still respecting the nuances of consciousness just discussed. This claim depends on his view of habit, which he treats as a physiological, even generally physical, fact (chapter IV). Actions often repeated become easier. He explains that channels for nerve discharge become worn with use, just as a path is worn with use, or a paper creased in folding.
Thus, brain processes become associated in the sense that processes frequently repeated in sequence will tend to come in sequence. At any given moment, there are many processes operating behind a particular conscious state: Some processes will have to do with a thought we are considering, some with moods, some with emotional states, and some with ongoing perception as we think. Each of these will, in some way, contribute to the set of thoughts and feelings that come next. This, James held, could explain the various, multifaceted sequences of thought. The various feelings present are not literal “parts” of any conscious state, as in the common associationist picture of complex ideas. Even so, different feelings can potentially influence the direction of the stream of consciousness at any given point because each is attended by brain processes which are separable, and which actually direct the stream. This also allows active processes, like attention and interest, to contribute to guiding the stream of consciousness, even if they are, in effect, operating through habit.
A natural question is what determines which of the many candidate thoughts will actually come next. James discusses some factors much like Brown’s “secondary laws” above, including interest, recency, vividness, and emotional congruity. This is the question taken up by Mary Whiton Calkins.
d. Mary Whiton Calkins (1863-1930)
Mary Whiton Calkins was both the first woman president of the American Psychological Association and the first woman president of the American Philosophical Association. She was a student of James, and despite his enthusiastic support, she was refused her PhD from Harvard because of her gender. This did not prevent her from an influential career and many years as a faculty member at Wellesley College. Her description of association in her textbook (1901) largely follows James’s. However, Calkins was much more interested in experimental methods than he was.
She was particularly interested in the question, “What one of the numberless images which might conceivably follow upon the present percept or image will actually be associated with it?” (1896, pg. 32), taking this to be the key to making concrete predictions about the stream of consciousness, and even perhaps to control problematic sequences. In so doing, she targets what had elsewhere been called the secondary laws: frequency, vividness, recency, and primacy. In a paired-associate memory task, she finds frequency to be by far the most significant factor. She finds this surprising, as she takes introspection to indicate that recency and vividness are just as important. She sees this result as significant for training and correcting associative sequences.
e. Sigmund Freud (1856-1939)
Sigmund Freud’s relationship to associationism is most evident in two aspects of his work. First, Freud outlined a thoroughly associationist picture of the mind and brain in his early and unpublished Project for a Scientific Psychology (written 1895, published posthumously in 1950). Second is his invention of the method of free association.
In the Project, Freud conceives of the nervous system as a network of discrete, but contacting, neurons, through which flows a nervous energy he calls “Q.” As neurons become “cathected” (filled) with Q, they eventually discharge to the next downstream neurons. The ultimate discharge of Q results in motor movements, which is how we actually release Q energy. In the central neurons, responsible for memory and thought, there is a resistance at the contact barrier. There is no such resistance at the barriers of sensory neurons. Learning occurs because frequent movements of Q through a barrier will lower its resistance. He identifies this as association by contiguity (pg. 319). Thus, the neurophysiological picture is also a psychological picture, and these basic processes are associative.
In addition, Freud adds two other systems. First is a class of neurons that respond to the period of activity in other neurons. These are able to track which perceptions are real and which are fantasy or hallucination, because stimuli coming in through the senses have characteristic periods. Second is the ego. In this work, the ego is simply a pattern of Q levels distributed across the entire network. By maintaining this distribution, the ego prevents any one neuron or area from becoming too heavily cathected with Q, which would result in erratic thought and action because of the resulting violent discharge. The role of the ego is thus inhibitory. Together, these additional systems control the underlying associative processes in ways that allow rational thought.
Freud never published this work and abandoned most of the details. Nonetheless, it arguably previews the basic underlying theories of much of his later work (as noted by the editor of the standard edition of Freud [Vol. 1, pp. 290-292] and Kitcher 1992; see Sulloway 1979 for discussion of continental associationist influences on Freud). The thinking would go that breakdowns in rationality, as in dreaming or pathology, come when basic processes like association operate uncontrolled.
Regardless of exactly how it fits in his overall theoretical framework, his invention of the method of free association deserves note as well. Freud began using free association in the 1890s as an alternative to hypnosis. The patient would lie in a relaxed but waking state and simply discuss thoughts as they came freely to mind. The therapist would then analyze the sequence of thoughts and attempt to determine what unconscious thoughts or desires might be directing them. In later versions, patients are asked to keep in mind a starting point of interest or are presented a particular word or image to respond to. Free association was massively influential, and it remains the core psychoanalytic method (and has also been used in mapping semantic networks; see section 4). It also takes associative processes to operate in the unconscious, another view that would be revived later (see section 5).
f. G. F. Stout (1860-1944)
G. F. Stout continues the trend of criticizing associationism while allowing a significant role for association in his Manual of Psychology (1899). A prominent British philosopher and psychologist at the turn of the century, Stout taught, at different times, at Cambridge (where his students included G. E. Moore and Bertrand Russell), Aberdeen, Oxford, and St. Andrews. He accepts association as a valuable story for the reproduction of particular elements of consciousness, but he argues that there is an independent capacity for generating new elements. He specifically attacks John Stuart Mill and his analogy of mental chemistry (1899, book I, chapter III). According to Stout, Mill was right that complex ideas are not mere aggregates of simple ideas, but failed to recognize that this means that a new idea must be generated: The new idea had aggregates of associated simple ideas as precursors, not as parts—previewing the work of the Gestalt psychologists. He claims that Mill’s attempt to include the simple ideas in complex ideas as in chemical combination is a desperate attempt to save the theory from a fatal flaw.
Stout does grant association a significant part in the reproduction of ideas in the train of thought. There, as well, he provides a novel interpretation (book IV, chapter II). Specifically, he argues that association by contiguity should be rephrased as “contiguity of interest.” This means that only those elements that are interesting—at the time, based on goals, intentions, and other states—will be associated, and uninteresting elements will be dropped. He takes this to be the sole law of association. Apparent associations by similarity are in fact associations by contiguity of interest, because similar objects will have some aspects that are identical, and these aspects drive the suggestion. He also addresses the question of which of several competing associations will actually lead thought. He mentions Brown’s secondary laws as factors, but he takes the most important to be the “total mental state,” or the “general trend of psychical activity,” such that factors like intentions or background desires are usually decisive.
Finally, he argues that the process of ideational construction is active at all times and does not merely generate new ideas. It also modifies ideas as they are revived. Ideas take on new relations to other ideas. They may be seen in a different light, with different aspects emphasized based on differences in context, as well as in mental state and interests. Ideas are, in a real sense, remade as they are revived.
g. Themes and Lessons
The proliferation of interpretations of association through this era demonstrates the decline of the pure empiricist versions of the view. Nonetheless, the empiricist conception remains prominent. Authors who disavow that position still hold views substantially similar to it. Those working to refine the concept are still working from an empiricist starting point: Associations hold between conscious states, and contiguity and similarity remain the most common laws of association. Compared with the associationists described in the previous section, the diversity of views in this section differs in degree rather than in kind.
Nonetheless, these authors do not proclaim their adherence to associationism, and many expressly disavow it. Worries about the theory itself center on its atomism—treating simple ideas as discrete indivisible units that are reified in thought—and its passive, mechanical depiction of mind. More general trends include increasing knowledge in related fields such as evolutionary theory, neurophysiology, and experimental psychology. Evolutionary theory poses a challenge to associationist empiricism, as it allows a mechanism for innate ideas. Neurophysiology and experimental psychology both contributed to the fracturing of associationism, partly because progress on each came at the time from the continent, where there was less interest in a general associationist picture than in the United Kingdom. Nonetheless, each development supported a role for association. At least superficially, the network of neural connections looks a lot like the network of associated ideas. And associations make good experimental targets because they are easy to induce and test.
It does not seem that associationism must stand or fall with any of these challenges or developments singly, as there are views broadly consistent with each in the previous section. Rather, these problems persisted and accumulated at the same time as new ideas from other fields allowed researchers to step out of the old paradigm and cast about for new formulations of the old idea. The general picture, then, is of a concept losing its role as the single core concept of psychology and philosophy of mind, but nonetheless retaining several important roles. The development that finally brought this particular associationist tradition to an end, the rise of behaviorism, returned association to its central position.
3. Behaviorism (1910s-1950s)
Behaviorism arose in America as a reaction to the introspective methods that had dominated psychology to that point. Most of the authors listed above built their systems entirely from introspection. Even the experimentalists mostly recorded introspective reports, often using themselves as the only subject. The behaviorists did not see this as a reliable basis for a scientific psychology. Science, as they saw it, only succeeded when it studied public, observable phenomena that could be recorded, measured, and independently verified. Introspection is a private process, which is not independently verifiable or objectively measurable.
The result of adopting this viewpoint was a complete change in the conceptual basis of psychology, as well as in its methodology and theory. Behaviorists abandoned concepts like “ideas” and “feelings,” and the notion that the stream of consciousness was the primary phenomenon of psychology. Some even denied the phenomenon of consciousness itself. What they did not abandon, however, was the concept of association. In fact, association regained its role as the central concept of psychology, now reimagined as a relation between external stimuli and responses rather than internal conscious states. Even the law of association by contiguity was co-opted.
a. Precursors: Pavlov, Thorndike, and Morgan
Ivan Pavlov’s (1849-1936) famous work provided what would be a core phenomenon and some of the basic language of the behaviorists. Pavlov (1902) was interested in the physiology of the digestive system of dogs and the particular stimuli which elicit salivation. In the course of his studies, he observed that salivation would occur as the attendant who usually fed the animal approached. He noted a difference between “unconditional reflex,” as when salivation occurs due to a taste stimulus, and a “conditional reflex,” as when salivation occurs due to the approaching attendant (1902, pg. 84). Pavlov was able to show that a stimulus as arbitrary as a musical note or a bright color could cause salivation if paired frequently with food. He notes that the effect is only caused when the animal is hungry, and that it seems important that the unconditional reflex is tied to a basic life process. His account of the phenomenon is characteristically physiological:
It would appear as if the salivary centre, when thrown into action by the simple reflex, became a point of attraction for influences from all organs and regions of the body specifically excited by other qualities of the object. (pg. 86)
This phenomenon came to be known as “classical conditioning.” As Pavlov presciently remarks: “An immeasurably wide field for new investigation is opened up before us” (pg. 85). In subsequent work, Pavlov (1927) further explores these processes, including inhibitory processes such as extinction, conditioned inhibition, and delay.
Edward Thorndike (1874-1949) explicitly targeted the processes of association in animals (1898). He laments that existing work tells us that a cat will associate hearing the phrase “kitty kitty” with milk, but does not tell us the actual sequence of associated thoughts, or “what real mental content is present” (pp. 1-2). To test this objectively, he placed animals in a series of puzzle boxes with food visible outside. Most were cats, but he also experimented with dogs and chicks. Escape, and thus food, required unlocking the door using one or more actions such as pulling a string, pressing a lever, or depressing a paddle. If they did not escape within a certain time limit, they would be removed without food.
As Thorndike describes it, animals placed in the box first perform “instinctive” actions like clawing at the bars and attempting to squeeze through the gaps. Eventually, the animal will happen upon the actual mechanism and accidentally manipulate it. Once some action is successful, the animal will associate it with the stimulus of the inside of the box. This association gradually strengthens with repetition, as shown by learning curves of animals escaping more rapidly across successive trials; this form of learning came to be known as operant, or instrumental, conditioning. He argues that this must be explained with associations between an idea or sense impression and an impulse to a particular action, rather than the “association of ideas,” as ideas themselves are inert (pg. 71). He expresses the belief that animals have conscious ideas but remains officially agnostic, and he emphasizes that humans are not merely animals plus reason; human associations are different from animal associations as well. Thus, he arrives at the basic idea that he later restated under the name “the law of effect”:
Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond. (1911, pg. 244)
While the name “law of effect” has stuck, it is worth noting that in his dissertation (1898) and his textbook (1905 pp. 199-203), Thorndike simply calls it the “law of association.”
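The dynamics the law describes can be conveyed with a small simulation. The sketch below is a toy illustration rather than anything Thorndike himself formalized; the response names, selection rule, and rate parameters are all assumptions.

```python
import random

# A toy simulation in the spirit of the law of effect: a response followed by
# "satisfaction" (escape) is strengthened; others are slightly weakened.
# Response names and rate parameters are hypothetical illustration values.

strengths = {"claw_bars": 1.0, "squeeze_gap": 1.0, "pull_string": 1.0}

def trial():
    # choose a response with probability proportional to its current strength
    r = random.uniform(0, sum(strengths.values()))
    for response, s in strengths.items():
        r -= s
        if r <= 0:
            break
    if response == "pull_string":   # only this response opens the box
        strengths[response] += 0.5  # satisfaction strengthens the connection
    else:                           # failure slightly weakens the connection
        strengths[response] = max(0.1, strengths[response] - 0.05)

for _ in range(200):
    trial()
print(strengths)  # "pull_string" comes to dominate, mirroring the learning curves
```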
Lloyd Morgan (1852-1936) also discusses “the association of ideas” in nonhuman animals. However, his most significant contribution to the use of the concept is indirect, through a methodological principle that came to be known as his “Canon”:
In no case may we interpret an action as the outcome of the exercise of a higher psychical faculty, if it can be interpreted as the outcome of the exercise of one which stands lower in the psychological scale. (Morgan 1894, pg. 53)
The behaviorists took Morgan’s Canon to encourage positing minimal mental processes. More generally, associative processes are usually thought to be among the “lowest,” or “simplest,” processes available. This means that an associative explanation will be preferred until it can be ruled out, a practice that persists (see sections 4 and 5).
b. John B. Watson (1878-1958)
Watson rang in the behaviorist era with his paper Psychology as the Behaviorist Views It (1913). In that work, he attacks the introspective method and claims about conscious feelings or thoughts. As he develops the view (1924/1930), he says that all of psychology can be reframed in terms of stimulus and response. The connection between them is a “reflex arc” of neural connections running from the sense organ to the muscles and glands necessary for a response. Watson thus identifies each stimulus with specific physical features, and each response with specific physiological changes or movements. This came to be known, following Tolman (1932), as the “molecular” definition of behavior, distinct from the “molar” definition, which characterizes behaviors more abstractly and purposively (intentionally) rather than as patterns of specific excitations and movements.
Watson applies the same system to humans and to nonhuman animals. He takes infants to be born with only a small stock of simple reflexes, or “unconditioned” stimulus-response pairs—nothing that could properly be called instinct. These basic reflex patterns are modified by conditioning. In conditioning, the new conditioned stimulus either “replaces” the original unconditioned stimulus as a cause of the response, like the musical notes in Pavlov’s experiments, or a new response is conditioned to an existing stimulus, as when one becomes afraid of a dog that had previously been seen as friendly. As these conditioned changes compound, stimulus-response sets can be coordinated in the ways that allow sophisticated behaviors in humans. He backs this up using experiments with infants, such as his ethically fraught Little Albert experiment: Watson conditioned a fear response to a white rat in 11-month-old Albert by making a loud noise every time the rat was presented (1924/1930, pp. 158-164).
Though Watson does not cast his own view in associative terms, his stimulus-response psychology effectively places association back at the center of psychology, and offhand references to association suggest he recognizes some connection. Even setting aside the specific points that S-R connections operate like associations, and classical conditioning like association by contiguity, Watson’s behaviorism shares with associationism an empiricist, anti-nativist orientation and an ideal of basing psychology on a single principle.
c. Edward S. Robinson (1893-1937)
In Association Theory To-Day (1932), Edward S. Robinson argues that associations themselves are the same in both behaviorism and the older associationist tradition. The difference lies in what answer one gives to the question, “What is associated?” Associationism had been rejected in large part because association was taken to be a relation between mentalistic ideas. Robinson takes this to be unfair, pointing to the diversity of views among the earlier associationists. Robinson was far from the first to note the role of association in behaviorism (the earliest paper he cites as arguing along these lines is Hunter 1917; see also Guthrie 1930, discussed below), but he presents a systematic attempt to import previously existing associationist machinery into behaviorism.
An association is still an association, according to Robinson, whether it holds between ideas, stimuli and responses, or neural pathways. He adopts the generic term “psychological activities” to capture all of these, saying that association is a disposition of some activities to instigate particular others. He tentatively adopts a “molar” view of psychological activities over Watson’s molecular view because he does not think existing research has actually shown associations between particular physiological activities. Thus, he argues that the relevant activities must be described at a more abstract level. Robinson does rely on behavioral evidence but does not proclaim the behaviorist rejection of all mentalistic postulates. He takes it to be an open empirical question which activities will be associated in the most effective version of the theory.
Robinson goes on to discuss several laws of association, describing how each should be viewed and summarizing relevant experimental findings. Contiguity, the first, is apparent in conditioning. He attributes the second, assimilation, to Thorndike’s observation that a person will give the same response when presented with sufficiently similar situations (pp. 81-82). Robinson denies this is the same as association by similarity proper, but it is the same basic role Bain gives similarity. Others include frequency, duration, context, acquaintance, composition, and individual differences. He takes the actual associative strength to be a sum of all of these features, lamenting the overemphasis on contiguity itself.
d. B. F. Skinner (1904-1990)
Skinner, like Watson, does not frame his understanding of behaviorism in terms of association. Nonetheless, his work is noteworthy for placing reinforcement at the center of learning. The focus here is on his early career. Skinner studied operant conditioning using an apparatus in which a rat would press a lever to receive food. The food, in this case, reinforces the action of pressing the lever. In Skinner’s view, reinforcement is necessary for operant learning. While this basic idea was known as part of Thorndike’s law of effect, it was not widely accepted that effects could reinforce behavioral causes until Skinner’s work. He went on to study reinforcement itself, especially the effects of various schedules of reinforcement (1938).
Skinner differentiated operant conditioning from Pavlovian, or classical, conditioning based on the sequences of stimulus/response (1935). Operant conditioning requires a four-step chain involving two reflexes: from a stimulus (sight of the lever) to an action (pressing the lever), which then causes another stimulus (food, the reinforcer) to a final action (eating/salivating). In Pavlovian-style experiments, a stimulus (for example, a light) switches from triggering an arbitrary reflex (such as orienting towards the light) to triggering a reflex relevant to the reinforcer (such as salivation if food is the reinforcer). Reinforcement is necessary for both; it simply plays a different role. In associative terms, the different types of conditioning are differentiated by the structure of their associations. But this again modifies the conception of the process of association: Simple contiguity is not enough; one of the stimuli involved must also play the role of reinforcer.
Later, Skinner abandoned the stimulus-response framing of operant conditioning, arguing that the action (lever press) need not be viewed as a direct response to a stimulus (seeing the lever). To explain behavior in such a case, one must look back to the history of reinforcement, rather than any particular eliciting stimulus (1978). Skinner generally opposed private mentalistic posits, but his views on this were not always clear or consistent. He did, like Watson, treat behavior as the only legitimate target of study, retain a generally empiricist picture of mind, and take the view to apply generally. He was able to show that “shaping” techniques based on operant conditioning could train animals to complete sophisticated tasks, and he took this to apply to humans as well (1953), including with regard to language (1957) and even society (1976).
e. Edwin Guthrie (1886-1959)
Edwin Guthrie argues that the core phenomenon of conditioning is just association by contiguity, which he views as the single principle of learning. He states the principle as such: “Stimuli acting at a given instant tend to acquire some effectiveness toward the eliciting of concurrent responses, and this effectiveness tends to last indefinitely” (1930, pg. 416). He goes on to argue that various empirical phenomena of learning, including even forgetting and insight, “may all be understood as instances of a very simple and very familiar principle, the ancient principle of association by contiguity in time” (1930, pg. 428). He later builds on this conception by arguing that stimuli to which animals pay attention will become associated. He takes this to be the actual action by which reinforcers work, dissatisfied by Skinner’s seemingly circular definition of the term “reinforcer.” He presents the new version in simplified form as follows: “What is being noticed becomes a signal for what is being done” (1959, pg. 186).
Guthrie takes the focus on behavior to be an abstraction intended to make psychology empirically tractable, in the same way that physics models frictionless planes. As such, his behaviorism could be seen as less extreme than Watson’s or Skinner’s, but perhaps more so than Robinson’s.
f. Themes and Lessons
Across behaviorist views, association remains the core concept. As in the previous section, though, some authors explicitly take on the associationist mantle while others ignore it. Also as above, there is a diversity in views on the actual structure of associations, how they develop, and what is taken to be associated. Skinner (1945) captured perhaps the largest division: that between the radical behaviorists and the methodological behaviorists. This division is easily cast in terms of their views on association. The radical behaviorists, exemplified by Watson and Skinner, aim to eliminate mentalistic concepts; association can allow this, via the minimal connection between stimulus and response. The methodological behaviorists, exemplified here by Guthrie and Robinson, take the emphasis on behavior to be a methodological abstraction or simplification necessary for scientific progress. By implication, association itself is an abstract relation, which in principle can subsume various possible mechanisms, rather than excluding them.
4. After the Cognitive Revolution (1950s-2000s)
As cognitivism came to dominate in the mid-twentieth century, association took up various roles in different literatures. The rise of cognitivism brought two key changes in psychology generally. First, internal mental states returned. However, these states were generally viewed as functionally defined representational states rather than as the imagistic ideas of the empiricist associationists. Second, cognitivism views the mind in broadly computational terms. Cognitivists take many psychological processes, called “cognitive processes,” to be algorithms that operate by applying formal rules to symbolic representational states, perhaps in a manner similar to language. Cognitive processes are often contrasted with associative processes, setting up a general view in which association is one kind of psychological process among many. Association is thought to be limited, in particular, because it is too simple to account for complex, rational thought (see Dacey 2019a). Learning by contiguity cannot differentiate which experienced sequences reflect real-world relations and which are mere accidents. Associative sequences in thought do not allow flexible application; they must be rigidly followed. Thus, associative processes are usually posited in simpler systems, like nonhuman animals, or the human unconscious. However, as connectionist computational strategies began to bear fruit, some treated these as a new, revitalized form of general associationism.
This section discusses three research programs that each treat associations in different ways and collectively capture the main threads of late twentieth- and early twenty-first-century thought on association.
a. Semantic Networks
The first program represents semantic memory—memory for facts—as a network of linked concepts. Retrieval or recall of information in such a model is described by activation spreading through this network. When activation reaches some critical level, the information is retrieved and available for use or report. This program got its formal start in the late 1960s with work by Ross Quillian and Allen Collins (Collins and Quillian 1969), and subsequently John R. Anderson (1974) and Elizabeth Loftus (Collins and Loftus 1975). The general idea is that different patterns of association explain facts about information retrieval, such as when it succeeds or fails, and how long it takes. John Anderson generalized the basic idea as part of his Human Associative Memory (HAM) model (Anderson and Bower 1973) and Adaptive Control of Thought (ACT) model and its descendants (Anderson 1996). In more specific circumstances, this basic strategy has been applied to a number of phenomena in which information is accessed automatically, including cued recall, priming (McNamara 2005), word association task responses, false memory (Gallo 2013), reading comprehension (Ericsson and Kintsch 1995), creativity (Runco 2014), and implicit social bias (Fazio 2007, Gawronski and Bodenhausen 2006; see also section 5).
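The retrieval story can be made concrete with a small sketch. The network, link weights, decay, and threshold below are invented for illustration and are not drawn from any published model.

```python
# A toy spreading-activation network: activation flows from a source concept
# along weighted links; concepts whose activation crosses a threshold count
# as "retrieved." All structure and numbers here are illustrative assumptions.

network = {
    "doctor":   {"nurse": 0.8, "hospital": 0.6},
    "nurse":    {"doctor": 0.8, "needle": 0.5},
    "hospital": {"doctor": 0.6},
    "needle":   {"nurse": 0.5},
}

def spread(source, steps=2, decay=0.5, threshold=0.2):
    activation = {source: 1.0}
    for _ in range(steps):
        new = dict(activation)
        for node, act in activation.items():
            for neighbor, weight in network.get(node, {}).items():
                new[neighbor] = new.get(neighbor, 0.0) + act * weight * decay
        activation = new
    # subthreshold ("subcritical") activation can still prime later retrieval
    return {n: a for n, a in activation.items() if a >= threshold}

print(spread("doctor"))  # "nurse" and "hospital" become available; "needle" stays subcritical
```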
Spreading activation in a network manifests one side of the standard associative story. The difference from previous traditions is that associations relate concepts or propositions, and these networks usually include a possibility of subcritical activation of a concept that can facilitate later retrieval. These models rarely say anything explicitly about learning, but they sometimes carry implications for learning. Often, links are not taken to represent any particular relation, signifying only the disposition to spread activation. This is taken to indicate that the links are learned through a process like association by contiguity, which cannot encode meaningful real-world information. However, sometimes links are labeled with a meaningful relationship between concepts, which would imply a learning process capable of tracking that relation. In addition, some models that emerged out of related research, such as Latent Semantic Analysis (LSA) (Landauer and Dumais 1997) and Bound Encoding of the Aggregate Language Environment (BEAGLE) (Jones and Mewhort 2007), extract semantic information (for example, semantic similarity) about words in a linguistic corpus based on clustering patterns with other words.
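The corpus-based idea can likewise be illustrated in miniature: factor a word-by-document count matrix with a singular value decomposition and compare words in the reduced space, as LSA does. The three-sentence corpus and two retained dimensions below are arbitrary choices for illustration, not Landauer and Dumais’s actual corpus or parameters.

```python
import numpy as np

# Toy sketch of the LSA idea: SVD of a word-by-document count matrix yields
# reduced vectors in which words that share contexts end up similar.

docs = ["the doctor saw the nurse",
        "the nurse helped the doctor",
        "the boat sailed the sea"]
vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

U, S, Vt = np.linalg.svd(counts, full_matrices=False)
word_vecs = U[:, :2] * S[:2]          # keep two latent dimensions

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

d, n, s = (vocab.index(w) for w in ("doctor", "nurse", "sea"))
print(cosine(word_vecs[d], word_vecs[n]))  # high: the two words share contexts
print(cosine(word_vecs[d], word_vecs[s]))  # low: their contexts are disjoint
```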
b. Associative Learning and the Rescorla-Wagner Model
Work on learning proceeded largely separately from the work on semantic networks just described. After the cognitive revolution, conditioning effects remained a representative phenomenon of basic learning processes. They were, again, re-described. Since the associations were taken to be formed between internal mental representations, conditioning was subsumed under the heading of “contingency learning” or “associative learning”: the learning of relations between events that tend to co-occur. “Associative learning” is sometimes used in this literature to refer to this phenomenon, regardless of what mechanism is taken to produce it. In this literature, human and nonhuman animal research have long informed one another. However, the orientation of the research can depend on the subjects studied. It has long been accepted that humans have complex cognitive processes running in parallel with any simple associative processes (Shanks 2007). The question in the human literature is often whether purely associative models can explain any human learning. Research on animal minds is still heavily influenced by Morgan’s Canon (section 3.a). As a result, associative explanations have been heavily favored. Thus, the question is often whether nonhuman animals have any processes that cannot be described in associative terms.
The Rescorla-Wagner model (1972) has dominated much of this research, either by itself or through its various modifications and descendants. This model includes a “prediction” that is made when the antecedent cue is presented. Associative strength is either increased or decreased based on whether that prediction is borne out. For instance, if an animal has a strong association between a cue and a target, the animal will expect the target once the cue is presented. If the target does not follow, the associative strength is reduced. This presents a different conception of association from those encountered so far, as a prediction-error process, contrasted with the worn-path notion of contiguity and with reinforcement (Rescorla 1988; see also Danks 2014, pg. 20, arguing that the prediction itself is not usually taken realistically). It also makes the Rescorla-Wagner model more successful at predicting various phenomena in contingency learning than previous conceptions of association. For instance, it predicts the fact that existing associations can block new associations from forming (Miller, Barnet, and Graham 1995). The computational precision and simplicity of associative models like the Rescorla-Wagner model are a major draw, and they have been further supported by neural evidence of prediction-error tracking in the brain (Schultz, Dayan, and Montague 1997).
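The model itself is compact: on each trial, every presented cue’s strength changes in proportion to the prediction error, the difference between the outcome’s asymptotic strength and the summed strength of all presented cues. The sketch below demonstrates blocking; the parameter values are arbitrary illustrations, not values from Rescorla and Wagner.

```python
# Rescorla-Wagner update: dV = alpha * beta * (lam - sum of V over presented
# cues). Parameter values here are arbitrary choices for illustration.

def rw_trial(V, cues, alpha=0.3, beta=1.0, lam=1.0):
    error = lam - sum(V[c] for c in cues)  # prediction error for this trial
    for c in cues:
        V[c] += alpha * beta * error
    return V

V = {"A": 0.0, "B": 0.0}
for _ in range(50):        # Phase 1: cue A alone is paired with the outcome
    rw_trial(V, ["A"])
for _ in range(50):        # Phase 2: the compound AB precedes the same outcome
    rw_trial(V, ["A", "B"])
print(V)  # V["A"] nears 1.0 while V["B"] stays near 0.0: A blocks learning about B
```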
However, one can also complicate models like this in various ways. Some models allow interactions between existing associations during learning (Dickinson 2001). Others allow interactions between association and other processes, like attention or background knowledge (Pearce and Mackintosh 2010, Dickinson 2012, Thorwart and Livesey 2016). Finally, one can also model interference between associations at retrieval, as in the SOCR (Sometimes-Competing Retrieval) model (Stout and Miller 2007).
Even with these more complicated models, critics have argued that simple associative stories cannot capture the complexity of associative learning. For instance, some argue that the processes responsible for human associative learning must be propositional (Mitchell, De Houwer, and Lovibond 2009). Gallistel has been perhaps the most prominent opponent of associative theories of learning in animals generally, arguing that the processes responsible must be symbolic (Gallistel 1990, Gallistel and Gibbon 2002).
c. Connectionism
The arrival of connectionism as a major theory of mind in the 1980s was hailed as a revolution by many of its proponents (Rumelhart, McClelland, and PDP research group 1986). Connectionist models perform especially well in various kinds of categorization tasks. They are a kind of spreading activation model in which activation spreads through sequential layers of nodes. Though there were important precursors, especially Hebb (1949) and Rosenblatt (1962), connectionism came into its own when new techniques allowed much more computationally powerful three-layer networks. These networks include a “hidden” layer between “input” and “output” layers. The revolutionary claims of connectionism are usually based on the idea that the hidden layer represents information in a distributed manner, as a pattern of activation across multiple nodes. Thus, nodes are treated as “subrepresentational” units of information that also presumably correspond to something in the brain, such as neurons, sets or assemblies of neurons, or brain regions (Smolensky 1988). This is also thought to be a realistic view of representation in the brain, where representation is likely distributed. Unlike the other research programs discussed in this section, which take association to describe one kind of processing among many, connectionism, at least initially, purported to provide a general model of mind.
Connectionism has been treated as a version of associationism by both proponents (Bechtel and Abrahamsen 1991, Clark 1993) and opponents (Fodor and Pylyshyn 1988). This is both because it implements a kind of spreading activation and because connectionist networks are able to learn—something symbolic systems struggle with. While the emphasis on learning aligns with a generally empiricist approach, the specific mechanism matters for what, exactly, to make of this. Perhaps the most common process, backpropagation, is not usually thought to be biologically realistic. Another common process, Hebbian learning, implements a version of association by contiguity (Hebb 1949). This is treated as more biologically plausible, but models implementing it are less powerful.
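Hebbian learning itself is simple to state: when two units are active together, the weight between them grows. The pattern associator below is a minimal sketch of this idea; the patterns, sizes, and learning rate are assumptions made for illustration.

```python
import numpy as np

# A minimal Hebbian pattern associator: repeated pairings of an input pattern
# x with an output pattern y strengthen the weights between co-active units,
# so that presenting x later revives a pattern proportional to y.
# All values are illustrative assumptions.

eta = 0.5                            # learning rate
x = np.array([1.0, 0.0, 1.0, 0.0])   # "stimulus" pattern
y = np.array([0.0, 1.0, 1.0])        # "response" pattern paired with x

W = np.zeros((y.size, x.size))       # connection weights, initially zero
for _ in range(10):                  # repeated pairings (contiguity)
    W += eta * np.outer(y, x)        # Hebb: co-activity strengthens weights

print(W @ x)                         # output proportional to y: x revives y
```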
These networks modify the treatment of association by providing another set of answers to the question of what is associated. In this case, it is subrepresentational units or parts of the brain. While neural level stories have attended association throughout its history (see above sections on Hartley, Freud, and Watson; see also Sutton 1998 for discussion of similarities between connectionism and these historical views), they are usually secondary to a psychological-level story. Connectionists, in contrast, actually attempt to model neural-level phenomena.
In many networks, the number of hidden-layer nodes is chosen somewhat arbitrarily, and the network is tuned in whatever way gets the input-output mappings right. The question of what each node might represent in the brain is secondary, complicating their interpretation as actual models of the mind/brain. Arguably, later work during this period split between two approaches. Many researchers simply explore the framework as a computational tool, up to and including deep learning. These researchers are not primarily concerned with accurate modeling of brain processes, though they may view their models as “how-possibly” models (see Buckner 2018 for such a discussion of deep learning models and abstraction). Computational neuroscientists, on the other hand, generally start with neural information like single unit recordings, and model specific neural circuits, networks, or regions.
5. Ongoing Philosophical Discussion (2000s-2020s)
This section briefly surveys two debates that brought the concept of association back under philosophical scrutiny. These debates take place largely in the frameworks outlined in the last section.
a. Dual-Process Theories and Implicit Bias
One of the most philosophically important implications of early twenty-first-century work in psychology, especially social psychology, was the finding that much of our behavior is driven, or heavily influenced, by unconscious processes. Theorists generally captured these findings with Dual-Process theories, which separate the mind into two systems or processing types. Type 1 processing is fast, effortless, uncontrolled, and unconscious, while Type 2 processing is slow, effortful, controlled, and conscious. Association is often considered to be among the Type 1 processes, but Type 1 is also sometimes treated as associative in general (Kahneman 2011, Uhlmann, Poehlman, and Nosek 2012). This stronger claim is controversial (Mandelbaum 2016), but it is often implicit in discussions of unconscious processing.
The conception of association involved largely stems from the semantic network program described above. These authors, however, tend to emphasize the simplicity of associative processing, and so take on board an associative account of learning as well. Thus, at stake is not just how one thinks about the mechanisms of unconscious processing, but how those mechanisms relate to one’s agency and responsibility. It is often thought that unconscious processes cannot produce responsible action because they are associative and, as such, too inflexible to do so (Levy 2014). How one understands and applies associative models and associative processes is, as a result, significant for the conclusions one draws from this work (Dacey 2019b).
b. The Association/Cognition Distinction
The second discussion has occurred in relation to work in comparative animal psychology. In that literature, many debates center on whether the process responsible for a given behavior is associative or cognitive, with association gaining a default status due to Morgan’s Canon. As a result, associative processes are usually thought to be ubiquitous and are sometimes taken to explain even seemingly complex behavior (see Heyes 2012). Some authors have attacked this associative-versus-cognitive framing as unproductive (Buckner 2011, Smith, Couchman, and Beran 2014, Dacey 2016). It remains an empirical question whether psychological processes cluster in ways that support a distinction between associative and cognitive processes. Nonetheless, there are reasons to reframe associative models as operating at either a lower, neural level (Buckner 2017) or a higher, more abstract level (Dacey 2016). Either move would, in principle, allow associative models and cognitive models to be applied to the same process, dissolving the problematic dichotomy.
6. Conclusion
Association is one of the most enduring concepts in the history of theorizing about the mind because it is one of the most flexible and one of the most powerful. The basic phenomena seem clear and indisputable: Some thoughts follow easily in sequence, and frequency of repetition is one reason for this. The models that formalize and articulate this insight seem capable of capturing many psychological phenomena. What this means is disputed and much less clear. There are questions pertaining to the specific mechanisms behind these phenomena, how many phenomena can be explained in these terms, what the associations are, and what is associated. The various views discussed above present very different answers to these questions.
7. References and Further Reading
Anderson, J. R. (1974). Retrieval of Propositional Information from Long-Term Memory. Cognitive Psychology, 6(4), 451-474.
Anderson, J. R. (1996). ACT: A Simple Theory of Complex Cognition. American Psychologist, 51(4), 355.
Anderson, J. R., and Bower, G. H. (1973). Human Associative Memory. Washington, D. C.: V. H. Winston and Sons.
Aristotle (2001). Aristotle’s On the Soul and On Memory and Recollection. J. Sachs (Trans.). Santa Fe: Green Lion Press.
Bain, A. (1868). The Senses and the Intellect. 3rd ed. London: Longmans, Green, and Co.
Bain, A. (1887). On ‘Association’-Controversies. Mind, 12(46), 161-182.
Bechtel, W., and Abrahamsen, A. (1991). Connectionism and the Mind: Parallel Processing, Dynamics, and Evolution in Networks. Oxford: Blackwell Publishing.
Brown, T. (1820). Lectures on the Philosophy of the Human Mind. Edinburgh: W. and C. Tait.
Buckner, C. (2011). Two Approaches to the Distinction between Cognition and ‘Mere Association’. International Journal of Comparative Psychology, 24(4).
Buckner, C. (2017). Understanding Associative and Cognitive Explanations in Comparative Psychology. The Routledge Handbook of Philosophy of Animal Minds. Oxford: Routledge, 409-419.
Buckner, C. (2018). Empiricism without Magic: Transformational Abstraction in Deep Convolutional Neural Networks. Synthese, 195(12), 5339-5372.
Calkins, M. W. (1896). Association (II.). Psychological Review, 3(1), 32.
Calkins, M. W. (1901). An Introduction to Psychology. London: The Macmillan Company.
Clark, A. (1993). Associative Engines: Connectionism, Concepts, and Representational Change. Cambridge MA: MIT Press.
Collins, A. M., and Loftus, E. F. (1975). A Spreading-Activation Theory of Semantic Processing. Psychological Review, 82(6), 407.
Collins, A. M., and Quillian, M. R. (1969). Retrieval Time From Semantic Memory. Journal of Verbal Learning and Verbal Behavior, 8(2), 240-247.
Dacey, M. (2015). Associationism without Associative Links: Thomas Brown and the Associationist Project. Studies in History and Philosophy of Science Part A, 54, 31–40.
Dacey, M. (2016). Rethinking Associations in Psychology. Synthese, 193(12), 3763-3786.
Dacey, M. (2019a). Simplicity and the Meaning of Mental Association. Erkenntnis, 84(6), 1207-1228.
Dacey, M. (2019b). Association and the Mechanisms of Priming. Journal of Cognitive Science, 20(3), 281-321.
Danks, D. (2014). Unifying the Mind: Cognitive Representations as Graphical Models. Cambridge, MA: MIT Press.
Dickinson, A. (2001). Causal Learning: An Associative Analysis. The Quarterly Journal of Experimental Psychology, 54B(1), 3-25.
Dickinson, A. (2012). Associative Learning and Animal Cognition. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2733–2742.
Ebbinghaus, H. (1885/1913). Memory: A Contribution to Experimental Psychology. H. A. Ruger and C. E. Bussenius (Trans.). New York: Teachers College, Columbia University.
Ericsson, K. A., and Kintsch, W. (1995). Long-Term Working Memory. Psychological Review, 102(2), 211.
Fazio, R. (2007). Attitudes as Object-Evaluation Associations of Varying Strength. Social Cognition, 25(5), 603–637.
Fodor, J. A. (1998). Concepts: Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
Fodor, J. A., and Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28(1-2), 3-71.
Freud, S. (1953-1964). The Standard Edition of the Complete Psychological Works of Sigmund Freud (J. Strachey and A. Freud Eds.), 24 vols. London: The Hogarth Press and the Institute of Psycho-Analysis.
Includes the Project for a Scientific Psychology in Volume 1.
Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: The MIT Press.
Gallistel, C. R., and Gibbon, J. (2002). The Symbolic Foundations of Conditioned Behavior. n. p.: Psychology Press.
Gallo, D. (2013). Associative Illusions of Memory: False Memory Research in DRM and Related Tasks. n. p.: Psychology Press.
Galton, F. (1879). Psychometric Experiments. Brain, 2(2), 149-162.
Gawronski, B., and Bodenhausen, G. V. (2006). Associative and Propositional Processes in Evaluation: An Integrative Review of Implicit and Explicit Attitude Change. Psychological Bulletin, 132(5), 692.
Guthrie, E. R. (1930). Conditioning as a Principle of Learning. Psychological Review, 37(5), 412.
Guthrie, E. (1959). Association by Contiguity. In Psychology: A Study of a Science. Vol. 2: General Systematic Formulations, Learning, and Special Processes. S. Koch (ed.). New York: McGraw Hill Book Company.
Hartley, D. (1749/1966). Observations on Man. Gainesville, FL: Scholars’ Facsimiles and Reprints.
Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.
Heyes, C. (2012). Simple Minds: A Qualified Defence of Associative Learning. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2695-2703.
Hobbes, T. (1651/1991). Leviathan, R. Tuck (ed.). Cambridge: Cambridge University Press.
Hoeldtke, R. (1967). The History of Associationism and British Medical Psychology. Medical History, 11(1), 46-65.
A history of associationism focusing on psychiatric applications.
Hume, D. (1739/1978). A Treatise of Human Nature. L. A. Selby-Bigge, and P. H. Niddich (eds.), Oxford: Clarendon Press.
Hume, D. (1748/1974). Enquiries concerning Human Understanding and concerning the Principles of Morals. L. A. Selby-Bigge (ed.). Oxford: Clarendon Press.
Hunter, W. S. (1917). A Reformulation of the Law of Association. Psychological Review, 24(3), 188.
James, W. (1890/1950). The Principles of Psychology. New York: Dover Publications.
Jones, M. N., and Mewhort, D. J. (2007). Representing Word Meaning and Order Information in a Composite Holographic Lexicon. Psychological Review, 114(1), 1.
Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Kitcher, P. (1992). Freud’s Dream: A Complete Interdisciplinary Science of Mind. Cambridge, MA: MIT Press.
Landauer, T. K., and Dumais, S. T. (1997). A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. Psychological Review, 104(2), 211.
Levy, N. (2014). Consciousness and Moral Responsibility. New York: Oxford University Press.
Locke, J. (1700/1974). An Essay concerning Human Understanding. Peter H. Nidditch (ed.). Oxford: Clarendon Press.
Mandelbaum, E. (2016). Attitude, Inference, Association: On the Propositional Structure of Implicit Bias. Noûs, 50(3), 629-658.
McNamara, T. P. (2005). Semantic Priming: Perspectives from Memory and Word Recognition. n. p.: Psychology Press.
Mill, J. (1869). An Analysis of the Phenomena of the Human Mind. (A. Bain and J. S. Mill Eds.). London: Longmans, Green and Dyer.
This edition includes comments from both Alexander Bain and John Stuart Mill.
Mill, J. S. (1963-91). The Collected Works of John Stuart Mill. J. M. Robson (Gen. Ed.), 33 vols. Toronto: University of Toronto Press.
Miller, R. R., Barnet, R. C., and Grahame, N. J. (1995). Assessment of the Rescorla–Wagner Model. Psychological Bulletin, 117(3), 363–386.
Mitchell, C. J., De Houwer, J., and Lovibond, P. F. (2009). The Propositional Nature of Human Associative Learning. Behavioral and Brain Sciences, 32(2), 183-198.
Morgan, C. Lloyd. (1894). An Introduction to Comparative Psychology. London: Walter Scott.
Mortera, E. L. (2005). Reid, Stewart and the Association of Ideas. Journal of Scottish Philosophy, 3(2), 157-170.
Pavlov, I. P. (1897/1902). The Work of the Digestive Glands. W. H. Thompson (Trans.). London: Charles Griffin and Company.
Pavlov, I. P. (1927). Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex. G. V. Anrep (Trans.). London: Oxford University Press.
Pearce, J. M., and Mackintosh, N. J. (2010). Two Theories of Attention: A Review and a Possible Integration. Attention and Associative learning: From Brain to Behaviour. Oxford: Oxford University Press.
Rapaport, D. (1974). The History of the Concept of Association of Ideas. New York: International Universities Press, Inc.
This history focuses on the prehistory of the idea of association, applying the term somewhat more broadly than the authors themselves do.
Reid, T. (1872). The Works of Thomas Reid, D. D. W. Hamilton (ed.). Edinburgh: MacLachlan and Stewart.
Includes Essays on the Intellectual Powers of Man and William Hamilton’s history of association, discussed here.
Rescorla, R. A. (1988). Pavlovian Conditioning: It’s Not What You Think it Is. American Psychologist, 43(3), 151.
Rescorla, R. A., and Wagner, A. R. (1972). A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement. In A. H. Black and W. F. Prokasy (eds.), Classical Conditioning II (pp. 64–99). New York: Appleton-Century-Crofts.
Richardson, A. (2001). British Romanticism and the Science of the Mind. Cambridge: Cambridge University Press.
Robinson, E. S. (1932). Association Theory To-day: An Essay in Systematic Psychology. New York: The Century Co.
Rosenblatt, F. (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington: Spartan Books.
Rumelhart, D. E., McClelland, J. L., and PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations. Cambridge, MA: MIT Press.
Runco, M.A. (2014). Creativity: Theories and Themes: Research, Development, and Practice. Amsterdam: Academic Press.
Schultz, W., Dayan, P., and Montague, P. R. (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593-1599.
Shanks, D. R. (2007). Associationism and Cognition: Human Contingency Learning at 25. The Quarterly Journal of Experimental Psychology, 60(3), 291-309.
Skinner, B. F. (1935). Two Types of Conditioned Reflex and a Pseudo Type. Journal of General Psychology, 13(1), 66-77.
Skinner, B. F. (1938). The Behavior of Organisms. New York: Appleton-Century-Crofts, Inc.
Skinner, B. F. (1945). The Operational Analysis of Psychological Terms. Psychological Review, 52, 270-277, 291-294.
Skinner, B. F. (1953). Science and Human Behavior. London: Collier Macmillan Publishers.
Skinner, B. F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts, Inc.
Skinner, B. F. (1976). Walden Two. Indianapolis: Hackett Publishing.
Skinner, B. F. (1978). The Experimental Analysis of Behavior (A History). In B. F. Skinner (ed.), Reflections on Behaviorism and Society (pp.113-126). Englewood Cliffs, NJ: Prentice-Hall.
Smith, J. D., Couchman, J. J., and Beran, M. J. (2014). Animal Metacognition: A Tale of Two Comparative Psychologies. Journal of Comparative Psychology, 128(2), 115.
Smolensky, P. (1988). On the Proper Treatment of Connectionism. Behavioral and Brain Sciences, 11(1), 1-23.
Spencer, H. (1898). Principles of Psychology Vol 1. New York: D. Appleton and Company.
The substantially revised 3rd edition was first published in 1880 and also serves as Volume 4 of his System of Synthetic Philosophy.
Stewart, D. (1855). Philosophical Essays. In W. Hamilton (ed.), The Collected Works of Dugald Stewart (Vol. V). Edinburgh: Thomas Constable and Co.
Stout, G. F. (1899). A Manual of Psychology. New York: University Correspondence College Press.
Stout, S. C., and Miller, R. R. (2007). Sometimes-Competing Retrieval (SOCR): A Formalization of the Comparator Hypothesis. Psychological Review, 114(3), 759.
Sulloway, F. J. (1979). Freud, Biologist of the Mind: Beyond the Psychoanalytic Legend. New York: Basic Books, Inc.
Sutton, J. (1998). Philosophy and Memory Traces: Descartes to Connectionism. Cambridge: Cambridge University Press.
Tabb, K. (2019). Locke on Enthusiasm and the Association of Ideas. Oxford Studies in Early Modern Philosophy Vol 9. DOI: 10.1093/oso/9780198852452.003.0003
Thorndike, E. L. (1898). Animal Intelligence: An Experimental Study of the Associative Processes in Animals. Psychological Monographs: General and Applied, 2(4), i-109.
Thorndike, E. L. (1905). The Elements of Psychology. New York: A. G. Seiler.
Thorndike, E. L. (1911). Animal Intelligence: Experimental Studies. New York: The Macmillan Company.
Thorwart, A., and Livesey, E. J. (2016). Three Ways that Non-Associative Knowledge May Affect Associative Learning Processes. Frontiers in Psychology, 7, 2024.
Tolman, E. C. (1932/1967). Purposive Behavior in Animals and Men. New York: Irvington Publishers, Inc.
Uhlmann, E. L., Poehlman, T. A., and Nosek, B. (2012). Automatic Associations: Personal Attitudes or Cultural Knowledge? In Jon D. Hanson (ed.), Ideology, Psychology, and Law. New York: Oxford University Press, 228-260.
Warren, H. C. (1916). Mental Association from Plato to Hume. Psychological Review, 23(3), 208.
Warren, H. C. (1928) A History of the Association Psychology. New York: Charles Scribner’s Sons.
The most complete history of associationism in existence, covering the period up to its publication. Includes more detail on views of most authors covered here, and many others.
Watson, J. B. (1913). Psychology as the Behaviorist Views it. Psychological Review, 20(2), 158.
Watson, J. B. (1924/1930). Behaviorism. Chicago: The University of Chicago Press.
Wundt, W. (1901/1902). Outlines of Psychology, 4th ed. C. H. Judd (Trans.). Leipzig: Wilhelm Engelmann.
Wundt, W. (1911/1912). An Introduction to Psychology. R. Pintner (Trans.). London: George Allen and Company.
Young, R. M. (1970). Mind, Brain and Adaptation in the Nineteenth Century: Cerebral Localization and Its Biological Context from Gall and Ferrier. Oxford: Clarendon Press.
Climate change is one of the defining challenges of the 21st century. But what is climate change, how do we know about it, and how should we react to it? This article summarizes the main conceptual issues and questions in the foundations of climate science, as well as of the parts of decision theory and economics that have been brought to bear on issues of climate in the wake of public discussions about an appropriate reaction to climate change.
We begin with a discussion of how to define climate. Even though “climate” and “climate change” have become ubiquitous terms, both in the popular media and in academic discourse, the correct definitions of both notions are hotly debated topics. We review different approaches and discuss their pros and cons. Climate models play an important role in many parts of climate science. We introduce different kinds of climate models and discuss their uses in detection and attribution, roughly the tasks of establishing that the climate of the Earth has changed and of identifying specific factors that cause these changes. The use of models in the study of climate change raises the question of how well-confirmed these models are and of what their predictive capabilities are. All this is subject to considerable debate, and we discuss a number of different positions. A recurring theme in discussions about climate models is uncertainty. But what is uncertainty and what kinds of uncertainties are there? We discuss different attempts to classify uncertainty and to pinpoint their sources. After these science-oriented topics, we turn to decision theory. Climate change raises difficult questions such as: What is the appropriate reaction to climate change? How much should we mitigate? To what extent should we adapt? What form should adaptation take? We discuss the framing of climate decision problems and then offer an examination of alternative decision rules in the context of climate decisions.
1. Introduction
Climate science is an umbrella term referring to scientific disciplines studying aspects of the Earth’s climate. It includes, among others, parts of atmospheric science, oceanography, and glaciology. In the wake of public discussions about an appropriate reaction to climate change, parts of decision theory and economics have also been brought to bear on issues of climate. Contributions from these disciplines that can be considered part of the application of climate science fall under the scope of this article. At the heart of the philosophy of climate science lies a reflection on the methodology used to reach various conclusions about how the climate may evolve and what we should do about it. The philosophy of climate science is a new sub-discipline of the philosophy of science that began to crystalize at the turn of the 21st century when philosophers of science started having a closer look at methods used in climate modelling. It comprises a reflection on almost all aspects of climate science, including observation and data, methods of detection and attribution, model ensembles, and decision-making under uncertainty. Since the devil is always in the detail, the philosophy of climate science operates in close contact with science itself and pays careful attention to the scientific details. For this reason, there is no clear separation between climate science and the philosophy thereof, and conferences in the field are often attended by both scientists and philosophers.
This article summarizes the main problems and questions in the foundations of climate science. Section 2 presents the problem of defining climate. Section 3 introduces climate models. Section 4 discusses the problem of detecting and attributing climate change. Section 5 examines the confirmation of climate models and the limits of predictability. Section 6 reviews classifications of uncertainty and the use of model ensembles. Section 7 turns to decision theory and discusses the framing of climate decision problems. Section 8 introduces alternative decision rules. Section 9 offers a few conclusions.
Two qualifications are in order. First, we review issues and questions that arise in connection with climate science from a philosophy of science perspective, and with special focus on epistemological and decision-theoretic problems. Needless to say, this is not the only perspective. Much can be said about climate science from other points of view, most notably science studies, sociology of science, political theory, and ethics. For want of space, we cannot review contributions from these fields.
Second, to guard against possible misunderstandings, it ought to be pointed out that engaging in a critical philosophical reflection on the aims and methods of climate science is in no way tantamount to adopting a position known as climate scepticism. Climate sceptics are a heterogeneous group of people who do not accept the results of ‘mainstream’ climate science, encompassing a broad spectrum from those who flat out deny the basic physics of the greenhouse effect (and the influence of human activities on the world’s climate) to a small minority who actively engage in scientific research and debate and reach conclusions at the lowest end of climate impacts. Critical philosophy of science is not the handmaiden of climate scepticism; nor are philosophers ipso facto climate sceptics. So, it should be stressed here that we do not endorse climate scepticism. We aim to understand how climate science works, reflect on its methods, and understand the questions that it raises.
2. Defining Climate and Climate Change
Climate talk is ubiquitous in the popular media as well as in academic discourse, and climate change has become a familiar topic. This veils the fact that climate is a complex concept and that the correct definitions of climate and climate change are a matter of controversy. To gain an understanding of the notion of climate, it is important to distinguish it from weather. Intuitively speaking, the weather at a particular place and a particular time is the state of the atmosphere at that place and at the given time. For instance, the weather in central London at 2 pm on 1 January 2015 can be characterised by saying that the temperature is 12 degrees centigrade, the humidity is 65%, and so forth. By contrast, climate is an aggregate of weather conditions: it is a distribution of particular variables (called the climate variables) arising for a particular configuration of the climate system.
The question is how to make this basic idea precise, and this is where different approaches diverge. 21st-century approaches to defining climate can be divided into two groups: those that define climate as a distribution over time, and those that define climate as an ensemble distribution. The climate variables in both approaches include those that describe the state of the atmosphere and the ocean, and sometimes also variables describing the state of glaciers and ice sheets [IPCC 2013].
Distribution over time. The state of the Earth depends on external conditions of the system such as the amount of energy received from the sun and volcanic activity. Assume that there is a period of time over which the external conditions are relatively stable in that they exhibit small fluctuations around a constant mean value c. One can then define the climate over this time period as the distribution of the climate variables over that period under constant external conditions c [for example, Lorenz 1995]. Climate change then amounts to successive time periods being characterised by different distributions. However, in reality the external conditions are not constant and even when there are just slight fluctuations around c, the resulting distributions may be very different. Hence this definition is unsatisfactory [Werndl 2015].
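One rough way to formalise this definition (the notation here is ours, not Lorenz’s or Werndl’s) is as the fraction of time a climate variable x(t) spends in a given set of values A during the period [t1, t2], under constant external conditions c:

```latex
C_{[t_1,t_2]}(A) \;=\; \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \mathbf{1}_{A}\bigl(x_c(t)\bigr)\, dt
```

Here x_c(t) is the trajectory of the climate variables under constant conditions c and 1_A is the indicator function of A; climate change then consists in successive periods exhibiting different distributions.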
This problem can be avoided by defining climate as the empirically observed distribution over a specific period of time, where external conditions are allowed to vary. Again, climate change amounts to different distributions for successive time periods. This definition is popular because it is easy to estimate from the observations, for example, from the statistics taken over thirty years that are published by the World Meteorological Organisation [Hulme et al. 2009]. A major problem of this definition can be illustrated by an example in which, in the middle of a period of time, the Earth is hit by a meteorite and becomes a much colder place. Clearly, the climates before and after the impact differ. Yet this definition has no resources to recognize this because all it says is that climate is a distribution arising over a specific time period.
To circumvent this problem, Werndl [2015] introduces the idea of regimes of varying external conditions and suggests defining climate as the distribution over time of the climate variables arising under a specific regime of varying external conditions. The challenge for this account is to spell out what exactly is meant by a regime of varying external conditions.
Ensemble Distribution. An ensemble of climate systems (not to be confused with a model ensemble) is an infinite collection of virtual copies of the climate system. Consider the sub-ensemble of these that satisfy the condition that the present values of the climate variables lie in a specific interval around the values measured in the actual climate system (that is, the values compatible with the measurement accuracy). Now assume again that there is a period of time over which the external conditions are relatively stable in that they exhibit small fluctuations around a constant mean value c. Then the climate at future time t is defined as the distribution of values of the climate variables that arises when all systems in the ensemble evolve from now to t under constant external conditions c [for example, Lorenz 1995]. In other words, the climate in the future is the distribution of the climate variables over all possible climates that are consistent with current observations under the assumption of constant external conditions c.
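Schematically (again in our own notation, not the authors’), the ensemble definition replaces the average over time with an average over the virtual copies:

```latex
C_t(A) \;=\; \mu\bigl(\{\, e \in E_{\varepsilon} \;:\; x_e(t) \in A \,\}\bigr)
```

where E_ε is the sub-ensemble of copies whose present state agrees with the measured state to within accuracy ε, x_e(t) is the state of copy e evolved to time t under constant conditions c, and μ is a measure over the ensemble. The explicit dependence on ε is what gives rise to the first objection discussed below.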
As we have seen previously, in reality, external conditions are not constant and even small fluctuations around a mean value can lead to different distributions [Werndl 2015]. This worry can be addressed by tracing the development of the initial condition ensemble under actual external conditions. The climate at future time t then is the distribution of the climate variables that arises when the initial conditions ensemble is evolved forward for the actual path taken by the external conditions [for example, Daron and Stainforth 2013].
This definition faces a number of conceptual challenges. First, it makes the world’s climate dependent on our knowledge (via measurement accuracy), but this is counterintuitive because we think of climate as something objective that is independent of our knowledge. Second, the above definition is a definition of future climate, and it is difficult to see how the present and past climate should be defined. Yet without a notion of the present and past climate one cannot define climate change. A third problem is that ensemble distributions (and thus climate) do not relate in a straightforward way to the past time series of observations of the actual Earth and this would imply that the climate cannot be estimated from them [compare, Werndl 2015].
These considerations show that defining climate is nontrivial and there is no generally accepted or uncontroversial definition of climate.
3. Climate Models
A climate model is a representation of particular aspects of the climate system. One of the simplest climate models is an energy-balance model, which treats the Earth as a flat surface with one layer of atmosphere above it. It is based on the simple principle that in equilibrium the incoming and outgoing radiation must be equal (see Dessler [2011], Chapters 3-6, for a discussion of such models). This model can be refined by dividing the Earth into zones, allowing energy transfer between zones, or describing a vertical profile of the atmospheric characteristics. Despite their simplicity, these models provide a good qualitative understanding of the greenhouse effect.
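The core calculation of such a model is short enough to state in full. The following sketch (illustrative constants; not a research model) solves the balance S0 * (1 - albedo) / 4 = emissivity * sigma * T^4 for the equilibrium temperature, showing qualitatively how an atmosphere that traps outgoing radiation (an effective emissivity below 1) warms the surface:

```python
# Zero-dimensional energy-balance sketch: in equilibrium, absorbed solar
# radiation equals outgoing thermal radiation,
#   S0 * (1 - ALBEDO) / 4 = emissivity * SIGMA * T**4.
SIGMA = 5.670e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
S0 = 1361.0         # solar constant, W m^-2
ALBEDO = 0.3        # fraction of sunlight reflected back to space

def equilibrium_temperature(emissivity):
    # Solve the balance equation for the surface temperature T (in kelvin).
    return (S0 * (1 - ALBEDO) / (4 * emissivity * SIGMA)) ** 0.25

# Without a radiation-trapping atmosphere (emissivity = 1) the Earth would
# sit near 255 K; an effective emissivity below 1 raises the equilibrium
# towards the observed mean of roughly 288 K.
print(equilibrium_temperature(1.0))    # approximately 255 K
print(equilibrium_temperature(0.61))   # approximately 288 K
```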
Modern climate science aims to construct models that integrate as much as possible of the known science (for an introduction to climate modelling see [McGuffie and Henderson-Sellers 2005]). Typically, this is done by dividing the Earth (both the atmosphere and ocean) into grid cells. In 2020, global climate models have a horizontal grid scale of around 150 km. Climatic processes can then be conceptualised as flows of physical quantities such as heat or vapour from one cell to another. These flows are mathematically described by equations, which form the ‘dynamical core’ of a general circulation model (GCM). The equations typically are intractable with analytical methods, and powerful supercomputers are used to solve them numerically. For this reason, such models are often referred to as ‘simulation models’. To solve the equations numerically, time is discretised. Current state-of-the-art simulations use time steps of approximately 30 minutes, taking weeks or months in real time on supercomputers to simulate a century of climate evolution.
In order to compute a single hypothetical evolution of the climate system (a ‘model run’), we also require an initial condition and boundary conditions. The former is a mathematical description of the state of the climate system (projected into the model’s own domain) at the beginning of the period being simulated. The latter are values for any variables which affect the system, but which are not directly output by the calculations. These include, for instance, the concentration of greenhouse gases, the amount of aerosols in the atmosphere at a given time, and the amount of solar radiation received by the Earth. Since these are drivers of climatic change, they are often referred to as external forcings or external conditions.
Where processes occur on a smaller scale than the grid, they may be included via parameterisation, whereby the net effect of the process is separately calculated as a function of the grid variables. For instance, cloud formation is a physical process that cannot be directly simulated because typical clouds are much smaller than the grid. So, the net effect of clouds is usually parameterised (as a function of temperature, humidity, and so forth) in each grid cell and fed back into the calculation. Sub-grid processes are one of the main sources of uncertainty in climate models.
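A toy example conveys the idea. The following sketch (loosely inspired by humidity-based cloud schemes, with invented names and thresholds) diagnoses sub-grid cloud cover from a resolved grid-cell variable:

```python
def cloud_fraction(relative_humidity, critical_humidity=0.8):
    # Diagnose sub-grid cloud cover from a resolved grid-cell variable:
    # no cloud below a critical humidity, then cover rising smoothly to 1
    # as the cell approaches saturation.
    if relative_humidity <= critical_humidity:
        return 0.0
    return min(1.0, ((relative_humidity - critical_humidity)
                     / (1.0 - critical_humidity)) ** 2)

# The diagnosed cover would then feed back into the cell's radiation budget.
for rh in (0.5, 0.85, 0.95, 1.0):
    print(rh, cloud_fraction(rh))
```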
There are now dozens of global climate models under continuous development by national modelling centres like NASA, the UK Met Office, and the Beijing Climate Center, as well as by smaller institutions. An exact count is difficult because many modelling centres maintain multiple versions based on the same foundation. As an indication, in 2020 there were 89 model-versions submitted to CMIP6 (Coupled Model Intercomparison Project phase 6), from 35 modelling groups, though not all of these should be thought of as being “independent” models since assumptions and algorithms are often shared between institutions. In order to be able to compare outputs of these different models, the Coupled Model Intercomparison Project (CMIP) defines a suite of standard experiments to be run for each climate model. One standard experiment is to run each model using the historical forcings experienced during the twentieth century so that the output can be directly compared against real climate system data.
Climate models are used in many places in climate science, and their use gives rise to important questions. These questions are discussed in the next three sections.
4. Detection and Attribution of Climate Change
Every empirical study of climate has to begin by observing the climate. Meteorological observatories measure a number of variables, such as air temperature near the surface of the Earth, using instruments like thermometers. However, more or less systematic observations are available only from about 1750 onward, and hence to reconstruct the climate before then scientists have to rely on proxy data: data for climate variables that are derived from observing other natural phenomena such as tree rings, ice cores, and ocean sediments.
The use of proxy data raises a number of methodological problems centred around the statistical processing of such data, which are often sparse, highly uncertain, and several inferential steps away from the climate variable of interest. These issues were at the heart of what has become known as the Hockey Stick controversy, which broke at the turn of the century in connection with a proxy-based reconstruction of the Northern Hemisphere temperature record [Mann, Bradley and Hughes, 1998]. The sceptics pursued two lines of argument. They cast doubt on the reliability of the available data, and they argued that the methods used to process the data are such that they would produce a hockey-stick-shaped curve from almost any data [for example, McIntyre and McKitrick 2003]. The papers published by the sceptics raised important issues and stimulated further research, but they were found to contain serious flaws undermining their conclusions. There are now more than two dozen reconstructions of this temperature record using various statistical methods and proxy data sources. Although there is indeed a wide range of plausible past temperatures, due to the constraints of the data and methods, these studies do robustly support the consensus that, over the past 1400 years, temperatures during the late 20th century are likely to have been the warmest [Frank et al. 2010].
Do rising temperatures indicate that there is climate change, and if so, can the change be attributed to human action? These two problems are known as the problems of detection and attribution. The Intergovernmental Panel on Climate Change (IPCC) defines these as follows:
Detection of change is defined as the process of demonstrating that climate or a system affected by climate has changed in some defined statistical sense without providing a reason for that change. An identified change is detected in observations if its likelihood of occurrence by chance due to internal variability alone is determined to be small […]. Attribution is defined as ‘the process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence.’ [IPCC 2013]
These definitions raise a host of issues. The root cause of the difficulties is the clause that climate change has been detected only if an observed change in the climate is unlikely to be due to internal variability. Internal variability is the phenomenon that the values of climate variables such as temperature and precipitation would change over time due to the internal dynamics of the climate system even in the absence of a change in external conditions, because of fluctuations in the frequency of storms, ocean currents, and so on.
Taken at face value, this definition of detection has the consequence that there cannot be internal climate change. The ice ages, for instance, would not count as climate change if they occurred because of internal variability. This is not only at odds with basic intuitions about climate and with the most common definitions of climate as a finite distribution over a relatively short time period (where internal climate change is possible); it also leads to difficulties with attribution: if detected climate change is ipso facto change not due to internal variability, then from the very beginning it is excluded that particular factors (namely, internal climate dynamics) can lead to a change in the climate, which seems to be an unfortunate conclusion.
For the case of the ice ages, many researchers would stress that internal variability is different from natural variability. Since, say, orbital changes explain the ice ages, and orbital changes are natural but external, this is a case of external climate change. While this move solves some of the problems, there remains the problem that there is no generally accepted way to separate internal and external factors, and the same factor is sometimes classified as internal and sometimes as external. For instance, glaciation processes are sometimes treated as internal factors and sometimes as prescribed external factors. Likewise, sometimes the biosphere is treated as an external factor, but sometimes it is also internally modelled and treated as an internal factor. One could even go so far as to ask whether human activity is an external forcing on the climate system or an internally-generated Earth system process. Research studies usually treat human activity as an external forcing, but it could consistently be argued that human activities are an internal dynamical process. The appropriate definition simply depends on the research question of interest. For a discussion of these issues, see Katzav and Parker [2018].
The effects of internal variability are present on all timescales, from the sub-daily fluctuations experienced as weather to the long-term changes due to cycles of glaciation. Since internal variability stems from processes in a highly complex nonlinear system, it is also unlikely that the statistical properties of internal variability are constant over time, which further compounds methodological difficulties. State-of-the-art climate models run with constant forcing show significant disagreements both on the magnitude of internal variability and the timescale of variations. (On http://www.climate-lab-book.ac.uk/2013/variable-variability/#more-1321 the reader finds a plot showing the internal variability of all CMIP5 models. The plot indicates that models exhibit significantly different internal variability, leaving considerable uncertainty.) The model must be deemed to simulate pre-industrial climate (including variability) sufficiently well before it can be used for such detection and attribution studies, but we do not have thousands of years of detailed observations upon which to base that judgement. Estimates of internal variability in the climate system are produced from climate models themselves [Hegerl et al. 2010], leading to potential circularity. This underscores the difficulties in making attribution statements based on the above definition, which recognises an observed change as climate change only if is unlikely to be due to internal variability.
Since the IPCC’s definitions are widely used by climate scientists, the discussion about detection and attribution in the remainder of this section is based on these definitions. Detection relies on statistical tests, and detection studies are often phrased in terms of the likelihood of a particular event or sequence of events happening in the absence of climate change. In practice, the challenge is to define an appropriate null hypothesis (the expected behaviour of the system in the absence of changing external influences), against which the observed outcomes can be tested. Because the climate system is a dynamical system with processes and feedbacks operating on all scales, this is a non-trivial exercise. An indication of the importance of the null hypothesis is given by the results of Cohn and Lins [2005], who compare the same data against alternate null hypotheses, with results differing by 25 orders of magnitude of significance! This does not in itself show that either null is more appropriate, but it demonstrates the sensitivity of the result to the null hypothesis chosen. This, in turn, underscores the importance of the choice of null hypothesis and the difficulty of making any such choice if the underlying processes are poorly understood.
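The effect can be reproduced in miniature. In the following sketch (all numbers invented for illustration), the same weak trend is tested against two null hypotheses: uncorrelated year-to-year variability, and a persistent AR(1) process. Because persistent processes wander, large apparent trends arise by chance far more often under the second null, and the significance of the observed trend changes accordingly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100  # a hypothetical 100-year record

def trend(series):
    # Least-squares slope of a time series.
    return np.polyfit(np.arange(len(series)), series, 1)[0]

# Pseudo-observations: a weak warming trend plus year-to-year noise.
observed = 0.01 * np.arange(n) + rng.normal(0, 0.3, n)
observed_trend = trend(observed)

def p_value(null_simulator, n_sims=2000):
    # Fraction of simulated null records whose trend is at least as large.
    sims = [trend(null_simulator()) for _ in range(n_sims)]
    return np.mean(np.abs(sims) >= abs(observed_trend))

# Null 1: uncorrelated internal variability (white noise).
def white_noise():
    return rng.normal(0, 0.3, n)

# Null 2: persistent internal variability (an AR(1) process). Persistent
# processes wander, so large apparent trends arise by chance more often.
def ar1(phi=0.9, sigma=0.3):
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, sigma)
    return x

print("p under the white-noise null:", p_value(white_noise))
print("p under the AR(1) null:", p_value(ar1))
```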
In practice, the best available null hypothesis is often the best available model of the behaviour of the climate system, including internal variability, which for most climate variables usually means a state of the art GCM. This model is then used to perform long control runs with constant forcings in order to quantify the internal variability of the model (see discussion above). Climate change is then said to have been detected when the observed values fall outside a predefined range of the internal variability of the model. The difficulty with this method is that there is no single “best” model to choose: many such models exist, they are similarly well developed, but, as noted above, they have appreciably different patterns of internal variability.
The differences between models are relatively unimportant for the clearest detection results such as recent increases in global mean temperature. Here, as stressed by Parker [2010], detection is robust across different models (for a discussion of robustness see Section 6), and, moreover, there is a variety of different pieces of evidence all pointing to the conclusion that the global mean temperature has increased beyond that which can be attributed to internal variability. However, the issues of which null hypothesis to use and how to quantify internal variability can be important for the detection of subtler local climate change.
If climate change has been detected, then the question of attribution arises. This might be an attribution of any particular change (either a direct climatic change such as increased global mean temperature, or an impact such as the area burnt by forest fires) to any identified cause (such as increased CO2 in the atmosphere, volcanic eruptions, or human population density). Where an impact is considered, a two-step or multi-step approach may be appropriate. An example of this, taken from the IPCC Good Practice Guidance paper [Hegerl et al. 2010], is the attribution of coral reef calcification impacts to rising CO2 levels, in which an intermediate stage is used by first attributing changes in the carbonate ion concentration to rising CO2 levels, then attributing calcification processes to changes in the carbonate ion concentration. This also illustrates the need for a clear understanding of the physical mechanisms involved, in order to perform a reliable multi-step attribution in the presence of many potential confounding factors.
In the interpretation of attribution results, in particular those framed as a question of whether human activity has influenced a particular climatic change or event, there is a tendency to focus on whether the confidence interval of the estimated anthropogenic effect crosses zero. The absence of such a crossing indicates that the change is likely to be due to human factors. This results in conservative attribution statements, but it reflects the focus of the present debate where, in the eyes of the public and media, “attribution” is often understood as confidence in ruling out non-human factors, rather than as giving a best estimate of the relative contributions of different factors.
Statistical analysis quantifies the strength of the relationship, given the simplifying assumptions of the attribution framework, but the level of confidence in the simplifying assumptions must be assessed outside that framework. This level of confidence is standardised by the IPCC into discrete (though subjective) categories (“very high”, “high”, and so forth), which aim to take account of the process knowledge, data limitations, adequacy of models used, and the presence of potential confounding factors. The conclusion that is reached will then have a form similar to the IPCC’s headline attribution statement:
It is extremely likely [≥95% probability] that more than half of the observed increase in global average surface temperature from 1951 to 2010 was caused by the anthropogenic increase in greenhouse gas concentrations and other anthropogenic forcings together. [IPCC 2013; Summary for Policymakers, section D.3].
One attribution method is optimal fingerprinting. The method seeks to define a spatio-temporal pattern of change (fingerprint) associated with each potential driver (such as the effect of greenhouse gases or of changes in solar radiation), normalised relative to the internal variability, and then perform a statistical regression of observed data with respect to linear combinations of these patterns. The residual variability after observations have been attributed to each factor should then be consistent with the internal variability; if not, this suggests that an important source of variability remains unaccounted for. Parker [2010] notes that fingerprint studies rely on several assumptions. Chief among them is linearity, that is, that the response of the climate system when several forcing factors are present is equal to a linear combination of the effects of the forcings. Because the climate system is nonlinear, this is clearly a source of methodological difficulty, although for global-scale responses (in contrast to regional-scale responses) additivity has been shown to be a good approximation.
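A stripped-down version of the idea looks as follows (hypothetical fingerprints and data; real studies weight by an estimate of internal variability rather than using plain least squares):

```python
import numpy as np

rng = np.random.default_rng(2)
n_points = 200  # space-time sample points

# Two hypothetical fingerprints: a smooth greenhouse-gas response pattern
# and an oscillating solar response pattern.
ghg = np.linspace(0.0, 1.0, n_points)
solar = 0.2 * np.sin(np.linspace(0.0, 8.0 * np.pi, n_points))
X = np.column_stack([ghg, solar])

# Pseudo-observations: a linear combination of the fingerprints plus noise
# standing in for internal variability.
true_scaling = np.array([0.8, 0.3])
y = X @ true_scaling + rng.normal(0, 0.1, n_points)

# Regress the observations onto the fingerprints.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residual = y - X @ beta_hat

# beta_hat estimates how strongly each fingerprint is present in the data;
# the residual should be consistent with internal variability, otherwise an
# important source of variability remains unaccounted for.
print("estimated scaling factors:", beta_hat)
print("residual standard deviation:", residual.std())
```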
Levels of confidence in these attribution statements are primarily dependent on physical understanding of the processes involved. Where there is a clear, simple, well-understood mechanism, there should be greater confidence in the statistical result; where the mechanisms are loose, multi-factored or multi-step, or where a complex model is used as an intermediary, confidence is correspondingly lower. The Guidance Paper cautions that,
Where models are used in attribution, a model’s ability to properly represent the relevant causal link should be assessed. This should include an assessment of model biases and the model’s ability to capture the relevant processes and scales of interest. [Hegerl et al. 2010, 5]
As Parker [2010] argues, there is also higher confidence in attribution results when the results are robust and there is a variety of evidence. For instance, the finding that late twentieth-century temperature increase was mainly caused by greenhouse gas forcing is found to be robust given a wide range of different models, different analysis techniques, and different forcings; and there is a variety of evidence all of which supports this claim. Thus our confidence that greenhouse gases explain global warming is high. (For further useful extended discussion of detection and attribution methods in climate science, see pages 872-878 of IPCC [2013] and in the Good Practice Guidance paper by Hegerl et al. [2010], and for a discussion of how such hypotheses are tested see Katzav [2013].)
In addition to the large-scale attribution of climate change, attribution of the degree to which individual weather events have become either more likely or more extreme as a result of increasing atmospheric greenhouse gas concentrations is now common. It attracts particular public interest because it is perceived as a way to communicate that climate impacts are happening already, to quantify risk numerically (for example, for pricing insurance), and to motivate climate mitigation. There is therefore also an incentive to conduct these studies quickly, to inform timely news articles, and some groups have formed to respond quickly to reports of extreme weather and conduct attribution studies immediately. This practice relies on the availability of data, may suffer from unclear definitions of exactly what category of event is being analysed, and is open to criticism for publicity prior to peer review. There are also statistical implications of choosing to analyse only those events which have happened and not those that did not happen. For a discussion of event attribution see Lloyd and Oreskes [2019] and Lusk [2017].
5. Confirmation and Predictive Power
Two questions arise in connection with models: how are models confirmed and what is their predictive power? Confirmation concerns the question of whether, and to what degree, a specific model is supported by the data. Lloyd [2009] argues that many climate models are confirmed by past data. Parker [2009] objects to this claim. She argues that the idea that climate models per se are confirmed cannot be seriously entertained because all climate models are known to be wrong and empirically inadequate. Parker urges a shift in thinking from confirmation to adequacy for purpose: models can only be found to be adequate for specific purposes, but they cannot be confirmed wholesale. For example, one might claim that a particular climate model adequately predicts the global temperature increase that will occur by 2100 (when run from particular initial conditions and relative to a particular emission scenario). Yet, at the same time, one might hold that the predictions of global mean precipitation by 2100 by the same model cannot be trusted.
Katzav [2014] cautions that adequacy for purpose assessments are of limited use. He claims that these assessments are typically unachievable because it is far from clear which of the model’s observable implications can possibly be used to show that the model is adequate for the purpose. Instead, he argues that climate models can at best be confirmed as providing a range of possible futures. Katzav is right to stress that adequacy for purpose assessments are more difficult than they appear at first sight. But the methodology of adequacy for purpose cannot be dismissed wholesale; in fact, it is used successfully across the sciences (for example, when ideal gas models are confirmed to be useful for particular purposes). Whether or not adequacy for purpose assessment is possible depends on the case at hand.
If one finds that one model predicts specific variables well and another model does not, then one would like to know why the first model is successful and the second is not. Lenhard and Winsberg [2010] argue that this is often very difficult, if not impossible: for complex climate models, a strong version of confirmation holism makes it impossible to tell where the failures and successes of such models lie. In particular, they claim that it is impossible to assess the merits and problems of sub-models and the parts of models. There is a question, though, whether this confirmation holism affects all models and whether it is here to stay. Complex models have different modules for the atmosphere, the ocean, and ice. These modules can be run individually and also together. The aim of the many new Model Intercomparison Projects (MIPs) is, by comparing individual and combined runs, to obtain an understanding of the performance and physical merits of separate modules, which it is hoped will identify areas for improvement and eventually result in better performance of the entire model.
Another problem concerns the use of data in the construction of models. The values of model parameters are often estimated using observations, a process known as calibration. For example, the magnitude of the aerosol forcing is sometimes estimated from data. When data have been used for calibration, the question arises whether the same data can be used again to confirm the model. If data are used for confirmation that have not already been used for calibration, they are use-novel. If data are used for both calibration and confirmation, this is referred to as double-counting.
Scientists and philosophers alike have argued that double-counting is illegitimate and that data have to be use-novel to be confirmatory [Lloyd 2010; Shackley et al. 1998; Worrall 2010]. Steele and Werndl [2013] oppose this conclusion and argue that on Bayesian and relative-likelihood accounts of confirmation double-counting is legitimate. Furthermore, Steele and Werndl [2015] argue that model selection theory presents a more nuanced picture of the use of data than the commonly endorsed positions. Frisch [2015] cautions that Bayesian as well as other inductive logics can be applied in better and worse ways to real problems such as climate prediction. Nothing in the logic prevents facts from being misinterpreted and their confirmatory power exaggerated (as in ‘the problem of old evidence’ which Frisch [2015] discusses). This is certainly a point worth emphasising. Indeed, Steele and Werndl [2013] stress that the same data cannot inform a prior probability for a hypothesis and also further (dis)confirm the hypothesis. But they do not address all the potential pitfalls in applying Bayesian or other logics to the climate and other settings. Their argument must be understood as a limited one: there is no univocal logical prohibition against the same data serving for calibration and confirmation. As far as non-Bayesian methods of model selection go, there are two cases. First, there are methods such as cross-validation where the data are required to be use-novel. For cross-validation, the data are split up into two groups: the first group is used for calibration and the second for confirmation. Second, there are methods such as the Akaike Information Criterion for which the data need not be use-novel, although information criteria methods are hard to apply in practice to climate models because the number of degrees of freedom is poorly defined.
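The difference between use-novel and double-counted data can be illustrated with a toy calibration/validation split (hypothetical data and a one-parameter ‘model’; the names and numbers are ours):

```python
import numpy as np

rng = np.random.default_rng(3)

years = np.arange(1900, 2000)
pseudo_obs = 0.006 * (years - 1900) + rng.normal(0, 0.1, len(years))

def model(t, sensitivity):
    # Toy one-parameter "climate model": a linear response over time.
    return sensitivity * (t - 1900)

calibration = years < 1970   # data used to tune the parameter
validation = ~calibration    # use-novel data, held out for confirmation

# Calibrate: pick the parameter value minimising squared error before 1970.
candidates = np.linspace(0.0, 0.02, 201)
errors = [np.mean((model(years[calibration], s) - pseudo_obs[calibration]) ** 2)
          for s in candidates]
best = candidates[int(np.argmin(errors))]

# Confirm on data that played no role in calibration.
validation_error = np.mean((model(years[validation], best)
                            - pseudo_obs[validation]) ** 2)
print("calibrated parameter:", best)
print("validation mean squared error:", validation_error)
```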
This brings us to the second issue: prediction. In the climate context this is typically framed as the issue of projection. ‘Projection’ is a technical term in the climate modelling literature and refers to a prediction that is conditional on a particular forcing scenario and a particular initial conditions ensemble. The forcing scenario is specified either by the amount of greenhouse gas emissions and aerosols added to the atmosphere or directly by their atmospheric concentrations, and these in turn depend on future socioeconomic and technological developments.
Much research these days is undertaken with the aim of generating projections about the actual future evolution of the Earth system under a particular emission scenario, upon which policies are made and real-life decisions are taken. In these cases, it is necessary to quantify and understand how good those projections are likely to be. It is doubtful that this question can be answered along traditional lines. One such line would be to refer to the confirmation of a model against historical data (Chapter 9 of IPCC [2013] discusses model evaluation in detail) and argue that the ability of a model to successfully reproduce historical data should give us confidence that it will perform well in the future too. It is unclear at best whether this is a viable answer. The problem is that climate projections for high forcing scenarios take the system well outside any previously experienced state, and at least prima facie there is no reason to assume that success in low forcing contexts is a guide to success in high-forcing contexts; for example, a model calibrated on data from a world with the Arctic Sea covered in ice might no longer perform well when the sea ice is completely melted and the relevant dynamical processes are quite different. For this reason, calibration to past data has at most limited relevance for the assessment of a model’s predictive success [Oreskes et al. 1994; Stainforth et al. 2007a, 2007b, Steele and Werndl 2013].
This brings into focus the fact that there is no general answer to the question of the trustworthiness of model outputs. There is widespread consensus that predictions are better for longer time averages, larger spatial averages, low specificity and better physical understanding; and, all other things being equal, shorter lead times (nearer prediction horizons) are easier to predict than longer ones. Global mean temperature trends are considered trustworthy, and it is generally accepted that the observed upward trend will continue [Oreskes 2007], although the basis of this confidence is usually a physical understanding of the greenhouse effect with which the models are consistent, rather than a direct reliance on the output of models themselves. A 2013 IPCC report [IPCC 2013, Summary for Policymakers, section D.1] states that modelled surface temperature patterns and trends are trustworthy on the global and continental scale, but, even in making this statement, assigns only a probability of at least 66% (‘likely’) to the range within which 90% of model outcomes fall. In plainer terms, the experts allow a probability of up to a few tens of percent that the models are substantially wrong even about global mean temperature.
There still are interesting questions about the epistemic grounds on which such assertions are made (and we return to them in the next section). A harder problem, however, concerns the use of models as providers of detailed information about the future local climate. The United Kingdom Climate Impacts Programme aims to make high-resolution probabilistic projections of the local climate up to the end of the century, and similar projects are run in many other countries [Thompson et al. 2016]. The Programme’s set of projections known as UKCP09 [Sexton et al. 2012, Sexton and Murphy 2012] produces projections of the climate up to 2100 based on HadCM3, a global climate model developed at the UK Met Office Hadley Centre. Probabilities are given on a 25km grid for finely defined specific events such as changes in the temperature of the warmest day in summer, the precipitation of the wettest day in winter, or the change in summer-mean cloud amount, with projections blocked into overlapping thirty-year segments which extend to 2100. It is projected, for instance, that under a medium emission scenario the probability for a 20-30% reduction in summer mean precipitation in central London in 2080 is 0.5. There is a question of whether these projections are trustworthy and policy relevant. Frigg et al. urge caution on grounds that many of the UKCP09’s foundational assumptions seem to be questionable [2013, 2015] and that structural model error may have significant repercussions on small scales [2014]. Winsberg [2018] and Winsberg and Goodwin [2016] criticise these cautionary arguments as overstating the limitations of such projections. In 2019, the Programme launched a new set of projections, known as UKCP18 (https://www.metoffice.gov.uk/research/collaboration/ukcp). It is an open question whether these projections are open to the same objections, and, if so, how severe the limitations are.
6. Understanding and Quantifying Uncertainty
Uncertainty features prominently in discussions about climate models, and yet it is a concept that is poorly understood and that raises many difficult questions. In the most general terms, uncertainty is a lack of knowledge. The first challenge is to circumscribe more precisely what is meant by ‘uncertainty’ and what the sources of uncertainty are. A number of proposals have been made, but the discussion is still in a ‘pre-paradigmatic’ phase. Smith and Stern [2011] identify four relevant varieties of uncertainty: imprecision, ambiguity, intractability and indeterminacy. Spiegelhalter and Riesch [2011] consider a five-level structure with three within-model levels (event, parameter and model uncertainty) and two extra-model levels concerning acknowledged and unknown inadequacies in the modelling process. Wilby and Dessai [2010] discuss the issue with reference to what they call the cascade of uncertainty, studying how uncertainties magnify as one goes from assumptions about future global emissions of greenhouse gases to the implications of these for local adaptation. Petersen [2012, Chapters 3 and 6] introduces a so-called uncertainty matrix listing the sources of uncertainty in the vertical direction and the sorts of uncertainty in the horizontal direction. Lahsen [2005] looks at the issue from a science studies point of view and discusses the distribution of uncertainty as a function of the distance from the site of knowledge production. And these are but a few of the many proposals.
The next problem is that of measuring and quantifying uncertainty in climate predictions. Among the approaches that have been devised in response to this challenge, ensemble methods occupy centre stage. Current estimates of climate sensitivity and of the increase in global mean temperature under various emission scenarios, for instance, include information derived from ensembles containing multiple climate models. Multi-model ensembles are sets of several different models which differ in mathematical structure and physical content. Such an ensemble is used to investigate how predictions of relevant climate variables vary (or do not vary) according to model structure and assumptions. A special kind of ensemble is known as a “perturbed parameter ensemble”. It contains models with the same mathematical structure in which particular parameters assume different values, thereby effectively conducting a sensitivity analysis on a single model: some of the parameters are varied systematically and the effect on the outcomes is observed. Early analyses such as the climateprediction.net simulations and the UKCP09 results relied on perturbed parameter ensembles only, due to resource limitations; international projects such as the Coupled Model Intercomparison Projects (CMIP) and the work that goes into the IPCC assessments are based on multi-model ensembles containing different model structures. The reason to use ensembles is the acknowledged uncertainty in individual models, which concerns both the model structure and the values of parameters in the model. It is a common assumption that ensembles help us understand the effects of these uncertainties, either by producing and identifying “robust” predictions or by providing estimates of the uncertainty about future climate change. (Parker [2013] provides an excellent discussion of ensemble methods and the problems that attach to them.)
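To make the idea concrete, consider the following minimal sketch of a perturbed parameter ensemble. The “model” is a toy zero-dimensional energy balance relation rather than any operational climate model, and all parameter values are illustrative assumptions:

```python
# A minimal sketch of a perturbed parameter ensemble. The "model" is a toy
# zero-dimensional energy balance relation, delta_T = F / lam, where F is a
# radiative forcing (in W/m^2) and lam a climate feedback parameter
# (in W/m^2 per K). All parameter values are illustrative assumptions.

def toy_model(forcing, lam):
    """Equilibrium warming (K) for a given forcing and feedback parameter."""
    return forcing / lam

forcing = 3.7  # roughly the standard figure for the forcing from doubled CO2
lambdas = [0.8, 1.0, 1.2, 1.4, 1.6]  # perturbed values of the uncertain parameter

ensemble = [toy_model(forcing, lam) for lam in lambdas]
print(["%.2f K" % dt for dt in ensemble])
# The spread of results shows how sensitive the outcome is to the uncertain
# parameter, which is the core idea behind perturbed parameter ensembles.
```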
A model result is robust if all or most models in the ensemble show the same result; for a general discussion of robustness analysis see Weisberg [2006]. If, for instance, all models in an ensemble show more than a 4°C increase in global mean temperature by the end of the century when run under a specific emission scenario, this result is robust across the specified ensemble. Does robustness justify increased confidence? Lloyd [2010, 2015] argues that robustness arguments are powerful in connection with climate models and lend credibility at least to core claims such as the claim that there was global warming in the 20th century. Parker [2011], by contrast, reaches a more sober conclusion: ‘When today’s climate models agree that an interesting hypothesis about future climate change is true, it cannot be inferred […] that the hypothesis is likely to be true or that scientists’ confidence in the hypothesis should be significantly increased or that a claim to have evidence for the hypothesis is now more secure’ [ibid. 579]. One of the main problems is that if today’s models share the same technological constraints posed by today’s computer architecture and understanding of the climate system, then they inevitably share some common errors. Indeed, such common errors have been widely acknowledged (see, for instance, Knutti et al. [2010]), and studies have demonstrated and discussed the lack of model independence [Bishop and Abramowitz 2013; Jun et al. 2008]. But if models are not independent, then there is a question about how much epistemic weight agreement between them carries.
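In computational terms, the robustness check just described is a universal quantification over ensemble members, as the following sketch (with invented numbers) illustrates:

```python
# A minimal robustness check: a result (here, end-of-century warming above
# 4 degrees C under a given scenario) is robust across the ensemble if every
# member shows it. The ensemble values are invented.

ensemble_warming = [4.3, 4.8, 4.1, 5.0, 4.6]  # warming per ensemble member (K)

robust = all(dt > 4.0 for dt in ensemble_warming)
print("Robust result:", robust)  # True: every member exceeds the threshold
```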
When ensembles do not yield robust predictions, the spread of results within the ensemble is sometimes used to estimate quantitatively the uncertainty of the outcome. There are two main approaches to this. The first approach aims to translate the histogram of model results directly into a probability distribution: in effect, the guiding principle is that the probability of an outcome is proportional to the fraction of models in the ensemble which produce that result. The thinking behind this method seems to be to invoke some sort of frequentist approach to probabilities. The appeal to frequentism presupposes that models can be treated as exchangeable sources of information (in the sense that there is no reason to trust one ensemble member any more than any other). However, as we have previously seen, the assumption that models are independent has been questioned. There is a further problem: multi-model ensembles are ‘ensembles of opportunity’, grouping together existing models. Even the best ensembles, such as CMIP6, are not designed to systematically explore all possibilities. It is therefore not clear why the frequency of ensemble projections should double as a guide to probability. The IPCC acknowledges this limitation (see the discussion in Chapter 12 of IPCC [2013]) and thus downgrades the assessed likelihood of ensemble-derived ranges, deeming it only ‘likely’ (at least 66%) that the real-world global mean temperature will fall within the 90% model range (for a discussion of this case see Thompson et al. [2016]).
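The first approach can be rendered schematically as follows; the numbers are invented, and the closing comment records exactly the assumptions that the text above calls into question:

```python
# A sketch of the histogram-to-probability move: the probability of an outcome
# bin is identified with the fraction of ensemble members falling into it.
# The ensemble values are invented.
from collections import Counter

ensemble_warming = [2.1, 2.6, 2.8, 3.0, 3.1, 3.4, 3.9, 4.2]  # per-member warming (K)

bins = Counter(int(dt) for dt in ensemble_warming)  # one-degree bins
n = len(ensemble_warming)
for lower in sorted(bins):
    print("P(%d-%d K) = %.2f" % (lower, lower + 1, bins[lower] / n))
# Reading these frequencies as probabilities presupposes that the members are
# exchangeable, independent sources of information, and that the ensemble
# systematically samples the space of possibilities; these are the assumptions
# questioned in the text above.
```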
A more modest approach regards ensemble outputs as a guide to possibility rather than probability. On this view, the spread of an ensemble presents the range of outcomes that cannot be ruled out. The bounds of this set of results, often referred to as a ‘non-discountable envelope’, provide a lower bound on the uncertainty [Stainforth et al. 2007b]. In this spirit Katzav [2014] argues that a focus on prediction is misguided and that models ought to be used to show that particular scenarios are real possibilities.
While undoubtedly less committal than the probability approach, non-discountable envelopes raise questions of their own. The first concerns the relation between non-discountability and possibility. Non-discountable results are ones that cannot be ruled out, but how is this judgment reached? Do results which cannot be ruled out indicate possibilities? If not, what is their relevance for estimating lower bounds? And could the model, if pushed more deliberately towards ‘interesting’ behaviours, actually make the envelope wider? Furthermore, it is important to keep in mind that the envelope represents only some possibilities. Hence it does not indicate the complete range of possibilities, which makes particular types of formalised decision-making procedures impossible. For a further discussion of these issues see Betz [2009, 2010].
Finally, a number of authors emphasise the limitations of model-based methods (such as ensemble methods) and submit that any realistic assessment of uncertainties will also have to rely on other factors, most notably expert judgement. Petersen [2012, Chapter 4] outlines the approach of the Netherlands Environmental Assessment Agency (PBL), which sees expert judgment and problem framings as essential components of uncertainty assessment. Aspinall [2010] suggests using methods of structured expert elicitation.
In light of the issues raised above, how should uncertainty in climate science be communicated to decision-makers? The most prominent framework for communicating uncertainty is the IPCC’s, which is used throughout the Fifth Assessment Report (AR5), is set out in the ‘Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties’ and is further elaborated in Mastrandrea et al. [2011]. The framework appeals to two measures for communicating uncertainty. The first, a qualitative ‘confidence’ scale, depends on both the type of evidence and the degree of agreement amongst experts. The second measure is a quantitative scale for representing statistical likelihoods (or, more accurately, fuzzy likelihood intervals) for relevant climate/economic variables. The following statement exemplifies the use of these two measures for communicating uncertainty in AR5: ‘The global mean surface temperature change for the period 2016–2035 relative to 1986–2005 is similar for the four RCPs and will likely be in the range 0.3°C to 0.7°C (medium confidence)’ [IPCC 2013]. A discussion of this framework can be found in Adler and Hirsch Hadorn [2014], Budescu et al. [2014], Mach et al. [2017], and Wüthrich [2017].
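For orientation, the AR5 guidance note ties the calibrated likelihood language to fixed probability intervals, which can be recorded as a simple lookup table; the figures below reproduce the guidance note’s scale:

```python
# The AR5 calibrated likelihood language mapped to probability intervals
# (in percent), as given in the AR5 guidance note on the consistent
# treatment of uncertainties.
AR5_LIKELIHOOD = {
    "virtually certain":      (99, 100),
    "extremely likely":       (95, 100),
    "very likely":            (90, 100),
    "likely":                 (66, 100),
    "about as likely as not": (33, 66),
    "unlikely":               (0, 33),
    "very unlikely":          (0, 10),
    "extremely unlikely":     (0, 5),
    "exceptionally unlikely": (0, 1),
}

lo, hi = AR5_LIKELIHOOD["likely"]
print("'likely' denotes a probability between %d%% and %d%%" % (lo, hi))
```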
At this point, it should also be noted that the role of ethical and social values in relation to uncertainties in climate science is a matter of controversy. Winsberg [2012] appeals to complex simulation modelling to argue that it is infeasible for climate scientists to produce results that are not influenced by their ethical and social values. More specifically, he argues that assignments of probabilities to hypotheses about future climate change are influenced by ethical and social values because of the way these values come into play in the building and evaluation of climate models. Parker [2014] contends that pragmatic factors, rather than social or ethical values, often play a role in resolving the relevant modelling choices. She further objects that Winsberg’s focus on precise probabilistic uncertainty estimates is misguided; coarser estimates like those used by the IPCC better reflect the extent of uncertainty and are less influenced by values. She concludes that Winsberg has exaggerated the influence of ethical and social values here, but suggests that a more traditional challenge to the value-free ideal of science fits the climate case: one could argue that estimates of uncertainty are themselves always somewhat uncertain, and that the decision to offer a particular estimate of uncertainty thus might appropriately involve value judgments [compare Douglas 2009].
7. Conceptualising Decisions Under Uncertainty
What is the appropriate reaction to climate change? How much should we mitigate? To what extent should we adapt? And what form should adaptation take? Should we build larger water reserves? Should we adapt houses, and our social infrastructure more generally, to a higher frequency of extreme weather events like droughts, heavy rainfall, floods, and heatwaves, as well as to the increased incidence of extremely high sea levels? The decisions that we make in response to these questions have consequences affecting both individuals and groups at different places and times. Moreover, the circumstances of many of these decisions involve uncertainty and disagreement that is sometimes both severe and wide-ranging, concerning not only the state of the climate (as discussed above) and the broader social consequences of any action or inaction on our part, but also the range of actions available to us and what significance we should attach to their possible consequences. These considerations make climate decision-making both important and hard. The stakes are high, and so too are the difficulties for standard decision theory; there is thus plenty of reason for philosophical engagement with this particular application of decision theory.
Let us begin by looking at the actors in the climate domain and the kinds of decision problems that concern them. When introducing decision theory, it is common to distinguish three main domains: individual decision theory (which concerns the decision problem of a single agent who may be uncertain of her environment), game theory (which focuses on cases of strategic interaction amongst rational agents), and social choice theory (which concerns procedures by which a number of agents may ‘think’ and act collectively). All three realms are relevant to the climate-change predicament, whether the concern is adapting to climate change or mitigating climate change or both.
Determining the appropriate agential perspective and type of engagement between agents is important, because otherwise decision-modelling efforts may be in vain. For instance, it may be futile to focus on the plight of individual citizens when the power to effect change really lies with states. It may likewise be misguided to analyse the prospects for collective action on climate policy if the supposed members of the group do not see themselves as contributing to a shared decision that is good for the group as a whole. It would also be misleading to exclude from an individual agent’s decision model the impact of others who perceive that they are acting in a strategic environment. This is not, however, to recommend a narrow view of the role of decision models, on which they must always represent the decisions of agents as they see them and can never be aspirational; the point is rather that we should not employ decision models with particular agential framings in a naïve way.
Getting the agential perspective right is just the first step in framing a decision problem so that it presents convincing reasons for action. There remains the task of representing the details of the decision problem from the appropriate epistemic and evaluative perspective. Our focus is individual decision theory, for reasons of space, and because most decision settings ultimately involve the decision of an individual, whether this be a single person or a group acting as an individual.
The standard model of (individual) decision-making under uncertainty used by decision theorists derives from the classic work of von Neumann and Morgenstern [1944] and Leonard Savage [1954]. It treats actions as functions from possible states of the world to consequences, these being the complete outcomes of performing the action in question in that state of the world. All uncertainty is taken to be uncertainty about the state of the world and is quantified by a single probability function over the possible states, where the probabilities in question measure either objective risk or the decision maker’s degrees of belief (or a combination of the two). The relative value of consequences is represented by an interval-scaled utility function over these consequences. Decision-makers are advised to choose the action with maximum expected utility (EU), where the EU of an action is the sum of the probability-weighted utilities of its possible consequences.
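The standard model is easily stated in computational form. The following sketch, with hypothetical states, options, probabilities and utilities, computes expected utilities and returns the EU-maximising action:

```python
# A minimal sketch of the standard expected utility model: uncertainty is a
# single probability function over states, and the recommended action is the
# one that maximises probability-weighted utility. All numbers are hypothetical.

states = ["low_warming", "high_warming"]
prob = {"low_warming": 0.4, "high_warming": 0.6}  # degrees of belief over states

# utility of the consequence of performing each action in each state
utility = {
    ("mitigate", "low_warming"): 7,
    ("mitigate", "high_warming"): 6,
    ("business_as_usual", "low_warming"): 9,
    ("business_as_usual", "high_warming"): 1,
}

def expected_utility(action):
    """EU of an action: the sum of probability-weighted utilities."""
    return sum(prob[s] * utility[(action, s)] for s in states)

actions = ["mitigate", "business_as_usual"]
for a in actions:
    print(a, "EU =", expected_utility(a))
print("EU-maximising choice:", max(actions, key=expected_utility))
```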
It is our contention that this model is inadequate for many climate-oriented decisions, because it fails to properly represent the multidimensional nature and severity of the uncertainty that decision-makers face. To begin with, not all the uncertainty that climate decision-makers face is empirical uncertainty about the actual state of the world (state uncertainty). There may be further empirical uncertainty about what options are available to them and about the consequences of exercising each option in each respective state (option uncertainty). In what follows we use the term ‘empirical uncertainty’ to cover both state uncertainty and option uncertainty. Furthermore, decision-makers face a non-empirical kind of uncertainty, ethical uncertainty, about what values to assign to possible consequences.
Let us now turn to empirical uncertainty. As noted above, standard decision theory holds that all empirical uncertainty can be represented by a probability function over the possible states of the world. There are two issues here. The first is that confining all empirical uncertainty to the state space is rather unnatural for complex decision problems such as those associated with climate change. In fact, decision models are less convoluted if we allow the uncertainty about states to depend on the actions that might be taken (compare Richard Jeffrey’s [1965] expected utility theory), and if we also permit further uncertainty about what consequence will arise under each state, given the action taken (an aspect of option uncertainty). For instance, consider a crude version of the mitigation decision problem faced by the global planner: it may be useful to depict the decision problem with a state-space partition in terms of possible increases in average global temperature over a given time period. In this case, our beliefs about the states (how likely each of them is) would be conditional on the mitigation option taken. Moreover, for each respective mitigation option, the consequence arising in each of the states depends on further uncertain features of the world, for instance the extent to which, on average, regional conditions would be favourable to food production and whether social institutions would facilitate resilience in food production.
The second issue is that using a precise probability function to represent uncertainty about states (and consequences) can misrepresent the severity of this uncertainty. For instance, even if one assumes that the position of the scientific community may be reasonably well represented by a precise probability distribution over the state space, conditional on the mitigation option, precise probabilities over the possible levels of food production and other economic consequences, given this option and average global temperature rise, are less plausible. Note that the global social planner’s mitigation decision problem is typically analysed in terms of a so-called Integrated Assessment Model (IAM), which does indeed involve dependencies between mitigation strategies and both climate and economic variables. There is some disparity in the representation of empirical uncertainty: Nordhaus’s [2008] reliance on ‘best estimates’ for parameters like climate sensitivity can be compared with Stern’s [2007] use of ‘confidence intervals’. But these are relatively minor differences. Critics argue that all extant IAMs inadequately represent the uncertainty surrounding projections of future wealth under the status quo and alternative mitigation strategies [see Weitzman 2009; Frisch 2013; Stern 2013]. In particular, both Nordhaus [2008] and Stern [2007] controversially assume increasing wealth over time (a positive consumption growth rate) even for the status quo where nothing is done to mitigate climate change.
Popular among philosophers is the use of sets of probability functions to represent severe uncertainty surrounding decision states and consequences, whether the uncertainty is due to evidential limitations or to evidential/expert disagreement. This is a minimal generalisation of the standard decision model, in the sense that probability measures still feature: roughly, the more severe the uncertainty, the more probability measures over the space of possibilities are needed to jointly represent the epistemic situation (see, for instance, Walley [1991]). Under maximal uncertainty all possibilities are on a par: each is effectively assigned the probability interval [0, 1]. Indeed, it is a strength of the imprecise probability representation that it generalises the two extreme cases, the precise probabilistic as well as the possibilistic frameworks. (See Halpern [2003] for a thorough treatment of frameworks, both qualitative and quantitative, for representing uncertainty.) In some contexts, it may be suitable to weight the possible probability distributions in terms of plausibility (as required for some of the decision rules discussed below). The weighting approach may in fact match the IPCC’s representation of the uncertainty surrounding decision-relevant climate and economic variables. Indeed, an important question is whether and how the IPCC’s representation of uncertainty can be translated into an imprecise probabilistic framework, as discussed here and in the next section. An alternative proposal is that the IPCC’s confidence and likelihood measures for relevant variables should be combined to form an unweighted imprecise set of probability distributions, or even a precise probability distribution, suitable for input into an appropriate decision model.
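In computational terms, a set of probability functions induces, for each action, an interval of expected utilities rather than a single number. The following sketch, again with hypothetical numbers, illustrates this:

```python
# A sketch of the imprecise-probability representation: severe uncertainty is
# encoded by a set of probability functions over the states, and each action
# is then associated with an interval of expected utilities. Hypothetical numbers.

states = ["low_warming", "high_warming"]
prob_set = [  # several admissible probability functions
    {"low_warming": 0.2, "high_warming": 0.8},
    {"low_warming": 0.4, "high_warming": 0.6},
    {"low_warming": 0.6, "high_warming": 0.4},
]
utility = {
    ("mitigate", "low_warming"): 7,
    ("mitigate", "high_warming"): 6,
    ("business_as_usual", "low_warming"): 9,
    ("business_as_usual", "high_warming"): 1,
}

def eu_interval(action):
    """The interval of expected utilities an action spans across prob_set."""
    eus = [sum(p[s] * utility[(action, s)] for s in states) for p in prob_set]
    return min(eus), max(eus)

for a in ("mitigate", "business_as_usual"):
    print(a, "EU interval:", eu_interval(a))
```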
Decision makers face uncertainty not only about what will or could happen, but also about what value to attach to these possibilities; in other words, they face ethical uncertainty. Such value or ethical uncertainty can have a number of different sources. The most important ones arise in connection with judgments about how to distribute the costs and benefits of mitigation and adaptation amongst different regions and countries, about how to take account of persons whose existence depends on what actions are chosen now, and about the degree to which future wellbeing should be discounted. (For discussion and debate about the ethical significance of various climate outcomes, particularly at the level of global rather than regional or national justice, see the articles in Gardiner et al.’s [2010] edited collection, Climate Ethics.) Of these, the last has been the subject of the most debate, because of the extent to which (the global planner’s) decisions about how drastically to cut carbon emissions are sensitive to the discount rate used in evaluating the possible outcomes of doing so (as highlighted in Broome [2008]). Discounting thus provides a good illustration of the importance of ethical uncertainty.
In many economic models, a discount rate is applied to a measure of total wellbeing at different points in time (the ‘pure rate of time preference’), with a positive rate implying that future wellbeing carries less weight in the evaluations of options than present wellbeing. Note that the overall ‘social discount rate’ in economic models is the sum of the pure rate of time preference and a second term pertaining to the discounting of goods or consumption rather than wellbeing per se. See Broome [1992] and Parfit [1984] for helpful discussions of the reasons for discounting goods that do not imply discounting wellbeing. (The consumption growth rate is an important component of this second discounting term that is subject to empirical uncertainty, as discussed above; see Greaves [2017] for an examination of all the assumptions underlying the ‘social discount rate’ and its role in the standard economic method for evaluating policy options.) Many philosophers regard any pure discounting of future wellbeing as completely unjustified from an objective point of view. This is not to deny that temporal location may nonetheless correlate with features of the distribution of wellbeing that are in fact ethically significant. If people will be better off in the future, for instance, it is reasonable to be less concerned about their interests than those of the present generation, much as one might prioritise the less well-off within a single generation. But the mere fact of a benefit occurring at a particular time cannot be relevant to its value, at least from an impartial perspective.
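The arithmetic of discounting is simple but consequential. The following sketch computes the weight accorded now to wellbeing t years in the future under a discount rate r, using two illustrative rates in the range discussed in this and the next paragraph:

```python
# The present value of one unit of wellbeing t years ahead under discount
# rate r is (1 + r)**-t. The two rates below are illustrative, loosely
# echoing the low (Stern-like) and higher (market-derived) pure rates of
# time preference discussed in the text.

def present_weight(rate, years):
    """Weight accorded now to one unit of wellbeing 'years' years ahead."""
    return (1.0 + rate) ** -years

for rate in (0.005, 0.03):  # 0.5% versus 3%
    w = present_weight(rate, 100)
    print("rate %.1f%%: weight of wellbeing 100 years ahead = %.3f" % (rate * 100, w))
# At 0.5% a benefit a century away retains about 61% of its present weight;
# at 3% it retains only about 5%. This is why mitigation recommendations are
# so sensitive to the choice of discount rate.
```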
Economists do nonetheless often discount wellbeing in their policy-oriented models, although they disagree considerably about what pure rate of time preference should be used. One view, exemplified by the Stern Review and representing the impartial perspective described above, is that only a very small rate (in the order of 0.5%) is justified, and this on the grounds of the small probability of the extinction of the human population. Other economists, however, regard a partial rather than an impartial point of view as more appropriate in their models. A view along these lines, exemplified by Nordhaus [2007] and Arrow [1995a], is that the pure rate of time preference should be determined by the preferences of current people. But typical estimates of the average pure rate of time preference derived from observed market behaviour are much higher than the rate used by Stern (around 3% by Nordhaus’s estimate). Although the use of this data has been criticised for providing an inadequate measure of people’s reasoned preferences (see, for example, Sen [1982], Drèze and Stern [1990], Broome [1992]), the point remains that any plausible method for determining the current generation’s attitude to the wellbeing of future generations is likely to yield a rate higher than that advocated by the Stern Review. To the extent that this debate about the ethical basis for discounting remains unresolved, there will be ethical uncertainty about the discount rate in climate policy decisions. This ethical uncertainty may be represented analogously to empirical uncertainty: by replacing the standard precise utility function with a set of possible utility functions.
8. Managing Uncertainty
How should a decision-maker choose amongst the courses of action available to her when she must make the choice under conditions of severe uncertainty? The problem that climate decision-makers face is that, in these situations, the precise utility and probability values required by standard EU theory may not be readily available.
There are, broadly speaking, three possible responses to this problem.
(1) The decision-maker can simply bite the bullet and try to settle on precise probability and utility judgements for the relevant contingencies. Orthodox decision theorists argue that rationality requires that decisions be made as if they maximise the decision maker’s subjective expectation of benefit relative to her precise degrees of belief and values. Broome [2012, 129] gives an unflinching defence of this approach: “The lack of firm probabilities is not a reason to give up expected value theory […] Stick with expected value theory, since it is very well-founded, and do your best with probabilities and values.” This approach may seem rather bold, not least in the context of environmental decision making. Weitzman [2009], for instance, argues that whether or not one assigns non-negligible probability to catastrophic climate consequences radically changes the assessment of mitigation options. Moreover, in many circumstances there remains the question of how to follow Broome’s advice: How should the decision-maker settle, in a non-arbitrary way, on a precise opinion on decision-relevant issues in the face of an effectively ‘divided mind’? There are two interrelated strategies: she can deliberate further and/or aggregate conflicting views. The former aims for convergence in opinion, while the latter aims for an acceptable compromise in the face of persisting conflict. (For a discussion of deliberation see Fishkin and Luskin [2005]; for more on aggregation see, for instance, Genest and Zidek [1986], Mongin [1995], Sen [1970], List and Puppe [2009]. There is a comparatively small formal literature on deliberation, a seminal contribution being Lehrer and Wagner’s [1981] model for updating probabilistic beliefs.)
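A simple and widely studied aggregation technique is linear opinion pooling, on which the aggregate probability of each event is a weighted average of the experts’ probabilities. The following is a minimal sketch with hypothetical experts and weights:

```python
# A minimal sketch of linear opinion pooling: the aggregate probability of
# each event is a weighted average of the experts' probabilities. Experts,
# weights and numbers are hypothetical; the weights must sum to one.

experts = [
    {"warming_above_4K": 0.2, "warming_below_4K": 0.8},
    {"warming_above_4K": 0.5, "warming_below_4K": 0.5},
]
weights = [0.7, 0.3]  # for example, reflecting judged reliability

events = experts[0].keys()
pooled = {e: sum(w * p[e] for w, p in zip(weights, experts)) for e in events}
print(pooled)  # approximately {'warming_above_4K': 0.29, 'warming_below_4K': 0.71}
```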
(2) The decision-maker can try to delay making a decision, or at least postpone parts of it, in the hope that her uncertainty will become manageable as more information becomes available, or as disagreements resolve themselves through a change in attitudes. The basic motive for delaying a decision is to maintain flexibility at zero cost (see Koopmans [1962], Kreps and Porteus [1978], Arrow [1995b]). Suppose that we must decide between building a cheap but low sea wall or a high, but expensive, one, and that the relative desirability of these two courses of action depends on unknown factors, such as the extent to which sea levels will rise. In this case it would be sensible to consider building a low wall first but leave open the possibility of raising it in the future. If this can be done at no additional cost, then it is clearly the best option. In many adaptation scenarios, the analogue of the ‘low sea wall’ may in fact be social-institutional measures that enable a delayed response to climate change, whatever the details of this change turn out to be. In many cases, however, the prospect of cost-free postponement of a decision (or part thereof) is simply a mirage, since delay often decreases rather than increases opportunities due to changes in the background environment. This is often true for climate-change adaptation decisions, not to mention mitigation decisions.
(3) The decision-maker can employ a decision rule different from that prescribed by EU theory, one that is much less demanding in terms of the information it requires. A great many proposals for such rules exist in the literature, involving more or less radical departures from the orthodox theory and varying in the informational demands they make. It should be noted from the outset that there is one widely agreed rationality constraint on these non-standard decision rules: ‘(EU-)dominated options’ are not admissible choices; that is, if an option has lower expected utility than another option according to all permissible pairs of probability and utility functions, then the dominated option is not an admissible choice. This is a relatively minimal constraint, but it may well yield a unique choice of action in some decision scenarios. In such cases, the severe uncertainty is not in fact decision relevant. For example, it may be the case that, from the global planner’s perspective, a given mitigation option is better than continuing with business as usual, whatever the uncertain details of the climate system. This is all the more plausible to the extent that the mitigation option counts as a ‘win-win’ strategy [Maslin and Austin 2012], that is, to the extent that it has other positive impacts, say, on air quality or energy security, regardless of mitigation results. In many more fine-grained or otherwise difficult decision contexts, however, the non-EU-dominance constraint may exclude only a few of the available options from the set of choice-worthy ones.
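The dominance constraint lends itself to direct implementation: an option is inadmissible if some other option has higher expected utility under every permissible probability-utility pair. A sketch, with hypothetical numbers:

```python
# A sketch of the EU-dominance admissibility filter: option A dominates option
# B if A's expected utility exceeds B's under every permissible pair of
# probability and utility functions; dominated options are removed.
# All numbers are hypothetical.

states = ["s1", "s2"]
prob_set = [{"s1": 0.3, "s2": 0.7}, {"s1": 0.6, "s2": 0.4}]
utility_set = [  # permissible utility functions over (action, state) pairs
    {("mitigate", "s1"): 7, ("mitigate", "s2"): 6,
     ("business_as_usual", "s1"): 5, ("business_as_usual", "s2"): 1},
    {("mitigate", "s1"): 8, ("mitigate", "s2"): 5,
     ("business_as_usual", "s1"): 6, ("business_as_usual", "s2"): 2},
]

def eus(action):
    """Expected utilities of an action under every permissible (p, u) pair."""
    return [sum(p[s] * u[(action, s)] for s in states)
            for p in prob_set for u in utility_set]

def dominates(a, b):
    return all(ea > eb for ea, eb in zip(eus(a), eus(b)))

options = ["mitigate", "business_as_usual"]
admissible = [o for o in options
              if not any(dominates(other, o) for other in options if other != o)]
print("Admissible options:", admissible)  # business_as_usual is dominated
```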
A consideration that is often appealed to in order to further discriminate between options is caution. Indeed, this is an important facet of the popular but ill-defined Precautionary Principle. (The Precautionary Principle is referred to in the IPCC [2014] AR5 WGII report. See, for instance, Gardiner [2006] and Steele [2006] for discussion of what the Precautionary Principle does or could stand for.) Cautious decision rules give more weight to the ‘downside’ risks, the possible negative implications of a choice of action. The Maxmin-EU rule, for instance, recommends picking the action with the greatest minimum expected utility (see Gilboa and Schmeidler [1989], Walley [1991]). The rule is simple to use, but arguably much too cautious, paying no attention at all to the full spread of possible expected utilities. The α-Maxmin rule, in contrast, recommends taking the action with the greatest α-weighted sum of the minimum and maximum expected utilities associated with it. The relative weights for the minimum and maximum expected utilities can be thought of as reflecting either the decision maker’s pessimism in the face of uncertainty or else their degree of caution (see Binmore [2009]). (For a comprehensive survey of non-standard decision theories for handling severe uncertainty in the economics literature, see Gilboa and Marinacci [2012]. For applications to climate policy see Heal and Millner [2014].)
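Both rules are easy to state computationally. The following sketch, with hypothetical options and expected utility spreads, computes the Maxmin-EU and α-Maxmin choices:

```python
# A sketch of two cautious decision rules applied to imprecise expected
# utilities. Maxmin-EU picks the action with the greatest minimum EU;
# alpha-Maxmin maximises alpha*min(EU) + (1-alpha)*max(EU), with alpha
# reflecting the degree of caution. Options and numbers are hypothetical.

eu_spread = {  # expected utilities of each action across permissible probabilities
    "mitigate": [6.2, 6.4, 6.6],
    "business_as_usual": [2.6, 4.2, 5.8],
    "geoengineer": [1.0, 5.0, 9.0],
}

def maxmin_eu(spread):
    return max(spread, key=lambda a: min(spread[a]))

def alpha_maxmin(spread, alpha):
    def score(a):
        return alpha * min(spread[a]) + (1 - alpha) * max(spread[a])
    return max(spread, key=score)

print("Maxmin-EU choice:", maxmin_eu(eu_spread))                  # mitigate
print("alpha = 0.9 (cautious):", alpha_maxmin(eu_spread, 0.9))    # mitigate
print("alpha = 0.1 (optimistic):", alpha_maxmin(eu_spread, 0.1))  # geoengineer
```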
A more informationally demanding set of rules are those that draw on considerations of confidence and/or reliability. The thought here is that an agent is more or less confident about the various probability and utility functions that characterise her uncertainty. For instance, when the estimates derive from different models or experts, the decision maker may regard some models as better corroborated by the available evidence than others, or some experts as more reliable than others in their judgments. In these cases, it is reasonable, ceteris paribus, to favour actions which you are more confident will have beneficial consequences. One (rather sophisticated) way of doing this is to weight each of the expected utilities associated with an action in accordance with how confident you are about the judgements supporting them, and then choose the action with the maximum confidence-weighted expected utility (see Klibanoff et al. [2005]). This rule is not very different from maximising expected utility, and indeed one could regard confidence weighting as an aggregation technique rather than an alternative decision rule. But considerations of confidence may be appealed to even when precise confidence weights cannot be provided. Gärdenfors and Sahlin [1982/1988], for instance, suggest simply excluding from consideration any estimates that fall below a reliability threshold and then picking cautiously from the remainder. Similarly, Hill [2013] uses an ordinal measure of confidence that allows for stake-sensitive thresholds of reliability, which can then be combined with varying levels of caution. This rule has the advantage of allowing decision-makers to draw on the confidence grading of scientific claims adopted by the IPCC (see Bradley et al. [2017]).
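A crude rendering of confidence weighting is sketched below; it weights each expected utility estimate by the confidence attached to the judgement behind it, and the weights and numbers are hypothetical. (Klibanoff et al.’s smooth model additionally transforms the expected utilities by a function reflecting ambiguity attitude, a refinement omitted here.)

```python
# A crude sketch of confidence-weighted expected utility: each EU estimate
# for an action is weighted by the confidence attached to the judgement that
# produced it, and the action with maximal confidence-weighted EU is chosen.
# Confidence weights and EU estimates are hypothetical.

confidence = [0.5, 0.3, 0.2]  # confidence in each probability/utility judgement
eu_estimates = {
    "mitigate": [6.2, 6.4, 6.6],
    "business_as_usual": [2.6, 4.2, 5.8],
}

def confidence_weighted_eu(action):
    return sum(c * eu for c, eu in zip(confidence, eu_estimates[action]))

for a in eu_estimates:
    print(a, "confidence-weighted EU = %.2f" % confidence_weighted_eu(a))
```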
One might finally distinguish decision rules that are cautious in a slightly different way: they compare options in terms of ‘robustness’ to uncertainty, relative to a problem-specific satisfactory level of expected utility. Better options are those that are more assured of having an expected utility that is good enough, or regret-free, in the face of uncertainty. The ‘information-gap theory’ developed by Ben-Haim [2001] provides one formalisation of this basic idea that has proved popular in environmental management theory. Another prominent approach to robust decision-making is that developed by Lempert, Popper and Bankes [2003]. These two frameworks are compared in Hall et al. [2012]. Recall that the uncertainty in question may be multi-faceted, concerning probabilities of states/outcomes or values of final outcomes. Most decision rules that appeal to robustness assume that a best estimate for the relevant variables is available, and then consider deviations away from this estimate. A robust option is one that has a satisfactory expected utility relative to a class of estimates that deviate from the best one to some degree; the wider the class in question, the more robust the option. Much depends on what expected utility level is deemed satisfactory. For mitigation decision-making, one salient satisfactory level of expected utility is that associated with a 50% chance of an average global temperature rise of 2 degrees Celsius or less. Note that one may otherwise interpret any such mitigation temperature target in a different way, namely as a constraint on what counts as a feasible option: mitigation options that do not meet the target are simply prohibited, not suitable for consideration. For adaptation decisions, the satisfactory level would depend on local context, but roughly speaking, robust options are those that yield reasonable outcomes for all the inopportune climate scenarios that have non-negligible probability given some range of uncertainty. These are plausibly adaptation options that focus on resilience to any and all of the aforesaid climate scenarios, perhaps via the development of social institutions that can coordinate responses to variability and change. (Robust decision-making is endorsed, for instance, by Dessai et al. [2009] and Wilby and Dessai [2010], who indeed associate this kind of decision rule with resilience strategies. See also Linkov and others [2014] for discussion of resilience strategies vis-à-vis risk management.)
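A minimal rendering of robust satisficing in the info-gap spirit fixes a best-estimate probability, considers progressively larger deviations from it, and measures how large a deviation each option tolerates while its expected utility remains satisfactory. All options and numbers below are hypothetical:

```python
# A sketch of robust satisficing: an option's robustness is the largest shift
# of probability mass towards the adverse state, away from the best estimate,
# under which its expected utility still meets the satisfactory threshold.
# All numbers are hypothetical.

states = ["benign", "adverse"]
best_estimate = {"benign": 0.7, "adverse": 0.3}
utility = {
    ("option_A", "benign"): 6, ("option_A", "adverse"): 5,  # modest but stable
    ("option_B", "benign"): 9, ("option_B", "adverse"): 1,  # high upside, fragile
}
SATISFACTORY = 4.0

def eu(action, p_benign):
    return (p_benign * utility[(action, "benign")]
            + (1 - p_benign) * utility[(action, "adverse")])

def robustness(action, step=0.01):
    """Largest tolerated shift of probability away from the benign state."""
    shift = 0.0
    while (shift <= best_estimate["benign"] + 1e-9
           and eu(action, best_estimate["benign"] - shift) >= SATISFACTORY):
        shift += step
    return max(shift - step, 0.0)

for a in ("option_A", "option_B"):
    print(a, "tolerates a probability shift of %.2f" % robustness(a))
# option_A is the more robust choice even though option_B has the higher
# expected utility at the best estimate.
```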
9. Conclusion
This article reviewed, from a philosophy of science perspective, issues and questions that arise in connection with climate science. Most of these issues are the subject matter of ongoing research, and they deserve further attention. Rather than repeating these points, we would like to mention a topic that has not received the attention it deserves: the epistemic significance of consensus in the acceptance of results. As the controversy over the Cook et al. [2013] paper shows, many people do seem to think that the level of expert consensus is an important reason to believe in climate change, given that they themselves are not experts; and, conversely, attacking the consensus and sowing doubt is a classic tactic of the other side. The role of consensus in the context of climate change deserves more attention than it has received hitherto; for some discussion of consensus see de Melo-Martín and Intemann [2014].
10. Glossary
Attribution (of climate change): The process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence.
Boundary conditions: Values of variables that affect the system but are not directly output by the calculations.
Calibration: The process of estimating values of model parameters which are most consistent with observations.
Climate model: A representation of certain aspects of the climate system.
Detection (of climate change): The process of demonstrating that climate or a system affected by climate has changed in some defined statistical sense without providing a reason for that change.
Double counting: The use of data for both calibration and confirmation.
Expected utility (for an action): The sum of the probability-weighted utility of the possible consequences of the action.
External conditions (of the climate system): Conditions that influence the state of the Earth such as the amount of energy received from the sun.
Initial conditions: A mathematical description of the state of the climate system at the beginning of the period being simulated.
Internal variability: The phenomenon that climate variables such as temperature and precipitation would change over time due to the internal dynamics of the climate system even in the absence of changing external conditions.
Null hypothesis: The expected behaviour of the climate system in the absence of changing external influences.
Projection: The prediction of a climate model that is conditional on a certain forcing scenario.
Proxy data: Data for climate variables that are derived from observing natural phenomena such as tree rings, ice cores and ocean sediments.
Robustness (of a result): A result is robust if separate (ideally independent) models or lines of evidence lead to the same conclusion.
Use-novel data: Data that are used for confirmation and have not been used for calibration.
11. References and Further Reading
Adler C. E. and G. Hirsch Hadorn. (2014). The IPCC and treatment of uncertainties: topics and sources of dissensus. Wiley Interdisciplinary Reviews: Climate Change 5.5, 663-676.
Arrow K. J. (1995b). A Note on Freedom and Flexibility. Choice, Welfare and Development. (eds. K. Basu, P. Pattanaik, and K. Suzumura), 7-15. Oxford: Oxford University Press.
Arrow K. J. (1995a). Discounting Climate Change: Planning for an Uncertain Future. Lecture given at Institut d’Économie Industrielle, Université des Sciences Sociales, Toulouse. <http://idei.fr/doc/conf/annual/paper_1995.pdf>
Aspinall W. (2010). A route to more tractable expert advice. Nature 463, 294-295.
Ben-Haim Y. (2001). Information-Gap Theory: Decisions Under Severe Uncertainty, 330 pp. London: Academic Press.
Betz G. (2009). What range of future scenarios should climate policy be based on? Modal falsificationism and its limitations. Philosophia Naturalis 46, 133-158.
Betz G. (2010). What’s the worst case? Analyse und Kritik 32, 87-106.
Binmore K. (2009). Rational Decisions, 216 pp. Princeton, NJ: Princeton University Press.
Bishop C. H. and G. Abramowitz. (2013). Climate model dependence and the replicate Earth paradigm. Climate Dynamics 41, 885-900.
Bradley R., C. Helgeson and B. Hill. (2017). Climate Change Assessments: Confidence, Probability and Decision. Philosophy of Science 84.3, 500-522.
Bradley R., C. Helgeson and B. Hill. (2018). Combining Probability with Qualitative Degree-of-Certainty Assessment. Climatic Change 149.3-4, 517-525.
Broome J. (2012). Climate Matters: Ethics in a Warming World, 192 pp. New York: Norton.
Broome J. (1992). Counting the Cost of Global Warming, 147 pp. Cambridge: The White Horse Press.
Broome J. (2008). The Ethics of Climate Change. Scientific American 298, 96-102.
Budescu, D. V., H. Por, S. B. Broomell and M. Smithson. (2014). The interpretation of IPCC probabilistic statements around the world. Nature Climate Change 4, 508-512.
Cohn T. A. and H. F. Lins. (2005). Nature’s style: naturally trendy. Geophysical Research Letters 32, L23402.
Cook J. et al. (2013). Quantifying the consensus on anthropogenic global warming in the scientific literature. Environmental Research Letters 8, 1-7.
Daron J. D. and D. Stainforth. (2013). On predicting climate under climate change. Environmental Research Letters 8, 1-8.
de Melo-Martín I., and K. Intemann (2014). Who’s afraid of dissent? Addressing concerns about undermining scientific consensus in public policy developments. Perspectives on Science 22.4, 593-615.
Dessai S. et al. (2009). Do We Need Better Predictions to Adapt to a Changing Climate? Eos 90.13, 111-112.
Dessler A. (2011). Introduction to Modern Climate Change. Cambridge: Cambridge University Press.
Douglas H. (2009). Science, Policy, and the Value-Free Ideal. Pittsburgh: Pittsburgh University Press.
Drèze J. and N. Stern. (1990). Policy reform, shadow prices, and market prices. Journal of Public Economics 42.1, 1-45.
Fishkin J. S., and R. C. Luskin. (2005). Experimenting with a Democratic Ideal: Deliberative Polling and Public Opinion. Acta Politica 40, 284-298.
Frank D., J. Esper, E. Zorita and R. Wilson. (2010). A noodle, hockey stick, and spaghetti plate: A perspective on high-resolution paleoclimatology. Wiley Interdisciplinary Reviews: Climate Change 1.4, 507-516.
Frigg R. P., D. A. Stainforth and L. A. Smith. (2013). The Myopia of Imperfect Climate Models: The Case of UKCP09. Philosophy of Science 80.5, 886-897.
Frigg R. P., D. A. Stainforth and L. A. Smith. (2015). An Assessment of the Foundational Assumptions in High-Resolution Climate Projections: The Case of UKCP09. Draft under review.
Frigg R. P., S. Bradley, H. Du and L. A. Smith. (2014). Laplace’s Demon and the Adventures of His Apprentices. Philosophy of Science 81.1, 31-59.
Frisch M. (2013). Modeling Climate Policies: A Critical Look at Integrated Assessment Models. Philosophy and Technology 26, 117-137.
Frisch, M. (2015). Tuning climate models, predictivism, and the problem of old evidence. European Journal for Philosophy of Science 5.2, 171-190.
Gärdenfors P. and N.-E. Sahlin. [1982] (1988). Unreliable probabilities, risk taking, and decision making. Decision, Probability and Utility, (eds. P. Gärdenfors and N.-E. Sahlin), 313-334. Cambridge: Cambridge University Press.
Gardiner S. (2006). A Core Precautionary Principle. The Journal of Political Philosophy 14.1, 33-60.
Gardiner S., S. Caney, D. Jamieson, H. Shue (2010). Climate Ethics: Essential Readings. Oxford: Oxford University Press
Genest C. and J. V. Zidek. (1986). Combining Probability Distributions: A Critique and Annotated Bibliography. Statistical Science 1.1, 113-135.
Gilboa I. and M. Marinacci. (2012). Ambiguity and the Bayesian Paradigm. Advances in Economics and Econometrics: Theory and Applications, Tenth World Congress of the Econometric Society (eds. D. Acemoglu, M. Arellano and E. Dekel), 179-242 Cambridge: Cambridge University Press.
Gilboa I. and D. Schmeidler. (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics 18, 141-153.
Greaves, H. (2017). Discounting for public policy: A survey. Economics and Philosophy 33.3, 391-439.
Hall J. W., Lempert, R. J., Keller, K., Hackbarth, A., Mijere, C., McInerney, D. J. (2012). Robust Climate Policies Under Uncertainty: A Comparison of Robust Decision-Making and Info-Gap Methods. Risk Analysis 32.10, 1657-1672.
Halpern J. Y. (2003). Reasoning About Uncertainty, 483 pp. Cambridge, MA: MIT Press.
Heal G. and A. Millner. (2014). Uncertainty and Decision Making in Climate Change Economics. Review of Environmental Economics and Policy 8, 120-137.
Hegerl G. C., O. Hoegh-Guldberg, G. Casassa, M. P. Hoerling, R. S. Kovats, C. Parmesan, D. W. Pierce, P. A. Stott. (2010). Good Practice Guidance Paper on Detection and Attribution Related to Anthropogenic Climate Change. Meeting Report of the Intergovernmental Panel on Climate Change Expert Meeting on Detection and Attribution of Anthropogenic Climate Change (eds. T. F. Stocker, C. B. Field, D. Qin, V. Barros, G.-K. Plattner, M. Tignor, P. M. Midgley and K. L. Ebi. Bern). Switzerland: IPCC Working Group I Technical Support Unit, University of Bern.
Held I. M. (2005). The Gap between Simulation and Understanding in Climate Modeling. Bulletin of the American Meteorological Society 80, 1609-1614.
Hill B. (2013). Confidence and Decision. Games and Economic Behavior 82, 675-692.
Hulme M., S. Dessai, I. Lorenzoni and D. Nelson. (2009). Unstable Climates: exploring the statistical and social constructions of climate. Geoforum 40, 197-206.
IPCC. (2013). Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.
IPCC. (2014). Climate Change 2014: Impacts, Adaptation, and Vulnerability. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.
Jeffrey R. (1965). The Logic of Decision, 231 pp. Chicago: University of Chicago Press.
Jun M., R. Knutti and D. W. Nychka. (2008). Local eigenvalue analysis of CMIP3 climate model errors. Tellus A 60.5, 992-1000.
Katzav J. (2013). Severe testing of climate change hypotheses. Studies in History and Philosophy of Modern Physics 44.4, 433-441.
Katzav J. (2014). The epistemology of climate models and some of its implications for climate science and the philosophy of science. Studies in History and Philosophy of Modern Physics 46, 228-238.
Katzav, J. & W. S. Parker (2018). Issues in the Theoretical Foundations of Climate Science. Studies in History and Philosophy of Modern Physics 63, 141-149.
Klibanoff P., M. Marinacci and S. Mukerji. (2005). A smooth model of decision making under ambiguity. Econometrica 73, 1849-1892.
Klintman M. (2019). Knowledge Resistance: How We Avoid Insight From Others. Manchester: Manchester University Press.
Knutti R., R. Furrer, C. Tebaldi, J. Cermak, and G. A. Meehl. (2010). Challenges in Combining Projections from Multiple Climate Models. Journal of Climate 23.10, 2739-2758.
Koopmans T. C. (1962). On flexibility of future preference. Cowles Foundation for Research in Economics, Yale University, Cowles Foundation Discussion Papers 150.
Kreps D. M. and E. L. Porteus. (1978). Temporal resolution of uncertainty and dynamic choice theory. Econometrica 46.1, 185-200.
Lahsen M. (2005). Seductive Simulations? Uncertainty Distribution Around Climate Models. Social Studies of Science 35.6, 895-922.
Lehrer K. and Wagner, C. (1981). Rational Consensus in Science and Society, 165 pp. Dordrecht: Reidel.
Lempert R. J., Popper, S. W., Bankes, S. C. (2003). Shaping the Next One Hundred Years: New Methods for Quantitative Long-Term Policy Analysis, 208 pp. Santa Monica, CA: RAND Corporation, MR-1626-RPC.
Lenhard J. and E. Winsberg. (2010). Holism, entrenchment, and the future of climate model pluralism. Studies in History and Philosophy of Modern Physics 41, 253-262.
Linkov I. et al. (2014). Changing the resilience program. Nature Climate Change 4, 407-409.
List C. and C. Puppe. (2009). Judgment aggregation: a survey. Oxford Handbook of Rational and Social Choice (eds. P. Anand, C. Puppe and P. Pattanaik). Oxford: Oxford University Press.
Lloyd E. A. (2010). Confirmation and robustness of climate models. Philosophy of Science 77, 971-984.
Lloyd E. A. (2015). Model robustness as a confirmatory virtue: The case of climate science. Studies in History and Philosophy of Science 49, 58-68.
Lloyd E. A. (2009). Varieties of Support and Confirmation of Climate Models. Proceedings of the Aristotelian Society Supplementary Volume LXXXIII, 217-236.
Lloyd E. A. and N. Oreskes. (2019). Climate Change Attribution: When Does it Make Sense to Add Methods? Epistemology & Philosophy of Science 56.1, 185-201.
Lorenz E. (1995). Climate is what you expect. Prepared for publication by NCAR. Unpublished, 1-33.
Lusk, G. (2017). The Social Utility of Event Attribution: Liability, Adaptation, and Justice-Based Loss and Damage. Climatic Change 143, 201–12.
Mach, K. J., M. D. Mastrandrea, P. T. Freeman, and C. B. Field (2017). Unleashing Expert Judgment in Assessment. Global Environmental Change 44, 1–14.
Mann M. E., R. S. Bradley and M.K. Hughes (1998). Global-scale temperature patterns and climate forcing over the past six centuries. Nature 392, 779-787.
Maslin M. and P. Austin. (2012). Climate models at their limit?. Nature 486, 183-184.
Mastrandrea M. D., K. J. Mach, G.-K. Plattner, O. Edenhofer, T. F. Stocker, C. B. Field, K. L. Ebi, and P. R. Matschoss. (2011). The IPCC AR5 guidance note on consistent treatment of uncertainties: a common approach across the working groups. Climatic Change 108, 675-691.
McGuffie K. and A. Henderson-Sellers. (2005). A Climate Modelling Primer, 217 pp. New Jersey: Wiley.
McIntyre S. and R. McKitrick. (2003). Corrections to the Mann et al. (1998) proxy data base and northern hemispheric average temperature series. Energy & Environment 14.6, 751-771.
Mongin P. (1995). Consistent Bayesian Aggregation. Journal of Economic Theory 66.2, 313-51.
Nordhaus W. D. (2007). A Review of the Stern Review on the Economics of Climate Change. Journal of Economic Literature 45.3, 686-702.
Nordhaus W. D. (2008). A Question of Balance, 366 pp. New Haven, CT: Yale University Press.
Oreskes N. and E. M. Conway. (2012). Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming, 355 pp. New York: Bloomsbury Press.
Oreskes N. (2007). The Scientific Consensus on Climate Change: How Do We Know We’re Not Wrong? Climate Change: What It Means for Us, Our Children, and Our Grandchildren (eds. J. F. C. DiMento and P. Doughman), 65-99. Cambridge, MA: MIT Press.
Oreskes N., K. Shrader-Frechette and K. Belitz. (1994). Verification, validation, and confirmation of numerical models in the earth sciences. Science 263.5147, 641-646.
Parfit D. (1984). Reasons and Persons, 560 pp. Oxford: Clarendon Press.
Parker W. S. (2009). Confirmation and Adequacy for Purpose in Climate Modelling. Aristotelian Society Supplementary Volume 83.1, 233-249.
Parker W. S. (2010). Comparative Process Tracing and Climate Change Fingerprints. Philosophy of Science 77, 1083-1095.
Parker W. S. (2011). When Climate Models Agree: The Significance of Robust Model Predictions. Philosophy of Science 78.4, 579-600.
Parker W. S. (2013). Ensemble modeling, uncertainty and robust predictions. Wiley Interdisciplinary Reviews: Climate Change 4.3, 213-223.
Parker W. S. (2014). Values and Uncertainties in Climate Prediction, Revisited. Studies in History and Philosophy of Science Part A 46, 24-30.
Petersen A. C. (2012). Simulating Nature: A Philosophical Study of Computer-Simulation Uncertainties and Their Role in Climate Science and Policy Advice, 210 pp. Boca Raton, Florida: CRC Press.
Resnik M. (1987). Choices: an introduction to decision theory, 221 pp. Minneapolis: University of Minnesota Press.
Savage L. J. (1954). The Foundations of Statistics, 310 pp. New York: John Wiley & Sons.
Sen A. (1982). Approaches to the choice of discount rate for social benefit–cost analysis. Discounting for Time and Risk in Energy Policy (ed. R. C. Lind), 325-353. Washington, DC: Resources for the Future.
Sen A. (1970). Collective Choice and Social Welfare. San Francisco: Holden-Day Inc.
Sexton D. M. H., J. M. Murphy, M. Collins and M. J. Webb. (2012). Multivariate Probabilistic Projections Using Imperfect Climate Models. Part I: Outline of Methodology. Climate Dynamics 38, 2513-2542.
Sexton D. M. H., and J. M. Murphy. (2012). Multivariate Probabilistic Projections Using Imperfect Climate Models. Part II: Robustness of Methodological Choices and Consequences for Climate Sensitivity. Climate Dynamics 38, 2543-2558.
Shackley S., P. Young, S. Parkinson and B. Wynne. (1998). Uncertainty, Complexity and Concepts of Good Science in Climate Change Modelling: Are GCMs the Best Tools? Climatic Change 38, 159-205.
Smith L. A. and N. Stern. (2011). Uncertainty in science and its role in climate policy. Phil. Trans. R. Soc. A 369.1956, 4818-4841.
Spiegelhalter D. J. and H. Riesch. (2011). Don’t know, can’t know: embracing deeper uncertainties when analysing risks. Phil. Trans. R. Soc. A 369, 4730-4750.
Stainforth D. A., M. R. Allen, E. R. Tredger and L. A. Smith. (2007a). Confidence, Uncertainty and Decision-support Relevance in Climate Predictions. Philosophical Transactions of the Royal Society A 365, 2145-2161.
Stainforth D. A., T. E. Downing, R. Washington, A. Lopez and M. New. (2007b). Issues in the Interpretation of Climate Model Ensembles to Inform Decisions. Philosophical Transactions of the Royal Society A 365, 2163-2177.
Steele K. (2006). The precautionary principle: a new approach to public decision-making? Law, Probability and Risk 5, 19-31.
Steele K. and C. Werndl. (2013). Climate Models, Confirmation and Calibration. The British Journal for the Philosophy of Science 64, 609-635.
Steele K. and C. Werndl. (2015, forthcoming). The Need for a More Nuanced Picture on Use-Novelty and Double-Counting. Philosophy of Science.
Stern N. (2007). The Economics of Climate Change: The Stern Review, 692 pp. Cambridge: Cambridge University Press.
Stern, N. (2013). The Structure of Economic Modeling of the Potential Impacts of Climate Change: Grafting Gross Underestimation of Risk onto Already Narrow Scientific Models. Journal of Economic Literature 51.3, 838-859.
Thompson E., R. Frigg and C. Helgeson. (2016). Expert Judgment for Climate Change Adaptation. Philosophy of Science 83.5, 1110-1121.
von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behaviour, 739 pp. Princeton: Princeton University Press.
Walley P. (1991). Statistical Reasoning with Imprecise Probabilities, 706 pp. New York: Chapman and Hall.
Weisberg M. (2006). Robustness Analysis. Philosophy of Science 73, 730-742.
Weitzman M. L. (2009). On Modeling and Interpreting the Economics of Catastrophic Climate Change. The Review of Economics and Statistics 91.1, 1-19.
Werndl C. (2015). On defining climate and climate change. The British Journal for the Philosophy of Science, doi:10.1093/bjps/axu048.
Wilby R. L. and S. Dessai. (2010). Robust adaptation to climate change. Weather 65.7, 180-185.
Winsberg E. (2012). Values and Uncertainties in the Predictions of Global Climate Models. Kennedy Institute of Ethics Journal 22, 111-127.
Winsberg E. (2018). Philosophy and Climate Science. Cambridge: Cambridge University Press.
Winsberg E. and W. M. Goodwin. (2016). The Adventures of Climate Science in the Sweet Land of Idle Arguments. Studies in History and Philosophy of Modern Physics 54, 9-17.
Worrall J. (2010). Error, Tests, and Theory Confirmation. Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science (eds. D. G. Mayo and A. Spanos), 125-154. Cambridge: Cambridge University Press.
Wüthrich, N. (2017). Conceptualizing Uncertainty: An Assessment of the Uncertainty Framework of the Intergovernmental Panel on Climate Change. In EPSA15 Selected Papers, 95-107. Cham: Springer.
Causation
The question “What is causation?” may sound trivial: it is as sure as common knowledge can ever be that some things cause others, that there are causes and that they necessitate certain effects. We say that we know that what caused the president’s death was an assassin’s shot. But when asked why, we will most likely reply that it is because the latter was necessary for the former, an answer that, upon close philosophical examination, does not hold up. In a less direct way, the president’s grandmother’s giving birth to his mother was necessary for his death too. That, however, we would not describe as this death’s cause.
The first section of this article states the reasons why we should care about causation, including those that are non-philosophical. Sections 2 and 3 define the axis of the division into ontological and semantic analyses, with the Kantian and skeptical accounts as two alternatives. Set out there is also Hume’s pessimistic framework for thinking about causation—since before we ask what causation is, it is vital to consider whether we can come to know it at all.
Section 4 examines the semantic approaches, which analyze what it means to say that one thing causes another. The first of these, the regularity theories, turn out to be problematic when dealing with, among other things, unrepeatable and implausibly enormous cases. Such problems limit the ambitions of the regularity approach and motivate Lewis’s theory of causation as a chain of counterfactual dependence, which in turn faces the causal redundancy and causal transitivity objections. The scientifically minded interventionists try to reconnect our willingness to talk in terms of causation with our agency, while probability theories accommodate the indeterminacy of quantum physics and relax the strictness of exception-unfriendly regularity accounts; yet they risk falling into the trap of confounding causation and probability.
The next section brings us back to ontology. Since causation is hardly a particular entity, nominalists define it in terms of recurrence, positing nothing over and above its instances. Realists bring forward a relation of necessitation, seemingly in play whenever causation occurs. Dispositionalism claims that to cause is to dispose an effect to happen. Process theories base their analysis on the notions of process and transmission (for instance, of energy), which might capture well the nature of causation in the most physical sense.
Another historically significant family of approaches is the concern of Section 6, which examines how Kant removes causation from the domain of things-in-themselves and relocates it in the structure of consciousness. It has also inspired the agency views, which claim that agency is inextricably tied up with causal reasoning.
The seventh and final section deals with the most skeptical work on causation. Some, following Bertrand Russell, have tried to dispense with the concept altogether, believing it a relic of timeworn metaphysical speculation. Pluralism and thickism locate the failure of any attempt to define causation in the fact that what the word can mean is really a bundle of different concepts, or no single meaningful concept at all.
1. Why Causation Matters
Causation is a live topic across a number of disciplines, due to factors other than its philosophical interest. The second half of the twentieth century saw an increase in the availability of information about the social world, the growth of statistics and the disciplines it enables (such as economics and epidemiology), and the growth of computing power. This led, at first, to the prospect of much-improved policy and individual choice through analysis of all this data, and especially in the early twenty-first century, to the advent of potentially useful artificial intelligence that might be able to achieve another step-change in the same direction. But in the background of all of this lurks the specter of causation. Using information to inform goal-directed action often seems to require more than mere extrapolation or projection. It often seems to require that we understand something of the causal nature of the situation. This has seemed painfully obvious to some, but not to others. Increasing quantities of information and abilities to process it force us to decide whether or not causation is part of this march of progress or an obstacle on the road.
So much for why people care about causation. What is this thing that we care about so much?
To paraphrase the great physicist Richard Feynman, it is safe to say that nobody understands causation. But unlike quantum physics, causation is not a useful calculating device yielding astoundingly accurate predictions, and those who wish to use causal reasoning for any actual purpose do not have the luxury of following Feynman’s injunction to “shut up and calculate”. The philosophers cannot be pushed into a room and left to debate causation; the door cannot be closed on conceptual debate.
The remainder of this section offers a summary of the main elements of disagreement. The next section presents a “family tree” of different historical and still common views on the topic, which may help to make some sense of the state of the debate.
Some philosophers have asked what causation is, that is, they have asked an ontological question. Some of these have answered that it is something over and above (or at least of a different kind from) its instances: that there is a “necessitation relation” that is a universal rather than a particular thing, and in which cause-effect pairs participate, or of which they partake, or something similar “in virtue” of which they instantiate causation (Armstrong, 1983). These are realists about causation (noting that others discussed in this paragraph are also realists in a more general sense, but not about universals). Others, perhaps a majority, believe that causation is something that supervenes upon (or is ultimately nothing over and above) its instances (Lewis, 1983; Mackie, 1974). These are nominalists. Yet others believe that it is something somewhat different from either option: a disposition, or a bundle of dispositions, which are taken to be fundamental (Mumford & Anjum, 2011). These are dispositionalists.
Second, philosophers have sought a semantic analysis of causation, trying to work out what “cause” and cognates mean, in some deeper sense of “meaning” than a dictionary entry can satisfy. (It is worth bearing in mind, however, that the ontological and semantic projects are often pursued together, and cannot always be separated.) Some nominalists believe it is a form of regularity holding between distinct existences (Mackie, 1974). These are regularity theorists. Others, counterfactual theorists, believe it is a special kind of counterfactual dependence between distinct existences (Lewis, 1973a), and others hold that causes raise the probability of their effects in a special way (Eells, 1991; Suppes, 1970). Among counterfactual theorists are various subsets, notably interventionists (for example, Woodward, 2003) and contrastivists (for example, Schaffer, 2007). There is also an overlapping subset of thinkers with a non-philosophical motivation, and sometimes background, who develop technical frameworks for the purpose of performing causal inference and, in doing so, define causation, thus straying into the territory of offering semantic analysis (Hernán & Robins, 2020; Pearl, 2009; Rubin, 1974). Out of kilter with the historical motivation of those approaching counterfactual theorizing from a philosophical angle, some of those coming from practical angles appear not to be nominalists (Pearl & Mackenzie, 2018). Yet others, who may or may not be nominalists, hold that causation is a pre-scientific or “folk science” notion which, like “energy”, should be mapped onto a property identified by our current best science, even if that means deviating from the pre-scientific notion (Dowe, 2000).
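The probability-raising idea just mentioned admits of a compact first pass, stated below. This is a textbook simplification rather than any one author’s final analysis; Suppes and Eells, for instance, add further conditions (such as requiring the inequality to hold across suitable background contexts) precisely in order to separate causation from mere correlation:
\[
C \text{ causes } E \quad \text{only if} \quad P(E \mid C) > P(E \mid \neg C).
\]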
Third, there are those who take a Kantian approach. While this is an answer to ontological questions about causation, it is reasonably treated in a separate category, different from the ontological approach mentioned first above in this section, because the question Kant tried to answer is better summarized not as “What sort of thing is causation?” but “Is causation a thing at all?” Kant himself thought that causation is a constitutive condition of experience (Kant, 1781), thus not a part of the world, but a part of us—a way we experience the world, without which experience would be altogether impossible. Late twentieth-century thinkers suggested that causation is not a necessary precondition of all experience but, more modestly, a dispositional property of us to react in certain ways—a secondary property, like color—arising from the fact that we are agents (Menzies & Price, 1993).
The fourth approach to causation is, in a broad sense, skeptical. Thus some have taken the view that it is a redundant notion, one that ought to be dispensed with in favor of modern scientific theory (Russell, 1918). Such thinkers do not have a standard name but might reasonably be called republicans, following a famous line of Bertrand Russell’s (see the first subsection of Section 7). Some (pluralists) believe that there is no single concept of causation but a plurality of related concepts which we lump together under the word “causation” for some reason other than that there is such a thing as causation (Cartwright, 2007; Stapleton, 2008). Yet another view, which might be called thickism and which may or may not be a form of pluralism, holds that causal concepts are “thick”, as some have suggested for ethical concepts (Anscombe, 1958; although Anscombe did not use this term herself). That is, the fundamental referents of causal judgements are not causes, but kicks, pushes, and so forth, out of which there is no causal component to be abstracted, extracted, or meaningfully studied (Anscombe, 1969; Cartwright, 1983).
Cutting across all these positions is a question as to what the causal relata are, if indeed causation is a relation at all. Some say they are events (Lewis, 1973a, 1986); others, aspects (Paul, 2004); still others, facts (Mellor, 1995), among other ideas.
Disagreement about fundamentals is great news if you are a philosopher, because it gives you plenty to work on. It is a field of engagement that has not settled into trench warfare between a few big guns and their troops. It is indicative of a really fruitful research area, one with live problems, fast-paced developments, and connections with real life—that specter that lurks in the background of philosophy seminar rooms and lecture halls, just as causation lurks in the background of more practical engagements.
However, confusion about fundamentals is not great news if you are trying to make the best sense of the data you have collected, looking for guidance on how to convince a judge that your client is or is not liable, trying to make a decision about whether to ban a certain food additive or wondering how your investment will respond to the realization of a certain geopolitical risk. It is certainly not helpful if one is trying to decide what will be the most effective public health measure to slow the spread of an epidemic.
2. Hume’s Challenge
David Hume posed the questions that all the ideas discussed in the remainder of this article attempt to answer. He had various motivations, but a fair abridgement might be as follows.
Start with the obvious fact that we frequently have beliefs about what will happen, or is happening elsewhere right now, or has happened in the past, or, more grandly, what happens in general. One of Hume’s examples is that the sun will rise tomorrow. An example he gives of a general belief is that bread, in general, is nourishing. How do we arrive at these beliefs?
Hume argues that such beliefs derive from experience. We believe the sun rises because we have experienced it rising on all previous mornings. We believe bread is nourishing because it has always been nourishing when we have encountered it in our experience.
However, Hume argues that this is an inadequate justification on its own for the kind of inference in question. There is no contradiction in supposing that the sun will simply not rise tomorrow. This would not be logically incompatible with previous experience. Previous experience does not render it impossible. On the contrary, we can easily imagine such a situation, perhaps use it as the premise for a story, and so forth. Similar remarks apply to the nourishing effects of bread, and indeed to all our beliefs that cannot be justified logically (or mathematically, if that is different) from some indisputable principles.
In arguing thus, Hume might be understood as reacting to the rationalist component of the emerging scientific worldview, that component that emphasized the ability of the human mind to reach out and understand. Descartes believed that through the exercise of reason we could obtain knowledge of the world of experience. Newton believed that the world of experience was indeed governed by some kind of mathematical necessity or numerical pattern, which our reason could uncover, and thus felt able to draw universal conclusions from a little, local data. Hume rejected the confidence characteristic of both Descartes and Newton. Given the central role that this confidence about the power of the human mind played in the founding of modern science, Hume, and empiricists more generally, might be seen as offering not a question about common sense inferences, but a foundational critique of one of the central impulses of the entire scientific enterprise—perhaps not how twentieth and twenty-first-century philosophers in the Anglo-American tradition would like to see their ancestry and inspiration.
Hume’s argument was simple and compelling and instantiated what appears to be a reasonably novel argumentative pattern or move. He took a metaphysical question and turned it into an epistemological one. Thus he started with “What is necessary connection?” and moved on to “How do we know about necessary connection?”
The answer to the latter question, he claimed, is that we do not know about it at all, because the only kind of necessity we can make sense of is that of logical and mathematical necessity. We know about the necessity of logic and mathematics through examining the relevant “ideas”, or concepts, and seeing that certain combinations necessitate others. The contrary would be contradictory, and we can test for this by trying to imagine it. Gandalf is a wizard, and all wizards have staffs; we cannot conceive of these claims being true and yet Gandalf being staff-less. Once we have the ideas indicated in those claims, Gandalf’s staff ownership status is settled.
Experience, however, offers no necessity. Things happen, while we do not perceive their being “made” to happen. Hume’s argument to establish this is the flip side of his argument in favor of our knowledge of a priori truths. He challenges us to imagine causes occurring without their usual effects: bread failing to nourish, billiard balls going into orbit when we strike them (a somewhat augmented version of Hume’s own example), and so forth. It seems that we can do this easily. So we cannot claim to be able to access necessity in the empirical world in this way. We perceive and experience the constant conjunction of cause and effect, and we may find it fanciful to imagine stepping from a window and gently floating to the ground, but we can imagine it, and sometimes do, both deliberately and involuntarily (who has not dreamed they can fly?). But Hume agrees with Descartes that we cannot even dream that two and two make five (provided we clearly comprehend those notions in our dream; of course one can have a fuzzy dream in which one accepts the claim that two and two make five, without having the ideas of two, plus, equals and five in clear focus).
Hume’s skepticism about our knowledge of causation leads him to skepticism about the nature of causation: the metaphysical question is given an epistemological treatment, and then the answer returned to the metaphysical question is epistemologically motivated. His conclusion is that, for all we can tell, there is no necessary connection, there is only a series of constant conjunctions, usually called regularities. This does not mean that there is no causal necessity, only that there is no reason to believe that there is. For the Enlightenment project of basing knowledge on reason rather than faith, this is devastating.
The constraint of metaphysical speculation by epistemological considerations remains a central theme of twenty-first century philosophy, even if it has somewhat loosened its hold in this time. But Hume took his critique a step further, with further profound significance for this whole philosophical tradition. He asked what we even mean by “cause”, and specifically, by that component of cause he calls “necessary connection”. (He identifies two others: temporal order and spatiotemporal contiguity. These are also topics of philosophical and indeed physical debate, but are less prominent in early twenty-first century philosophy, and thus are not discussed in this article.) He argues that we cannot even articulate what it would be for an event in the world we experience to make another happen.
The argument reuses familiar material. We have a decent grasp of logical necessity: it consists in the incoherence of denying the claim in question, which (in Hume’s view) we can easily spot. But that is not the necessary connection we seek. What other kind of necessity could there be? If it does not involve the impossibility of what is necessitated, then in what sense is it necessitated? This is not a rhetorical question; it is a genuine request for explanation. Supplying one is, at best, difficult; at worst, it is impossible. Some have tried (several attempts are discussed throughout the remainder of the article) but most have taken the view that it is impossible. Hume’s own explanation is that necessary connection is nothing more than a feeling, the expectation created in us by endless experience of same cause followed by same effect. Granted, this is a meaning for “necessary connection”; but it is one that robs “necessary” of anything resembling necessity.
The move from “What is X?” to “What does our concept of X mean?” has driven philosophers even harder than the idea that metaphysical speculation must be epistemologically constrained—partly because philosophical knowledge was thought for a long time to be constrained to knowledge of meanings; but that is another story (see Ch 10 of Broadbent, 2016).
This is the background to all subsequent work on causation as rejuvenated by the Anglo-American tradition, and also to the practical questions that arise. The ideas that we cannot directly perceive causation, and that we cannot reason logically from cause to effect, have repeatedly given rise to obstacles in science, law, policy, history, sports, business, politics—more or less any “world-oriented” activity you can think of. The next section summarizes the ways that people have understood this challenge: the most important questions they think it raises and their answers to these questions.
3. A Family Tree of Causal Theories
Here is a diagram indicating one possible way of understanding the relationships between different historically significant and still influential theories of causation, approaches to it, and understandings of the philosophical problems it poses, including some approaches on which those problems are not philosophical at all.
Figure 1. A “family tree” of theories of causation
At the top level are approaches to causation corresponding to the kinds of questions one might deem important to ask about it. At the second level are theories that have been offered in response to these questions. Some of these theories have sub-theories which do not really merit their own separate level, and are dealt with in this article as variations on a theme (each receiving treatment in its own subsection).
Some of these theories motivate each other; in particular, nominalism and regularity theories often go hand in hand. Others are relatively independent, while some are outright incompatible. These compatibility relationships may themselves be disputed.
Two points should be noted regarding this family tree. First, an important topic is absent: the nature of the causal relata. This is because a stance about their nature does not by itself constitute a position about causation; it cuts across this family tree and features importantly in some theories but not in others. While some philosophers have argued that it is very important (Mellor, 1995, 2004; Paul, 2004; Schaffer, 2007), and featured it centrally in their theories of causation (second level on the tree), it does not feature centrally in any approach to causation (top level on the tree), except insofar as everyone agrees that the causal relata, whatever they are, must be distinct to avoid mixing up causal and constitutive facts. The topic is skipped in this article because, while it is interesting, it is somewhat orthogonal.
The second point to note about this family tree is that others are possible. There are many ways one might understand twenty-first-century work on causation, and thus there are other “family trees” implicit in other works, including other introductions to the topic. One might even think that no such family tree is useful. The one presented above is a tool only, one that the reader might find useful, but it should ultimately be treated as itself a topic for debate, dispute, amendment, or rejection.
4. Semantic Analyses
Semantic analyses of causation seek to give the meaning of causal assertions. They typically take “c causes e” to be the exemplary case, where “c” and “e” may be one of a number of things: facts, events, aspects, and so forth. (Here, lower case letters c and e are used to denote some particular cause and effect respectively. Upper case letters C and E refer to classes and yield general causal claims, as in “smoking causes lung cancer”.) Whatever they are, they are universally agreed to be distinct, since otherwise we would wrongly confuse constitutive with causal relations. My T-shirt’s having yellow bananas might end up as a cause of its having yellow shapes on it, for example, which is widely considered to be unacceptable—because it is quite different from my T-shirt’s yellow bananas causing the waitress bringing me my coffee to stare.
The three main positions are regularity theories, probabilistic theories, and counterfactual theories.
a. Regularity Theories
The regularity theory implies that causes and effects are not usually one-off pairs, but recurrent. Not only is the coffee I just drank causing me to perk up, but drinking coffee often has this effect. The regularity view claims that two facts suffice to explain causation: that causes are followed by their effects, and that cause-effect pairs recur. Coincidental pairings, by contrast, do not typically recur. I scratched my nose while drinking the coffee, and this scratching was followed by my perking up. But nose-scratching is not generally followed by perking up, whereas coffee-drinking is. Coffee-drinking and perking up are part of a regularity; in Hume’s phrase, they are constantly conjoined. The same cannot be said of nose-scratching and perking up.
Obviously, the tool needs sharpening. Most of the Cs that we encounter are not always followed by Es, and most of the Es that we encounter are not always (that is, not only) caused by Cs. The assassin shoots (c) the president, who dies (e). But assassins often miss. Moreover, presidents often die of things other than being shot.
David Hume is sometimes presented as offering a regularity theory of causation (usually on the basis of Pt 5 of Hume, 1748), but this is crude at best and downright false at worst (Garrett, 2015). More plausibly, he offered regularities as the most we can hope for in ontology of causation, that is, as the basis of any account of what there might be “in the objects” that most closely corresponds to the various causal notions we have. But his approach to semantics was revisionary; he took “cause” to express a feeling that the experience of regularity produces in us. Knowing whether such regularities continue in the objects beyond our experience requires that we know of some sort of necessary connection sustaining the regularity. And the closest thing to necessary connection that we know about is regularity. We are in a justificatory circle.
It was John Stuart Mill who took Hume’s regularity ontology and turned it into a regularity theory of causation (Mill, 1882). The first thing he did was to address the obvious point that causes and effects are not constantly conjoined, in either direction. He confined the direction of constancy to the cause-effect direction, so that causes are always followed by their effects, but effects need not be preceded by the same causes. He then expanded the definition of “cause” to encompass the enormous totality of conditions that suffices for the effect. So, if e is the president’s death, then to say that c caused e is not to say that Es are always preceded by Cs, but rather that Cs are always followed by Es. Moreover, when we speak of the president’s being shot as the cause, we are being casual and strictly inaccurate. Strictly speaking, c is not the cause, but c*, being the entirety of things that were in place, including the shot, such that this entirety of things is sufficient for the president’s death. Strictly speaking, c* is the cause of e. There is no mysterious necessitation invoked because “sufficient” here just means “is always followed by”. When the wind is as it was, and the president is where he was, and the assassin aims so, and the gun fires thus, and so on and so forth, the president always dies, in all situations of this kind.
Mill thought that exceptionless regularities could be achieved in this way. In fact, he believed that the “law of causality”, being the exceptionless regularity between causes (properly defined) and effects, was the only true law (Mill, 1882). All the laws of science, he believed, had exceptions: objects falling in air do not fall as Newton’s laws of motion say they should, for example (this example is not Mill’s own). But objects released just so, at just such a temperature and pressure, with just such a mass, shape and surface texture, always fall in this way. Thus, according to Mill, the law of causality was the fundamental scientific law.
This theory faces a number of objections, even setting aside the lofty claims about the status of the “law of causality”. The subsubsections below discuss four of them.
i. The Problem of Implausibly Enormous Cases
To be truly sufficient for an effect, a cause must be enormous. It must include everything that, if on another occasion it is different, yields an overall condition that is followed by a different effect. It is questionable that “cause” is reasonably understood as referring to such an enormousness.
Enormousness poses problems for more than just the analysis of the common meaning of “cause”. It also makes it unclear how we can arrive at and use knowledge of causes. These are such gigantic things that they are bound to be practically unknowable to us. What licenses our merry inference from a shot by an ace assassin who has never yet missed to the imminent death of the president is not the fact that the assassin has never yet missed, since this constancy is incidental; the causal regularity holds between huge preceding conditions, and on the previous occasions when the assassin shot, these huge conditions may well not have been the same at all.
It is not clear that such objections are compelling, however. The idea of Mill’s account concerns the nature of causation and not our knowledge of it, much less our casual inferences, which might well depend on highly contingent and local regularities, which might be underwritten by truly causal ones without instantiating them. Mill himself provides a lengthy discussion of the use of causal language to pick out one part of the whole cause. As for getting the details right, Mill’s central idea seems to admit of other implementations, and an advocate would want to try these.
There was a literature in the early-to-middle twentieth century trying, in effect, to mend Mill’s account so as to get the blend of necessity and sufficiency just right for correctly characterizing the semantics of “cause”, against a background assumption that Millian regularity was the ultimate ontological truth about causation. This literature reached its final form in Jonathan Mackie’s INUS analysis (Mackie, 1974).
Mackie offered more than one account of causation. His INUS analysis was an account of causation “in the objects”, that is, an account in the Humean spirit of offering the closest possible objective characterization of what we appear to mean by causal judgements, without necessarily supposing that causal judgements are ultimately meaningful or that they ultimately refer to anything objective.
Mackie’s view was that a cause is an insufficient but necessary part of an unnecessary but sufficient condition for the effect. Bear in mind that “necessary” and “sufficient” are to be understood materially, non-modally, as expressing regularities: “x is necessary for y” means “y is always accompanied (or, in the causal case, preceded) by x”, and “x is sufficient for y” means “x is always accompanied (or, in the causal case, followed) by y”. If we set aside temporal order, necessity and sufficiency are thus inter-definable; for x to be sufficient for y is for y to be necessary for x, and vice versa.
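These material readings can be made fully explicit. As a minimal sketch (the regimentation into situations is ours, not Mackie’s), write C(s) to mean that an event of type C occurs in situation s:
\[
C \text{ is sufficient for } E \iff \forall s\,\big(C(s) \rightarrow E(s)\big), \qquad C \text{ is necessary for } E \iff \forall s\,\big(E(s) \rightarrow C(s)\big).
\]
On these definitions the inter-definability is immediate: C is sufficient for E exactly when E is necessary for C, with temporal order supplying the only asymmetry.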
Considering our assassin, how does his shot count as a cause, according to the INUS account?
Take the I of INUS first. The assassin’s shot was clearly Insufficient for the president’s death. The president might suddenly have dipped his head to bestow a medal on a citizen (Forsyth, 1971). All sorts of things can and do intervene on such occasions. Shots of this nature are not universally followed by deaths. c is Insufficient for e.
Second, take the N. The shot is clearly Necessary in some sense for the death. In that situation, without the shot, there would have been no death. In strict regularity-talk, such situations are not followed by deaths in the absence of a shot. At the same time, we can hardly say that shots are required for presidents to die; most presidents find other ways to discharge this mortal duty. Mackie explains this limited necessity by saying not that c is Necessary for e, but that c is a Necessary part of a larger condition that preceded e.
Moving to the U, this larger condition is Unnecessary for the effect. There are plenty of presidential deaths caused by things other than shots, as just discussed; this was why we refrained from saying that the shot is necessary for the death. c is an Insufficient but Necessary part of an Unnecessary condition for e.
Finally, the S. The condition of which c is a part is unnecessary (so far as the occurrence of e is concerned), but it is sufficient. e happens, and it is no coincidence that it does. In strict regularity talk, every such condition is followed by an E. There is no way for an assassin to shoot just so, in just those conditions, which include the non-ducking of the president, his lack of a bulletproof vest, and so forth, and for the president not to die. Thus c is an Insufficient but Necessary part of an Unnecessary but Sufficient condition for e. To state it explicitly:
c is a cause of e if and only if c is a necessary but insufficient part of an unnecessary but sufficient condition for e.
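The definition can also be put schematically, in the spirit of Mackie’s own more formal presentations (what follows is a paraphrase, using the material notions of necessity and sufficiency introduced above). Here X stands for the rest of the actual sufficient condition (the wind, the aim, the absence of a bulletproof vest, and so on), and Y for the disjunction of all the other conditions sufficient for the effect:
\[
c \text{ is an INUS condition of } e \iff \exists X\,\exists Y \text{ such that } (c \wedge X) \vee Y \text{ is necessary and sufficient for } e,
\]
\[
\text{while neither } c \text{ alone nor } X \text{ alone is sufficient for } e.
\]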
In essence, Mackie borrows Mill’s “whole cause” idea, but drops the implausible idea that “cause” strictly refers to the “whole cause”. Instead, he makes “cause” refer to a part of the whole cause, one that satisfies the special conditions.
As well as addressing the problem of enormousness, which is fundamentally a plausibility objection, Mackie intends his INUS account to address the further and probably more pressing objections which follow.
ii. The Problem of the Common Cause
An immediate problem for any regularity account of causation is that, just as effects have many causes, causes also have many effects, and these effects may accompany each other very regularly. Recall Mill’s clarification that effects need not be constantly preceded by the same causes, and that “constant conjunction” was in this sense directional: same causes are followed by same effects, but not vice versa. This is strongly intuitive—as the saying goes, there is more than one way to skin a cat. Essentially, Mill tells us that we do not have to worry that effects are not always preceded by the same causes.
However, we are still left in a predicament, even with this unidirectional constant conjunction of same-cause-same-effect. When combined with the fact that a single cause always has multiple effects, we seem to land up with the result that constant conjunctions will also obtain between these effects. Cs are always followed by E1s, and Cs are always followed by E2s. So, whenever there is a C, we have both an E1 and an E2; and when, as in many such cases, E1s occur only in the wake of Cs, it follows that whenever we have an E1, we have an E2, and vice versa.
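The predicament can be displayed as a short derivation in the regularity idiom introduced earlier. The third premise, that E1s occur only in the wake of Cs, is the barometer-style assumption that makes the trouble vivid:
\[
\forall s\,\big(C(s) \rightarrow E_1(s)\big), \quad \forall s\,\big(C(s) \rightarrow E_2(s)\big), \quad \forall s\,\big(E_1(s) \rightarrow C(s)\big) \;\vdash\; \forall s\,\big(E_1(s) \rightarrow E_2(s)\big).
\]
Every E1-situation is a C-situation, hence an E2-situation: the two effects are constantly conjoined, although neither causes the other.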
How does a regularity theory get out of this without dropping the fundamental analytical tool it uses to distinguish cause from coincidence, namely the unfailing succession of same effect upon same cause, bearing in mind that the singular “effect” must now be replaced by the plural “effects”?
Here is an example of the sort of problem for naïve regularity theories that Mackie’s account is supposed to solve. My alarm sounds, and I get out of bed. Shortly afterwards, our young baby starts to scream. This happens daily: the alarm wakes me up, and I get out of bed; but it also wakes the baby up. I know that it is not my getting out of bed that causes the baby to scream. How? Because I get out of bed in the night at various other times, and the baby does not wake up on those occasions; because my climbing out of bed is too quiet for a baby in another room to hear; and for other such reasons. Also, even when I sleep through the alarm (or try to), the baby wakes up. But what if the connections were as invariable as each other—there were no (or equally many) exceptions?
Consider this classic example. The air pressure drops, and my barometer’s needle indicates that there will be a storm. There is a storm. My barometer’s needle dropping obviously does not cause the storm. But, as a reliable storm-predictor, it is followed by a storm regularly—that is the whole point of barometers.
Mackie’s INUS theory supplies the following answer. The barometer’s falling is not an INUS condition for the storm’s occurrence, because situations that are exactly similar except for the absence of a barometer can and do occur. The falling of the barometer may be a part of a sufficient condition for the storm to occur, but it is not a necessary part of that condition. Storms happen even when no barometer is there to predict them. (Likewise, the storm is not an INUS condition for the barometer falling, in case that is a worry despite the temporal order, because barometers can be induced to fall in vacuum chambers.)
Thus the intuition I have in the alarm/baby case is the correct one; the regularity between alarm and baby waking is persistent regardless of my getting out of bed, and that between my getting out of bed and the baby waking fails in otherwise similar occasions where there is no alarm.
However, this all depends on a weakening of the initial idea behind the regularity theory, since it amounts to accepting that there are many cases of causation without a directly underlying regularity: causal claims that are true, not in virtue of, say, match-strikes being invariably followed by flames, but for a more complicated reason, albeit one that makes use of the notion of regularity. Hume’s idea that we observe like causes followed by like effects suffers a blow, and together with it, the epistemological motivation of the regularity theory, as well as its theoretical elegance. It is to this extent a concession on the part of the regularity theory. There are other cases where we do want to say that c causes e even though Cs are not always followed by Es.
In fact, such is the majority of cases. Striking the match causes it to light even though many match-strikes fail: they produce no spark, break the match, and so forth. There are similar scenarios in which the match is struck but there is no flame; yet the apparent conclusion that the match strike does not cause the flame cannot be accepted. Perhaps we must insist that in those scenarios the match is not struck exactly so, but now we are not analyzing the meaning of “striking the match caused it to light”, since we are substituting an unknown and complicated event for “striking the match” for the sake of insisting that causes are always followed by their effects, which is a failing of the analytical tool.
Common cause situations thus present prima facie difficulties for the regularity account. Mackie’s account may solve the problem; nonetheless, if there were an account of causation that did not face the problem in the first place, or that dealt with the problem with less cost to the guiding idea of the regularity approach and with less complication, it would be even more attractive. This is one of the primary advantages claimed by the two major alternatives, counterfactual and probabilistic accounts, which are discussed in their two appropriate subsections below.
iii. The Problem of Overdetermination
As noted in the subsubsection on the problem of the common cause, many effects can be caused in more than one way. A president may be assassinated with a bullet or a poison. The regularity theory can deal with this easily by confining the relevant kind of regularity to one direction. In Mackie’s account, causes are not sufficient for their effects, which may occur in other ways. But the problem has another form. If an effect may occur in more than one way, what is to stop more than one of these ways from being present at the same time? Assassin 1 shoots the president, but Assassin 2’s on-target bullet would have done the job if Assassin 1 had missed. c causes e, but c’ would have caused e otherwise.
Such situations are referred to by various names. This article uses the term redundancy as a catch-all for any situation like this, in which a cause is “redundant” in the sense that the effect would have occurred without the actual cause. (Strictly, all that is required is that the effect might have occurred without the actual cause, because the negation of “would not” is “might” (Lewis, 1973b).) Within redundancy, we can distinguish symmetric from asymmetric overdetermination. Symmetric overdetermination occurs when two causes appear absolutely on a par. Suppose two assassins shoot at just the same time, and both bullets enter the president’s heart at just the same time. Either would have sufficed, but in the event, both were present. Neither is “more causal”. The example is not contrived. Such situations are quite common. You and I both shout “Look out!” to the pedestrian about to step in front of a car, and both our shouts are loud enough to cause the pedestrian to jump back. And so forth.
In asymmetric overdetermination, one of the events is naturally regarded as the cause, while the other is not, but both are sufficient in the circumstances for the effect. One is a back-up, which would have caused the effect had the actual cause not done so. For example, suppose that Assassin 2 had fired a little later than Assassin 1, and that the president was already dead by the time Assassin 2’s bullet arrived. Assassin 2’s shot did not kill the president, but had Assassin 1 not shot (or had he not shot accurately enough), Assassin 2’s shot would still have killed the president. Such cases are more commonly referred to as preemption, which is the terminology used in this article since it is more descriptive: the first cause preempts the second one. Again, preemption examples need not be contrived or far-fetched. Suppose I shout “Look out!” a moment after you, but still soon enough for the pedestrian to step back. Your shout caused the pedestrian to step back, but had you not shouted, my shout would have caused the pedestrian to step back. There is nothing outlandish about this; such things happen all the time.
The difficulty here is that there are two INUS conditions where there is only one cause. Assassin 1’s shot is a necessary part of a sufficient condition for the effect. But so is Assassin 2’s shot. However, Assassin 1’s shot is the true cause.
In the symmetric overdetermination case, one may take the view that they are both equally causes of the effect in question. However, there is still the preemption case, where Assassin 1 did the killing and not Assassin 2. (If you doubt this, imagine they are a pair of competitive twins, counting their kills, and thus racing to be first to the president in this case; Assassin 1 would definitely insist on chalking this one up as a win).
Causal redundancy has remained a thorn in the side of all mainstream analyses of causation, including the counterfactual account (see the appropriate subsection). What makes it so troubling is that we use this feature of causation all the time. Just as we exploit the fact that causes have multiple effects when we are devising measuring instruments, we exploit the fact that we can bring a desired effect about in more than one way every time we set up a failsafe mechanism, a Plan B, a second line of defense, and so forth. Causal redundancy is no mere philosopher’s riddle: it is a useful part of our pragmatic reasoning. Accounting for the fact that we use “cause” in situations where there is also a redundant would-be cause thus seems central to explicating “cause” at all.
iv. The Problem of Unrepeatability
This is less discussed than the problems of the common cause and overdetermination, but it is a serious problem for any regularity account. The problem was elegantly formulated by Bertrand Russell, who pointed out that, once a cause is specified so fully that its effect is inevitable, it is at best implausible and perhaps (physically) impossible that the whole cause occur more than once (Russell, 1918). The fundamental idea of the regularity approach is that cause-effect pairs instantiate regularities in a way that coincidences do not. This objection tells against this fundamental idea. It is not clear what the regularity theorist can reply. She might weaken the idea of regularity to admit of exceptions, but then the door is open to coincidences, since my nose-scratching just before the president’s death might be absent on another such occasion, and yet this might no longer count against its candidacy for cause. At any rate, the problem is a real one, casting doubt on the entire project of analyzing causation in terms of regularity.
We might respond by substituting a weaker notion than true sufficiency: something like “normally followed by”. Nose-scratchings are not normally followed by presidents’ deaths. However, this is not a great solution for regularity theories, because (a) the weaker notion of sufficiency is a departure from the sort of clarity that regularity theorists would otherwise celebrate, and (b) a similar battery of objections will apply: we can find events that, by coincidence, are normally followed by others, merely by chance. Indeed, if enough things happen, so that there are enough events, we can be very confident of finding at least some such patterns of events.
b. Counterfactual Theories
Mackie developed a counterfactual theory of the concept of causation, alongside his theory of causation in objects as regularity. However, at almost exactly the same time, a philosopher at the other end of his career (near the start) developed a theory sharing deep convictions about the fundamental nature of regularities, the priority of singular causal judgements, and the use of counterfactuals to supply their semantics, and yet setting the study of causation on an entirely new path. This approach dominated the philosophical landscape for nearly half a century, not only as a prominent theory of causation, but as an outstanding piece of philosophical work; it served as an exemplar for analytic metaphysicians and formed a central part of the 1970s story of the emboldening of analytic metaphysics, following years in exile while positivism reigned.
David Lewis’s counterfactual theory of causation (Lewis, 1973a) starts with the observation that, commonly, if the cause had not happened, the effect would not have happened. To build a theory from this observation, Lewis had three major tasks. First, he had to explain what “would” means in this context; he had to provide a semantics for counterfactuals. Second, he had to deal with cases where counterfactuals appear to be true without causation being present, so that counterfactual dependence appears not to be sufficient for causation (since if it were, a lot of non-causes would be counted as causes). Third, he had to deal with cases where it appears that, if the cause had not happened, the effect would still have happened anyway: cases of causal redundancy, where counterfactual dependence appears not to be necessary for causation.
For a considerable period of time, the consensus was that Lewis had succeeded with the first two tasks but failed the third. In the early years of the twenty-first century, however, the second task, establishing that counterfactual dependence is sufficient for causation, also came under critical scrutiny.
Lewis’s theory of causation does not state that the effect counterfactually depends on the cause, but rather that c causes e if and only if there is a causal chain running from c to e whose links consist in counterfactual dependence. The use of chains is a response to the problem of preemption, discussed in the subsubsection on causal redundancy below. Counterfactual dependence is thus not a necessary condition for causation: what is necessary is a chain of counterfactual dependence, not an overarching dependence of effect on cause. It is, however, a sufficient condition, since whenever we do find counterfactual dependence (of the “right sort”), we find causation.
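Using the box-arrow notation that is standard for the counterfactual conditional, and writing O(c) for “c occurs”, the two-part structure of the theory can be sketched as follows (a compressed paraphrase of Lewis, 1973a):
\[
e \text{ depends counterfactually on } c \iff \big(O(c) \;\Box\!\!\rightarrow\; O(e)\big) \wedge \big(\neg O(c) \;\Box\!\!\rightarrow\; \neg O(e)\big);
\]
\[
c \text{ causes } e \iff \text{there is a chain } c, d_1, \ldots, d_n, e \text{ in which each event depends counterfactually on its predecessor.}
\]
For events that both actually occur, the first conjunct holds trivially, so dependence reduces to the familiar “had c not occurred, e would not have occurred”.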
The best way to understand Lewis’s theory is through his responses to problems (as he himself sets it out). This is the approach taken in the remainder of this subsection.
i. The Problems of Common Cause, Enormousness and Unrepeatability
Lewis takes his theory to be able to deal easily with the problem of the common cause, which he parcels with another problem he calls the problem of effects. This is the problem that causes might be thought to counterfactually depend on their effects as well as the other way around. Not so, says Lewis, because counterfactual dependence is almost always forward-tracking (Lewis, 1973a, 1973b, 1979). The cases where it is not are easily identifiable, and these occurrences of counterfactual dependence are not apt for analyzing causation, just as counterfactuals representing constitutive relations (such as “If I were not typing, I would not be typing fast”) are not apt.
Lewis’s argument for the ban on backtracking is as follows. Suppose a spark causes a fire. We can imagine a situation where, with a small antecedent change, the fire does not occur. This change may involve breaking a law of nature (Lewis calls such changes “miracles”) but after that, the world may roll on exactly as it would under our laws (Lewis, 1979). This world is therefore very similar to ours, differing in one minor respect.
Now consider what we mean when we start a sentence with “If the fire had not occurred…” By saying so, we do not mean that the spark would not have occurred either. For otherwise, we would also have to suppose that the wire was never exposed, and thus that the careless slicing of a workman’s knife did not occur, and therefore that the workman was more conscientious, perhaps because his upbringing was different, and that of his parents before him, and…? Lewis says: that cannot be. When we assert a counterfactual, we do not mean anything like that at all. Rather, we mean that the spark still occurred, along with most other earlier events; but for some or other reason, the fire did not.
Why this is so is a matter of considerable debate, and much further work by Lewis himself. For these purposes, however, all that is needed is the idea that, by the time when the fire occurs, the spark is part of history, and there will be some other way to stop the fire—some other small “miracle”—that occurs later, and thus preserves a larger degree of historical match with the actual world, rendering it more similar.
The problem of the common cause is then solved by way of a simple parallel. It might appear that there is counterfactual dependence between the effects of a common cause: between barometer falling and storm, for example. Not so. If the barometer had not fallen, the air pressure, which fell earlier, would remain fallen; and the storm would have occurred anyway. If the barometer had not fallen, that would be because some tiny little “miracle” would have occurred shortly beforehand (even Lewis’s account requires at least this tiny bit of backtracking, and he is open about that.) This would lead to its not falling when it should. In a nutshell, if the barometer had not fallen, it would have been broken.
Put that way, the position does not sound so attractive; on the contrary, it sounds somewhat artificial. Indeed, this argument, and Lewis’s theory of causation as a whole, depend heavily on a semantics for counterfactuals according to which the closest world at which the antecedent is true determines the truth of the counterfactual. If the consequent is true at that world, the counterfactual is true; otherwise, not. (Where the antecedent is false, we have vacuous truth.) This semantics is complex and subject to many criticisms, but it is also an enormous intellectual achievement, partly because a theory of causation drops out of it virtually for free, or so it appears when the package is assembled. There is no space here to discuss the details of Lewis’s theory of counterfactuals (for critical discussions see in particular: Bennett, 2001, 2003; Elga, 2000; Hiddleston, 2005), but if we accept that theory, then his solution to the problem of effects follows easily.
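For reference, the truth condition at the heart of that semantics can be stated as follows (this is the version that does not assume a unique closest antecedent-world exists):
\[
A \;\Box\!\!\rightarrow\; C \text{ is true at world } w \iff \text{either there are no } A\text{-worlds, or some } (A \wedge C)\text{-world is closer to } w \text{ than any } (A \wedge \neg C)\text{-world}.
\]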
Lewis deals even more easily with the problems of enormousness and unrepeatability that trouble regularity theories. The problem of enormousness is that, to ensure a truly exceptionless regularity, we must include a very large portion of the universe indeed in the cause (Mill’s doctrine of the “whole cause”). According to Mill, strictly speaking, this is what “cause” means. But according to common usage, it most certainly is not what “cause” means: when I say that the glass of juice quenched my thirst, I am not talking about Jupiter, the Andromeda galaxy, and all the other heavenly bodies exerting forces on the glass, the balance of which was part of the story of the glass’s rising to my lips. I am talking about a glass of juice.
The counterfactual theory deals with this easily. If I had not drunk the juice, my thirst would not have been quenched. This is what it means to say that drinking the juice caused my thirst to be quenched; which is what I mean when I say that it quenched my thirst. There is no enormousness. There are many other causes, because there are many other things that, had they not been so, would have resulted in my thirst not being quenched. But, Lewis says, a multiplicity of causes is no problem; we may have all sorts of pragmatic reasons for singling some out rather than others, but these do not have implications for the underlying concept of cause, nor indeed for the underlying causal facts.
The problem of unrepeatability was that, once causes are inflated to the enormous scale of a whole cause, it becomes incredible that such things recur at all, let alone regularly. Again, there is no problem here: ordinary events like the drinking of juice can easily recur.
While later subsubsections discuss the problems that have proved less tractable for counterfactual theories, we should firstly note that even if we set aside criticisms of Lewis’s theory of counterfactuals, his solution to the problem of the common cause is far less plausible on its own terms than Lewis and his commentators appear to have appreciated. It is at least reasonable to suggest that we use barometers precisely because they track the truth of what they predict (Lipton, 2000). It does not seem wild to think that if the barometer had not fallen, the storm would not after all have been going to occur: more naturally, the storm would not after all have been impending. Lewis’s theory implies that in the nearest worlds where the barometer does not fall, my picnic plans would have been rained out. If I believed that, I would immediately seek a better barometer.
Empirical evidence suggests that there is a strong tendency for this kind of reasoning in situations where causes and their multiple effects are suitably tightly connected (Rips, 2010). Consider a mechanic wondering why the car will not start. He tests the lights which also do not work. So he infers that it is probably the battery. It is. But in Lewis’s closest world where the lights do work, the battery is still flat: an outrageous suggestion for both the mechanic and any reasonable similarity-based semantics of counterfactuals (for another instance of this objection see Hausman, 1998). Or, if not, then he must accept that the lights’ not lighting causes the car not to start (or vice versa). Philosophers are not usually very practical and sometimes this shows up; perhaps causation is a particularly high-risk area in this regard, given its practical utility.
ii. The Problem of Causal Redundancy
If Assassin 1 had not shot (or had missed), then the president still would (or might) have died, because Assassin 2 also shot. Recall that two important kinds of redundancy can be distinguished (as discussed in the subsubsection on the problem of overdetermination above). One is symmetric overdetermination, where the two bullets enter the heart at the same time. Lewis says that in this case our causal intuitions are pretty hazy (Lewis, 1986). That seems right; imagine a firing squad: what would we say about the status of Soldier 1’s bullet, Soldier 2’s bullet, Soldier 3’s, and so on, when they are all causally sufficient but none of them causally necessary? We would probably want to say that it was the whole firing squad that was the cause of the convict’s death. And that, says Lewis, is what we should say in those residual overdetermination cases that cannot be dealt with in other ways. Assassin 1 and Assassin 2 are together the cause. The causal event is the conjunction of these two events. Had that event not occurred, the effect would not have occurred. Lewis runs into some trouble with the point that the negation of a conjunction is achieved by negating just one of its conjuncts, so that Assassin 1’s not shooting is enough to render the conjunctive event absent, even if Assassin 2 had still shot and the president would still have died. Lewis’s response is that we have to remove the whole event when we are assessing the relevant counterfactuals.
This starts to look less than elegant; it lacks the conviction and sense of insight that characterize Lewis’s bolder propositions. However, our causal intuitions are so unclear that we should take the attitude that spoils go to the victor (meaning, the account that has solved all the cases where our intuitions are clear). Even if this solution to symmetric overdetermination is imperfect, which Lewis does not admit, the unclarity of our intuitions would mean that there is no basis to contest the account that is victorious in other areas.
Preemption is the other central kind of causal redundancy, and it has proved a persistent problem for counterfactual approaches to causation. It cannot be set aside as a “funny case” in the way of symmetric overdetermination, because we do have clear ideas about the relevant causal facts, but they do not play nicely with counterfactual analysis. Assassins 1 and 2 may be having a competition as to who can chalk up more “kills”, in which case they will be deeply committed to the truth of the claim that the preempting bullet really did cause the death, despite the subsequent thudding of the loser’s bullet into the presidential corpse. A second later or a day later—it would not matter from their perspective.
Lewis’s attempted solution to the problem of preemption seeks, once again, to apply features of his semantics for counterfactuals. The two features applied are non-transitivity and, once again, non-backtracking.
Counterfactuals are unlike indicative conditionals in not being transitive (Lewis, 1973b, 1973c). For indicatives, the pattern if A then B, if B then C, therefore if A then C is valid. But not so for counterfactuals. If Bill had not gone to Cambridge (B), he would have gone to Oxford (C); and if Bill had been a chimpanzee (A), he would not have gone to Cambridge (B). If counterfactuals were transitive, it could be concluded that, if Bill had been a chimpanzee (A), he would have gone to Oxford (C). The premises are plausible but the conclusion is absurd, and the moral usually drawn is that transitivity fails for counterfactuals.
Lewis thus suggests that causation consists in a chain of counterfactual dependence, rather than a single counterfactual. Suppose we have a cause c and an effect e, connected by a chain of intermediate events d1, … dn. Lewis says: it can be false that if c had not occurred then e would not have occurred, yet true that c causes e, provided that there are events d1, … dn such that if c had not occurred then d1 would not have occurred, and… if dn (henceforth dn is simply called d for readability) had not occurred, then e would not have occurred.
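Schematically, and as a compressed rendering rather than Lewis’s exact formalism (Lewis, 1973a), the proposal can be put as follows, writing □→ for the counterfactual conditional:

```latex
\[
e \text{ depends causally on } c
\;\iff\;
(c \mathrel{\Box\!\!\rightarrow} e) \;\wedge\; (\neg c \mathrel{\Box\!\!\rightarrow} \neg e)
\]
\[
c \text{ causes } e
\;\iff\;
\exists\, d_1, \dots, d_n:\;
d_1 \text{ depends on } c,\;
d_{i+1} \text{ depends on } d_i,\;
e \text{ depends on } d_n
\]
```

When c and e both actually occur, the first conjunct is automatically true, so dependence reduces to the second. And since causation is defined as the ancestral of dependence, it comes out transitive even though dependence itself is not.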
This is one step of the solution, because it provides for the effect to fail to counterfactually depend upon Assassin 1’s shot, yet Assassin 1’s shot to still be the cause. Provides for, but does not establish. The obvious remaining task is to establish that there is a chain of true counterfactuals from Assassin 1’s shot to the president’s death—and, if there is, that there is not also a chain from Assassin 2’s shot.
This is where the second deployment of a resource from Lewis’s semantics for counterfactuals comes into play (and this is sometimes omitted from explanations of how Lewis’s solution to preemption is supposed to work). His idea is that, at the time of the final event in the actual causal chain, d, the would-be backup causal chain has already been terminated, thanks to something in the actual causal chain. Its counterpart in the backup chain, d*, has already failed to happen, so to speak: its time has passed. So “~d → ~e” is true, because d* would still not occur in the absence of d: ~d-worlds where d* occurs are more distant than ~d-worlds where, as in actuality, d* does not occur.
This solution may work for some cases; these have become known as early preemption cases. But it does not work for others, which have become known as late preemption cases. Consider the moment when Assassin 1’s bullet strikes the president, and suppose that this is the last event, d, in the causal chain from Assassin 1’s shot c to the president’s death e. Then ask what would have happened if this event had not happened—by a small miracle the bullet deviated at the last moment, for example. At that moment, Assassin 2’s bullet was speeding on its lethal path towards the president. On Lewis’s view, after the small miracle by which Assassin 1’s bullet does not strike (after ~d), the world continues to evolve according to the actual laws of nature. So Assassin 2’s bullet strikes the president a moment later, killing him. The death therefore does not counterfactually depend on d; the chain of dependence is broken, and Lewis’s analysis wrongly denies that Assassin 1’s shot caused the death.
Various solutions have been tried. We might specify the president’s death very precisely, as the death that occurred just then, a moment earlier than the death that would have occurred had Assassin 2’s bullet struck; and the angle of the bullet would have been a bit different; and so forth. In short: that death would not have occurred, but for Assassin 1, even if some other, similar death, involving the same person and a similar cause, would have occurred in its place. But Lewis himself provides a compelling response, which is simply that this is not at all what phrases like “the president died” or “the death of the president” refer to when we use them in a causal statement. Events may be more or less tightly specified, and there can be a distinction drawn between actual and counterfactual deaths, tightly specified. But that is not the tightness of specification we actually use in this causal judgement, as in many others.
A related idea is to accept that the event of the president’s death is the same in the actual and counterfactual cases, but to appeal to small differences in the actual effect that would have happened if the actual cause had been a bit different. In very close worlds, where Assassin 1 shot just a moment earlier or later, but still soon enough to beat Assassin 2, or where a fly in the bullet’s path caused just a minuscule deviation, the death would have been just minutely different. It still counts as the same death-event, but with slightly different properties. Lewis calls this counterfactual co-variance of event properties influence, and he suggests that a chain of influence connects cause and effect, but not preempted cause and effect (Lewis, 2004a).
However, there even seem to be cases where influence fails, notably the trumping cases pressed in particular by Jonathan Schaffer (2004). Merlin casts a spell to turn the prince into a frog at the stroke of midnight. Morgana casts the same spell, but at a later point in the day. It is the way of magic, suppose, that the first spell cast is the one that operates; had Morgana cast a spell to turn the prince into a toad instead, the prince would nevertheless have turned into a frog, because Merlin’s earlier spell takes priority. Yet she in fact specified a frog. If Merlin had not cast his spell, the prince would still have turned into a frog—and there would have been no difference at all in the effect. There is no chain of influence.
We do not have to appeal to magic for such instances. I push a button to call an elevator, and the button duly illuminates; even so, an impatient or unobservant person arriving directly after me pushes it again. The elevator arrives. It does so in just the same way and in just the same time as if I had not pushed the button, or had pushed it just a tiny moment earlier or later, more or less forcefully, and so forth. In today’s world, where magic is rarely observed, the electrical mediation of causes and effects is a fruitful hunting ground for cases of trumping.
There is a large literature on preemption, because the generally accepted conclusion is that, despite Lewis’s extraordinary ingenuity, the counterfactual analysis of causation cannot be completed. Many philosophers are still attracted to a counterfactual approach: indeed it is an active area of research outside philosophy, as well as in interdisciplinary work, offering as it does a framework for technical development and thus for operationalization in the business of inferring causes. But for analyzing causation—for providing a semantic analysis, for saying what “causation” means—there is general acceptance that some further resource is needed. Counterfactuals are clearly related to causation in a tight way, but the nature of that connection still appears frustratingly elusive.
iii. A New Problem: Causal Transitivity
Considerably more could be said about the counterfactual analysis of causation; it dominated philosophical attention for decades, drawing more attention than any other approach after superseding the regularity theories in the 1970s. Since discussions of preemption dried up, attention has shifted to other components of the view, including the apparently less controversial claim that counterfactual dependence is sufficient for causation. One such issue is briefly introduced here: transitivity.
In Lewis’s account, and more broadly, causation is often supposed to be transitive, even though counterfactual dependence is not. This is central to Lewis’s response to the problem of preemption. It also seems to tie in with the “non-discriminatory” notion of cause, according to which my grandmother’s birth is among the causes, strictly speaking, of my writing these words, even if we rarely mention it.
To say that a relation R is transitive is to say that if R(x,y) and R(y,z) then R(x,z). There seem to be cases showing that causation is not like this after all. Hiker sees a boulder bounding down the slope towards him, ducks, and survives. Had the boulder not bounded, he would not have ducked, and had he not ducked, he would have died. There is a chain of counterfactual dependence, and indeed a chain of causation. But there is not an overarching causal relation. The bounding boulder did not cause Hiker’s survival.
Cases of this kind, known as double prevention, have provoked various solutions, not all of which involve an attempt to “fix” the Lewisian approach. Ned Hall suggests that there are two concepts of causation, which conflict in cases like this (Hall, 2004). Alex Broadbent suggests that permitting backtracking counterfactuals in limited contexts allows one to introduce, as a necessary condition on causation, the dependence of the cause on the effect, a condition that cases of this kind fail to satisfy (Broadbent, 2012). But the significance of such cases remains unclear.
c. Interventionism
There is a very wide range of other approaches to the analysis of causation, given the apparent dead ends that the big ideas of regularity and counterfactual dependence have reached. Some develop the idea of counterfactual dependence, but shift the approach from conceptual analysis to something less purely conceptual, more closely related to causal reasoning, in everyday and scientific contexts, and perhaps more focused on investigating and understanding causation than producing a neat and complete theory. Interventionism is the most well-known of these approaches.
Interventionism starts with the idea that causation is fundamentally connected to agency: to the fact that we are agents who make decisions and do things in order to bring about the goals we have decided upon. We intervene in the world in order to make things happen. James Woodward sets out to remove the anthropocentric component of this observation, to devise a characterization of interventions in broadly objective terms, and to use this as the basis for an account of how causal reasoning works—meaning, how it manages to track how the world works, and thus enables us to make things happen (Woodward, 2003, 2006).
Woodward’s interests are thus focused on causal explanation in particular, trying to answer the questions of what causal explanations amount to, what information they carry, and what they mean. The notion of explanation he arrives at is analyzed and unpacked in detail. The target of analysis shifts from “c causes e” not merely to “c explains e” (the target of much previous work in the philosophy of explanation), but to a full paragraph explaining, in terms of the ideal gas law and the kinetic theory of heat, why and how the temperature in a container increases when the volume is reduced.
Interventionism offers a different approach to thinking about causation, and perhaps the most difficult thing for someone approaching it from the perspective of the Western philosophical canon is to work out what exactly it achieves, or aims to achieve. It does not tell us precisely what causation itself is. It may help us understand causation; but if it does, the upshot falls short of a complete theory, being either a series of interesting observations, akin to those of J. L. Austin and the ordinary language philosophers, or an operationalizable concept of causation, one that might be converted into a fully automatic causal reasoning “module” to be implemented in a robot. The latter appears to be the goal of some in the non-philosophical world, such as Judea Pearl. Such efforts are ambitious and interesting, potentially illuminating the nature of causal inference, even if this potential is yet to be fully realized, and even if their significance remains in question so long as implementation remains hard to conceive.
Perhaps what interventionist frameworks offer is a language for talking about causation more precisely. So it is with Pearl, who is also a kind of interventionist, holding that causal facts can be formally represented in diagrams called Directed Acyclic Graphs (DAGs) displaying counterfactual dependencies between variables (Pearl, 2009; Pearl & Mackenzie, 2018). These counterfactual dependencies are assessed against what would happen if there were an intervention, a “surgical”, hypothetical one, to alter the value of only a (or some) specified variable(s). Formulating causal hypotheses in this way is meant to yield mathematical tools for analyzing empirical data, and such tools have indeed been developed by some, notably in epidemiology. In epidemiology, the Potential Outcomes Approach, which is a form of interventionism and a relative of Woodward’s philosophical account, attracts a devoted following. The primary insistence of its followers is on the precise formulation of causal hypotheses using the language of interventions (Hernán, 2005, 2016; Hernán & Taubman, 2008), which is a little ironic, given that a basis for Woodward’s philosophical interventionism was the idea of moving away from the task of strictly defining causation. The Potential Outcomes Approach is a topic of intense debate in epidemiology (Blakely, 2016; Broadbent, 2019; Broadbent, Vandenbroucke, & Pearce, 2016; Krieger & Davey Smith, 2016; Vandenbroucke, Broadbent, & Pearce, 2016; VanderWeele, 2016), and its track record of actual discoveries remains limited; its main successes have been in re-analyzing old data which was wrongly interpreted at the time, but where the mistake is either already known or no longer matters.
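To illustrate the idea of a “surgical” intervention, here is a minimal sketch in Python. It is not Pearl’s own software, and all the variable names and probabilities are invented for the purpose: a three-variable model in which falling air pressure causes both a barometer reading and a storm. Conditioning on the barometer changes the probability of the storm; intervening on the barometer, that is, overriding its mechanism while leaving the rest of the model untouched, does not.

```python
import random

def sample(do_barometer=None):
    """Draw one world from the toy model; optionally force the barometer's value."""
    air_pressure_falls = random.random() < 0.3           # exogenous cause
    if do_barometer is None:
        barometer_falls = air_pressure_falls             # mechanism: tracks the pressure
    else:
        barometer_falls = do_barometer                   # do(): mechanism overridden
    storm = air_pressure_falls and random.random() < 0.9 # storms follow pressure drops
    return barometer_falls, storm

random.seed(0)

# Conditioning: among observed worlds where the barometer falls, storms are common.
observed = [sample() for _ in range(100_000)]
falling = [storm for barometer, storm in observed if barometer]
print(sum(falling) / len(falling))                       # ~0.90

# Intervening: forcing the barometer leaves the storm probability at its baseline.
intervened = [sample(do_barometer=True) for _ in range(100_000)]
print(sum(storm for _, storm in intervened) / len(intervened))  # ~0.27 (= 0.3 * 0.9)
```

The contrast between the two printed numbers is the formal content of the slogan that causes support interventions while mere correlates do not.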
If this sounds confusing, that is because it is. This is a very vibrant area of research. Those interested in interventionism are strongly advised not to confine themselves to the philosophical literature but to read at least a little of Judea Pearl’s (albeit voluminous) corpus, and to engage with the epidemiological debate on the Potential Outcomes Approach. The field has yet to receive its most concise and conceptually organized formulation, and work towards one is ongoing; but such initial lack of organization is indicative of ongoing development, and a field of that kind is exactly the one in which someone looking to make a mark, or at least a contribution, should take an interest. Once the battle lines are drawn up, and the trenches are dug, the purpose of the entire war is called into question.
d. Probabilistic Theories
Probabilistic theories (for example: Eells, 1991; Salmon, 1993; Suppes, 1970) start with the idea that causes raise the probability of their effects. Striking a match may not always be followed by its lighting, but certainly makes it more likely; whereas coincidental antecedents, such as my scratching my nose, do not.
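In Suppes’s formulation, roughly, the basic notion is that of a prima facie cause (Suppes, 1970), which can be rendered schematically as:

```latex
\[
C \text{ is a \emph{prima facie} cause of } E
\;\iff\;
C \text{ precedes } E,\quad P(C) > 0,\quad \text{and}\quad P(E \mid C) > P(E).
\]
```

The further conditions of the theory are then designed to weed out the spurious prima facie causes, such as the falling barometer considered below.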
Probabilistic theories originate in part as an attempt to soften the excesses of regularity theories, given the absence of observable exceptionless regularities. More importantly, however, they are motivated by the observation that the world itself may be fundamentally indeterministic, if quantum physics withstands the test of time. A probabilistic theory can cope with a deterministic world as well as an indeterministic one; a regularity theory can cope only with the former. Moreover, given the shift in odds towards an indeterministic universe, the fights about regularity start to look scholastic, concerning the finer details of a superstructure whose foundations, never before critically examined, have crumbled upon exposure to the fresh air of empirical science.
Probabilistic approaches may be combined with other accounts, such as agency approaches (Price, 1991). Alternatively, probability may be taken as the primary analytical tool, and this approach has given rise to its own literature on probabilistic theories.
The first move of a probabilistic theory is to deal with the problem that effects raise the probability of other effects of a shared cause. To do so, the notion of screening off is introduced (Suppes, 1970). A cause has many effects, and conditionalizing on the cause alters their probabilities even when we hold the other effects fixed; but not so when we conditionalize on an effect. The probability of the storm occurring, given that the air pressure falls, is higher than the probability given that it does not fall, even if we hold fixed the behavior of the barometer. But if we hold fixed the behavior of the air pressure (at, say, 1 atmosphere, as in actuality) while conditionalizing on the barometer, we see no difference between the probability of the storm in case the barometer falls and in case it does not.
To unpack this a bit, consider all the occasions on which air pressure has fallen, all those on which barometers have fallen, and all those on which storms have occurred (and barometers have been present). The problem could then be stated like this. When air pressure falls, storm occurrences are very much more common than when it does not. Moreover, storm occurrences are very much more common in cases where barometers have fallen than in cases where they (have been present but) have not. Thus it appears that both air pressure and barometers cause storms. But do they really? Or is one of these a case of spurious causation?
The screening-off solution says you should proceed as follows. First, consider how things look when you hold the barometer’s behavior fixed. In cases where the barometer does not fall but air pressure does, storm occurrences are more frequent than in cases where neither the barometer nor the air pressure falls; likewise among cases where the barometer does fall. Now hold fixed the behavior of the air pressure, considering first those cases where air pressure does not fall but barometers do: storms are no more common there than in cases where neither falls. And among cases where air pressure does fall, storms are no more common in cases where barometers also fall than in cases where they do not.
Thus, air pressure screens off the barometer falling from the storm. Once you settle on the behavior of the air pressure, and look only at cases where the air pressure behaves in a certain way, the behavior of the barometer is irrelevant to how commonly you find storms. On the other hand, if you settle on a certain barometer behavior, the status of the air pressure remains relevant to how commonly you encounter storms.
This asymmetry determines the direction of causation. Effects raise the probability of their causes, and indeed of other effects—that is why we can perform causal inference, and can infer the impending storm from the falling barometer. But causes “screen off” their effects from each other, while effects do not: the probability of the storm stops tracking the behavior of the barometer as soon as we fix the air pressure, which screens the storm from the barometer; whereas the probability of the storm continues to track the air pressure even when we fix the barometer (and likewise for the barometer when we fix the storm).
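The asymmetry can be checked numerically. The following sketch assumes the common-cause structure just described (air pressure A causing both barometer B and storm S), with made-up probabilities; all the names and numbers are illustrative only.

```python
# Screening off, numerically: a common cause A (air pressure falls) with two
# effects, B (barometer falls) and S (storm). The joint distribution
# factorizes as P(A) * P(B|A) * P(S|A), so B and S are independent given A.
from itertools import product

P_A = 0.3                                   # P(air pressure falls)
P_B_GIVEN_A = {True: 0.95, False: 0.05}     # barometer tracks pressure, imperfectly
P_S_GIVEN_A = {True: 0.80, False: 0.02}     # storms mostly follow pressure drops

def joint(a, b, s):
    """P(A=a, B=b, S=s) under the common-cause structure."""
    pa = P_A if a else 1 - P_A
    pb = P_B_GIVEN_A[a] if b else 1 - P_B_GIVEN_A[a]
    ps = P_S_GIVEN_A[a] if s else 1 - P_S_GIVEN_A[a]
    return pa * pb * ps

def p_storm_given(**fixed):
    """P(S=True | the stated values of A and/or B), by direct summation."""
    def match(a, b):
        return all({"A": a, "B": b}[k] == v for k, v in fixed.items())
    num = sum(joint(a, b, True) for a, b in product([True, False], repeat=2) if match(a, b))
    den = sum(joint(a, b, s) for a, b, s in product([True, False], repeat=3) if match(a, b))
    return num / den

# A screens off B from S: with A held fixed, B makes no difference.
print(p_storm_given(A=True, B=True), p_storm_given(A=True, B=False))   # 0.80 0.80
# B does not screen off A from S: with B held fixed, A still matters.
print(p_storm_given(B=True, A=True), p_storm_given(B=True, A=False))   # 0.80 0.02
```

With A left free, conditioning on B does raise the probability of S (that is the inference from barometer to storm); the point is that this dependence vanishes once A is held fixed, and only in that direction.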
One major source of doubt about probabilistic theories is simply that probability and causation are different things (Gillies, 2000; Hesslow, 1976; Hitchcock, 2010). Causes may indeed raise probabilities of effects, but that is because causes make things happen, not because making things happen and raising their probabilities are the same thing. This general objection may be motivated by various counterexamples, of which perhaps the most important are chance-lowering causes.
Chance-lowering causes reduce the probability of their effects, but nonetheless cause them (Dowe & Noordhof, 2004; Hitchcock, 2004). Taking birth control pills reduces the probability of pregnancy. But it is not always a cause of non-pregnancy. Suppose that, as it happens, reproductive cycles are the cause. Or suppose that there is an illness causing the lack of pregnancy. Or suppose a man takes the pills. In such cases, provided the probability of pregnancy is not already zero, the pill may reduce the probability of pregnancy (albeit slightly), while the cause may be something else. In another well-worn example, a golfer slices a ball which veers off the course, strikes a tree, and bounces in for a hole in one. Slicing the ball lowered the probability of a hole in one but nonetheless caused it. Many attempts to deal with chance-lowering causes have been made, but none has secured general acceptance.
5. Ontological Stances
Ontological questions concern the nature of causation, meaning, in a phrase that is perhaps equally obscure, the kind of thing it is. Typically, ontological views of causation seek not only to explain the ontological status of causation for its own sake, but to incorporate causation into a favored ontological framework.
There is a methodological risk in starting with, for example, “I’m a realist…” and then looking for a way to make sense of causation from this perspective. The risk is similar to that of a scientist who begins committed to a hypothesis and looks for a way to confirm it. This approach can be useful, leading to ingenuity in the face of discouraging evidence, and has led to some major scientific breakthroughs (such as Newtonian mechanics and germ theory, to take two quite different examples). It does not entail confirmation bias; indeed, the breakthrough cases are characterized by an obsession with the evidence that does not seem to fit, and by dissatisfaction with a weight of extant confirming evidence that might have convinced a lesser investigator. (Darwin’s sleepless nights over the peacock’s tail are an example: the male peacock’s tail is a cumbersome impediment to survival, and Darwin had no rest until he found an explanation in terms of a mechanism differing from straightforward natural selection, namely, sexual selection.) However, in less brilliant hands, setting out to show how your theory can explain the object of investigation carries an obvious risk of confirmation bias; indeed, sometimes it turns the activity into something that does not deserve to be called an investigation at all. Moreover, it can make for frustrating discussions.
One question about “the nature of causation” is whether causation is something that exists over and above particular things that are causally related, in any sense at all. Nominalism says no, realism says yes, and dispositionalism seeks to explain causation by realism about dispositions, which are things that nominalists would not countenance, but that are different from universals (or at least from the necessitation relation that realists endorse). Process theories offer something different again, seeking to identify a basis for causation in our current best science, thus remaining agnostic (within certain bounds) on larger metaphysical matters, and merely denying the need for causal theory to engage metaphysical resources (as do causal realism and dispositionalism) or to commit to a daunting reductive project (as does nominalism).
a. Nominalism
Nominalists believe that there is nothing (or very little) other than what Lewis calls “distinct existences” (Lewis, 1983, 1986). According to nominalism, causation is obviously not a particular thing because it recurs. So it is not a thing at all, existing over and above its particular instances.
The motivation for nominalism is the same as the motivation for regularity theories, that is, David Hume’s skeptical attack on necessary connection. The nominalist project is to show that sense can be made of causation, and knowledge of it obtained, without this notion. Ultimately, the goal is to show that (or at least to show to what extent) the knowledge that depends on causal knowledge is warranted.
Nominalism thus depends fundamentally on the success of the semantic project, which is discussed in the previous section. Attacks on those projects amount to attacks on, or challenges for, nominalism. They are not rehearsed here. The remainder of this section considers alternatives to nominalism.
b. Realism
Realists believe that there are real things, usually called universals, that exist in abstraction from particulars. Nominalists deny this. The debate is one of the most ancient in philosophy and this article is not the place to introduce it. Here, the topic is realism and nominalism about causation.
Realists believe that there is something often called the necessitation relation which holds between causes and effects, but not between non-causal pairs. Nominalists think that there is no such thing, but that causation is just some sort of pattern among causes and effects, for instance, that causes are always followed by their effects, distinguishing them from mere coincidences (see the subsection on regularity theories).
Before continuing, a note on the various meanings of “realism” is necessary. It is important not to confuse realism about causation (and, similarly, about laws of nature) with metaphysical realism. To be realist about something is to assert its mind-independent existence. In the case of universals, the debate is about whether they exist apart from particulars; the emphasis is on existence. In debates about metaphysical realism, the emphasis is on mind-independence. The latter is contrasted with relativist positions such as epistemic relativism, according to which there are no facts independent of a knower (Bloor, 1991, 2008), and Quine’s ontological relativity, according to which reference is relative to a frame of reference, itself best understood as being, or as arising from, a conceptual framework (Quine, 1969).
Nominalists may or may not be metaphysical anti-realists of one or another kind. In fact, unlike Quine (a nominalist, that is, an anti-realist about universals, and also a metaphysical anti-realist), the most prominent proponents of nominalism about causation (which is a kind of causal anti-realism) are metaphysical realists. For instance, the nominalist David Lewis believes that there is nothing (or relatively little) other than what he calls distinct existences, but he is realist about these existences (Lewis, 1984). In this area of the debate about causation, however, broad metaphysical realism is a generally accepted background assumption. The question is then whether causation is to be understood as some pattern of distinct existences, whether actual or counterfactual, or whether on the contrary it is to be understood as a universal: the “necessitation relation”.
The classic statements of realism about causation are by David Armstrong and Michael Tooley (Heathcote & Armstrong, 1991; Tooley, 1987). These also concern laws of nature, which, on their accounts, underlie causal relations. The core of such accounts of laws and causation is the postulation of a kind of necessity that is not logical necessity. In other words, they refuse to accept Hume’s skeptical arguments about the unintelligibility or unknowability of non-logical necessity (which are presented in the subsection on regularity theories). On Armstrong’s view, there is a second-order universal he calls the necessitation relation, which relates first-order universals: ordinary properties and relations such as being a massive object or having a certain velocity relative to a given frame of reference. If it is a law that sodium burns with a yellow flame, that means that the necessitation relation holds between the universals (or complexes of them) denoted by the predicates “is sodium” and “burns with a yellow flame”. Being sodium and burning necessitate a yellow flame.
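The view is often summarized schematically: where N is the second-order necessitation relation and F and G are first-order universals,

```latex
\[
N(F, G) \;\Rightarrow\; \forall x\,(Fx \supset Gx),
\]
```

so that the law’s holding entails the corresponding regularity, though not conversely. The notorious difficulty, pressed in the passage from Lewis quoted below, is to say in virtue of what the left-hand side entails the right.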
Causal relations derive from the laws. The burning sodium causes there to be a yellow flame, because of the necessitation relation that holds between the universals. Where there is sodium, and it burns, there must be a yellow flame. The kind of necessity is not logical, and nor is it strictly exceptionless. But there is a kind of necessity, nonetheless.
How, exactly, are the laws meant to underlie causal relations? Michael Tooley considers the relation between causation and laws, on the realist account of both, in detail (Tooley, 1987). But even if it can be answered, the most obvious question for realism about universals is what exactly they are (Heathcote & Armstrong, 1991).
For the realist account of causation, saying what universals are is particularly important. That is because the necessitation relation seems somewhat unlike other universals. Second order universals such as, for example, shape, of which particular shapes partake, are reasonably intelligible. I have a grasp on what shape is, even if I struggle to say what it is apart from giving examples of actual shapes. At least, I think I know what “shape” means. But I do not know what “necessitates” means. David Lewis puts the point in the following oft-cited passage:
The mystery is somewhat hidden by Armstrong’s terminology. He uses ‘necessitates’ as a name for the lawmaking universal N; and who would be surprised to hear that if F ‘necessitates’ G and a has F, then a must have G? But I say that N deserves the name of ‘necessitation’ only if, somehow, it really can enter into the requisite necessary connections. It can’t enter into them just by bearing a name, any more than one can have mighty biceps just by being called ‘Armstrong’. (Lewis, 1983, p. 366)
Does realism help with the problems that nominalist semantic theories encounter? One advantage of realism is that it makes semantics easy. Causal statements are made true by the obtaining, or not, of the necessitation relation between cause and effect. This relation holds between the common cause of two effects, but not between the effects; between the preempting, but not the preempted, cause and the effect. Classic problems evaporate; they are an artefact of the need arising from nominalism to analyze causation in terms of distinct events, a project that realists are too wise to entertain.
But that may, in a way, seem like cheating, for it hardly sounds any different from the pre-theoretic statement that causes cause their effects, while effects of a common cause do not cause each other, and that preempted events are not causes.
One way to press this objection is to look at whether realism assists people who face causal problems outside of philosophical debate. When people other than metaphysicians encounter difficulties with causation, they do not typically find themselves assisted by the notion of a relation of necessitation. Lawyers may apply a counterfactual “but for” test: but for the defendant’s wrongful act, would the harm have occurred? In doing so, they are not adducing more empirical evidence, but offering a different way to approach, analyze, or think through the evidence. They do not, however, find it useful to ask whether the defendant’s wrongful act necessitated the harm. In cases where the “but for” test fails, other options have been tried, including asking whether the wrongful act made the harm more probable; and scientific evidence is sometimes adduced to confirm that there is, in general, a possibility that the wrongful act could have caused the harm. But lawyers never ask anything like: did the defendant’s act necessitate the harm? Not only would this seem far too strong for any prosecution in its right mind to introduce; it would also not seem particularly helpful. The judge would almost certainly want help in understanding “necessitate”, which in non-obvious cases is as obscure and philosophical-sounding as “cause”, and then we would be back with the various legal “tests” that have been constructed.
The realist might reply that metaphysics is not noted for its practical utility, and that the underlying metaphysical explanation for regularities and counterfactuals is the existence of a necessitation relation. Fair enough, but it is interesting that offering counterfactuals or probabilities in place of causal terms is thought to elucidate them, and that there is not a further request to elucidate the counterfactuals or probabilities; whereas there would be a further request (presumably) to explicate necessitation. Realists seem to differ not just from nominalists but from everyone else in seeing universals as explaining all these things, while not seeing any need for further explication of universals.
c. Dispositionalism
Dispositionalism is a more recently explored view, aiming to take a different tack from both nominalism and realism (Mumford & Anjum, 2011). On this view, dispositions are fundamental constituents of reality (Mumford, 1998). Counterfactuals are to be understood in terms of dispositions, and not the other way round (Bird, 2007). Causation may also be explained in this way, without dog-legging through counterfactuals, which averts the problems attendant on counterfactual analyses of causation.
To cause an effect is, in essence, to dispose the effect to happen. Effects do not have to happen. But causes dispose them to. This is how their probabilities are raised. This is why, had the cause not occurred, the effect would not have occurred.
The literature on dispositionalism is relatively new, developing in the 21st century, with the position receiving a book-length formulation only in the 2010s (see Mumford & Anjum, 2011). Interested readers are invited to consult that work, which offers a much more useful introduction to the subtleties of this new approach than can be attempted here.
d. Process Theories
A further approach which has developed an interesting literature but which is not treated in detail in this article is the process approach. Wesley Salmon suggested that causation be identified with some physical quantity or property, which he characterized as the transmission of a “mark” from cause to effect (Salmon, 1998). This idea was critiqued and then developed by Phil Dowe, who suggested that the transmission of energy should be identified as the underlying physical quantity (Dowe, 2000). Dowe’s approach has the merits of freeing itself from the restrictions of conceptual analysis, while at the same time solving some familiar problems. Effects of a common cause transmit no energy to each other. Preempted events transmit no energy to the effects of the preempting causes, which, on the other hand, do so.
The attraction of substituting a scientific concept, or a bundle of concepts, for causation is obvious. Such treatments have proved fruitful for other pre-theoretic notions like “energy”, and they offer to fit causation into a scientific worldview in which, arguably (see the subsection on Russellian Republicanism), causation does not otherwise appear.
On the other hand, the account does face objections. Energy is in fact transmitted from Assassin 2’s shot to the president, since light bounces off the speeding bullet and reaches the president faster than the bullet itself. Accounts like Dowe’s must be careful to specify the right physical process in order to remain plausible as accounts of causation, and then to justify the choice of this particular process on some objective, and ultimately scientific, basis. There is also the problem that, in ordinary talk, we often regard absences or lacks as causes. It is my lack of organizational ability that caused me to miss the deadline. Whether absences can cause is a contested topic (Beebee, 2004; Lewis, 2004b; Mellor, 2004), and one reason for this is that they appear to be a problem for this account of causation.
6. Kantian Approaches
a. Kant Himself
Kant responded to Hume by taking further the idea that causation is not part of the objective world (Kant, 1781).
Hume argued that the only thing in the objects was regularity, and that this fell far short of fulfilling any notion of necessary connection. He further argued that our idea of necessary connection was merely a feeling of expectation. But Hume was (arguably) a realist about the world, and about the regularities it contains, even if he doubted our justification for believing in regularities and doubted that causation was anything in the world beyond a feeling we sometimes get.
Kant, however, took a different view of the world itself, of which causation is meant to be a part. His view is transcendental idealism: space and time are ways in which we experience the world, not features of the world itself. According to this view, the world as it is in itself exists but is wholly unknowable. The world constrains what we experience, but what we experience does not tell us about what the world is like in itself, that is, independent of how we experience it.
Within this framework, Kant was an empirical realist. That is to say, given the constraints that the noumenal world imposes on what we experience, there are facts about how the phenomenal world goes. Facts about this world are not simply “up to us”. They are partly determined by the noumenal world. But they are also partly determined by the ways we experience things, and thus we are unable to comprehend those things in themselves, apart from the ways we experience them. A moth bangs into a pane of glass, and cannot simply fly through it; the pane of glass constrains it. But clearly the moth’s perceptual modalities also constrain what kind of thing it takes the pane of glass to be. Otherwise, it would not keep flying into the glass.
Kant argued that causation is not an objective thing, but a feature of our experience. In fact, he argued that causation is essential to any kind of experience. The ordering of events in time only amounts to experience if we can distinguish within the general flow of events or of sensory experiences, some streams that are somehow connected. We see a ship on a river. We look away, and look back a while later, to see the ship further down the river (the example is discussed in the Second Analogy in Kant’s Critique of Pure Reason). Only if we can see this as the same ship, moved further along the river, can we see this as a ship and a river at all. Otherwise it is just a series of frames, no more comprehensible than a row of impressionist paintings in an art gallery.
Kant used causation as the exemplar of a treatment he extended to shape, number, and various other apparent features of reality which, in his view, are actually fundamental elements of the structure of experience. His argument that causation is a necessary component of all experience is no longer compelling. It seems that very young children have experience, but not much by way of a concept of causation. Some animals may be able to reason causally, but some clearly cannot, or at least cannot to any great extent. It is a separate question whether they have experience, and some seem to. Thus he seems to have over-extended his point. On the other hand, the insight that there is a fundamental connection between causation and some aspect of us and our engagement with the world may have something to it, and this has subsequently attracted considerable attention.
b. Agency Views
On agency views of causation, the fact that we are agents is inextricably tied up with the fact that we have a causal concept, think causally, make causal judgements, and understand the world as riddled with causality. Agents have goals, and seek to bring them about, through exercising what at least to them seems like their free will. They make decisions, and they do something about them.
Agency theories have trouble providing answers to certain key questions, and this has rendered them very unpopular. If a cause is a human action, then what of causes that are not human actions, like the rain causing the dam to flood? If such events are causes by analogy with human actions, then troublesome questions arise: in what respect are things like rain analogous to human actions? Did someone or something decide to “do” the rain? If not, then in what does the analogy consist?
The most compelling response to these questions lies in the work of Huw Price, beginning with a paper he co-wrote with Peter Menzies (Menzies & Price, 1993). They argue that causation is (or is like) a secondary property, like color. Light comes in various wavelengths, some of which we can perceive. We are able to differentiate among wavelengths to some level of accuracy. This differentiated perception is what we call “color”. We see color, not “as” wavelengths of light (whatever exactly that would be), but as a property of the things off which light bounces or from which it emanates. Color is thus not just a wavelength of light: it is a disposition that we have to react in a certain way to a wavelength of light; alternatively, it is a disposition of light to provoke a certain reaction in us.
Causation, they suggest, is a bit like this. It has some objective basis in the world, but it also depends on us. It is mediated not by our being conscious beings, as in the case of color, but by our being agents. Certain patterns of events in the world, or at least certain features of the world, produce a “cause-effect” response in us. We cannot just choose what causes what. At the same time, this response is not reducible to features of the world alone; our agency is part of the story.
This approach deals with the anthropomorphism objection by retaining the objective basis of causes and effects, while affirming that the interpretation of this objective basis as causal is contributed by us, because we are agents.
This approach is insufficiently taken up in the literature, and there is no well-developed exchange of objections and responses, beyond the point that the approach remains suggestive and not completely made out. Price has subsequently argued for a perspectivalism about causation, on which entropic or other physical asymmetries account for the asymmetries that we project onto time and counterfactual dependence.
Yet this is a sensible direction of exploration, given our inability to observe causation in objects, and our apparent failure to find an objective substitute. It departs from the kind of realism that is dominant in early twenty-first century philosophy of causation, but perhaps that departure is due.
7. Skepticism
a. Russellian Republicanism
Bertrand Russell famously argued that causation was “a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed to do no harm” (Russell, 1918). He advanced arguments against the Millian regularity view of causation that was dominant at the time, one of which is the unrepeatability objection discussed above in this article. But his fundamental point is a simple one: our theories of the fundamental nature of reality have no place for the notion of cause.
One response is simply to deny this, and to point out that scientists do use causal language all the time. It is however doubtful that this defense deflects the skeptical blow. Whether or not physicists use the word “cause”, there is nothing like causation in the actual theories, which are expressed by equations. As Judea Pearl points out, mathematical equations are symmetrical (Pearl & Mackenzie, 2018). You can rearrange them to solve for different variables. They still say the same thing, in all their arrangements. They express a functional relationship between variables. Causation, on the other hand, is asymmetric: the value of the causal variable(s) sets the value of the effect variable(s). In a mathematical equation, however, “setting” goes in every direction, since any variable can be computed from the others. If one changes the value of the pressure in a fixed mass of gas, then, according to the ideal gas law, either volume or temperature must change (or both). But there is no way to increase the pressure except through adjusting the volume or temperature. The equations do not tell us that.
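To see the symmetry, consider the ideal gas law in three algebraically equivalent arrangements; nothing in the equation itself marks any variable as the one being “set”:

```latex
\[
pV = nRT
\qquad\Longleftrightarrow\qquad
p = \frac{nRT}{V}
\qquad\Longleftrightarrow\qquad
T = \frac{pV}{nR}
\]
```

A causal claim, by contrast, such as “heating the gas raised its pressure”, privileges one of these directions, and that privilege cannot be recovered from the equation alone.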
A better objection might pick up on this response by saying that this example shows that there are causal facts. If physics does not capture them, then it should.
This response is not particularly plausible at a fundamental level, where the prospect of introducing such an ill-defined notion as cause into the already rather strange world of quantum mechanics is not appealing. But it might be implemented through a reductionist strategy. Huw Price offers something like this, suggesting that certain asymmetries, notably the arrow of time, might depend jointly upon our nature as agents and the temporal asymmetry of the universe. Such an approach is compatible with Russell’s central insight but dispenses with his entertaining, if overly enthusiastic, dismissal of the utility of causation. Causation remains useful, despite being non-fundamental; and its usefulness can be explained. This is perhaps the most promising response to Russell’s observation, and one that deserves more attention and development in the literature.
b. Pluralism and Thickism
Pluralists believe that there is no single concept of causation, but a plurality of related concepts which we lump together under the word “causation” (Anscombe, 1971; Cartwright, 1999). This view tends to go hand-in-hand with a refusal to accept a basic premise of Hume’s challenge, which is that we do not observe causation. We do observe causation, say the objectors. We see pushes, kicks, and so forth. Therefore, they ask, in what sense are we not observing causation?
This line of thought is compelling to some but somewhat inscrutable to many, who remain convinced that pushes and kicks look just the same as coincidental sequences like the sun coming out just before the ball enters the goal or the shopping cart moves—until we have learned, from experience, that there is a difference. Thus, most remain convinced that Hume’s challenge needs a fuller answer. Most also agree with Hume that there is something that causes have in common, and that one needs to understand this if one is to distinguish the kicks and pushes of the world from the coincidences.
A related idea, a form of pluralism, one might call thickism. In an ethical context, some have proposed the existence of “thick” ethical concepts characterized by their irreducibility into an evaluative and factual component. (This is a critique of another Humean doctrine, the fact-value distinction.) Thus, generosity is both fundamentally good and fundamentally an act of giving. It is not a subset of acts of giving defined as those which are good; some of these might be rather selfish, but better than nothing; others might be gifts of too small a kind to count as generous; others might be good for other reasons, because they bring comfort rather than because they are generous (bringing a bunch of flowers to a sick person is an act of kindness but not really generosity). Generosity is thick.
The conclusion one might draw from the existence of thick concepts is that there is not (or not necessarily) a single property binding all the thick concepts together, and thus that it is fruitless to try to identify or analyze it. Similar remarks might be applied to causes. Transitive verbs are commonly causal. To push the cart along is not analyzable into, say, to move forward and at the same time to cause the cart to move. One could achieve this by having a companion push the cart when you move forward, and stop when you stop. Pushes (in the transitive sense) are causal, but the causal element cannot be extracted for analysis.
Against this contention is the point that, in a practical context, the extraction of causation seems exactly what is at issue. In the statistics-driven sciences, in law, in policy-decisions, the non-causal facts seem clear, but the causal facts not. The question is exactly whether the non-causal facts are accompanied by causation. There does seem to be an important place in our conceptual framework for a detached concept of cause, because we apply that concept beyond the familiar world of kicks and pushes. As for those familiar causes, the tangling up of a kind of action with a cause hardly shows that there is no distinction between causes and non-causes. If we do not call a push-like action a cause on one occasion (when my friend pushes the trolley according to my movements) while we do on another (when I push the trolley), this could just as easily be taken to show that we need a concept of causation to distinguish pushing from mere moving forward.
8. References and Further Reading
Anscombe, G. E. M. (1958). Modern moral philosophy. Philosophy, 33(124), 1–19.
Anscombe, G. E. M. (1969). Causality and Extensionality. The Journal of Philosophy, 66(6), 152–159.
Anscombe, G. E. M. (1971). Causality and Determination. Cambridge: Cambridge University Press.
Armstrong, D. (1983). What Is a Law of Nature? Cambridge: Cambridge University Press.
Beebee, H. (2004). Causing and Nothingness. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 291–308). Cambridge, Massachusetts: MIT Press.
Bennett, J. (2001). On Forward and Backward Counterfactual Conditionals. In G. Preyer, & F. Siebelt (Eds.), Reality and Humean Supervenience (pp. 177–203). Maryland: Rowman and Littlefield.
Bennett, J. (2003). A Philosophical Guide to Conditionals. Oxford: Oxford University Press.
Bird, A. (2007). Nature’s Metaphysics. Oxford: Oxford University Press.
Blakely, T. (2016). DAGs and the restricted potential outcomes approach are tools, not theories of causation. International Journal of Epidemiology, 45(6), 1835–1837.
Bloor, D. (1991). Knowledge and Social Imagery (2nd ed.). Chicago: University of Chicago Press.
Bloor, D. (2008). Relativism at 30,000 feet. In M. Mazzotti (ed.), Knowledge as Social Order: Rethinking the Sociology of Barry Barnes (pp. 13–34). Aldershot: Ashgate.
Broadbent, A. (2012). Causes of causes. Philosophical Studies, 158(3), 457–476. https://doi.org/10.1007/s11098-010-9683-0
Broadbent, A. (2016). Philosophy for Graduate Students: Core Topics from Metaphysics and Epistemology. Abingdon: Routledge. https://doi.org/10.4324/9781315680422
Broadbent, A. (2019). The C-word, the P-word, and realism in epidemiology. Synthese. https://doi.org/10.1007/s11229-019-02169-x
Broadbent, A., Vandenbroucke, J. P., & Pearce, N. (2016). Response: Formalism or pluralism? A reply to commentaries on “causality and causal inference in epidemiology.” International Journal of Epidemiology, 45(6), 1841–1851. https://doi.org/10.1093/ije/dyw298
Cartwright, N. (1983). Causal Laws and Effective Strategies. Oxford: Clarendon Press.
Cartwright, N. (1999). The Dappled World: A Study of the Boundaries of Science. Cambridge: Cambridge University Press.
Cartwright, N. (2007). Hunting Causes and Using Them: Approaches in Philosophy and Economics. New York: Cambridge University Press.
Dowe, P. (2000). Physical Causation. Cambridge: Cambridge University Press.
Dowe, P., & Noordhof, P. (2004). Cause and Chance: Causation in an Indeterministic World. London: Routledge.
Eells, E. (1991). Probabilistic Causality. Cambridge: Cambridge University Press.
Elga, A. (2000). Statistical Mechanics and the Asymmetry of Counterfactual Dependence. Philosophy of Science (Proceedings), 68(S3), S313–S324.
Forsyth, F. (1971). The Day of the Jackal. London: Hutchinson.
Garrett, D. (2015). Hume’s Theory of Causation. In D. C. Ainslie, & A. Butler (Eds.), The Cambridge Companion to Hume’s Treatise (pp. 69–100). https://doi.org/10.1017/CCO9781139016100.006
Gillies, D. (2000). Philosophical Theories of Probability. London: Routledge.
Hall, N. (2004). Two Concepts of Causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 225–276). Cambridge, Massachusetts: MIT Press.
Hausman, D. (1998). Causal Asymmetries. Cambridge: Cambridge University Press.
Heathcote, A., & Armstrong, D. M. (1991). Causes and Laws. Noûs, 25(1), 63–73. https://doi.org/10.2307/2216093
Hernán, M. A. (2005). Invited Commentary: Hypothetical Interventions to Define Causal Effects—Afterthought or Prerequisite? American Journal of Epidemiology, 162(7), 618–620.
Hernán, M. A. (2016). Does water kill? A call for less casual causal inferences. Annals of Epidemiology, 26(10), 674–680.
Hernán, M. A., & Robins, J. M. (2020). Causal Inference: What If. Retrieved from https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
Hernán, M. A., & Taubman, S. L. (2008). Does obesity shorten life? The importance of well-defined interventions to answer causal questions. International Journal of Obesity, 32, S8–S14.
Hesslow, G. (1976). Two Notes on the Probabilistic Approach to Causality. Philosophy of Science, 43(2), 290–292.
Hiddleston, E. (2005). A Causal Theory of Counterfactuals. Noûs, 39(4), 632–657.
Hitchcock, C. (2004). Routes, processes and chance-lowering causes. In P. Dowe, & P. Noordhof (Eds.), Cause and Chance (pp. 138–151). London: Routledge.
Hitchcock, C. (2010). Probabilistic Causation. Stanford Encyclopedia of Philosophy. Retrieved from https://plato.stanford.edu/archives/fall2010/entries/causation-probabilistic/
Hume, D. (1748). An Enquiry Concerning Human Understanding (1st ed.). London: A. Millar.
Kant, I. (1781). The Critique of Pure Reason (1st ed.).
Krieger, N., & Davey Smith, G. (2016). The ‘tale’ wagged by the DAG: broadening the scope of causal inference and explanation for epidemiology. International Journal of Epidemiology, 45(6), 1787–1808. https://doi.org/10.1093/ije/dyw114
Lewis, D. (1973a). Causation. Journal of Philosophy, 70 (17), 556–567.
Lewis, D. (1973b). Counterfactuals. Cambridge, Massachusetts: Harvard University Press.
Lewis, D. (1973c). Counterfactuals and Comparative Possibility. Journal of Philosophical Logic, 2(4), 418–446.
Lewis, D. (1979). Counterfactual Dependence and Time’s Arrow. Noûs, 13(4), 455–476.
Lewis, D. (1983). New Work for a Theory of Universals. Australasian Journal of Philosophy, 61(4), 343–377.
Lewis, D. (1984). Putnam’s Paradox. Australasian Journal of Philosophy, 62(3), 221–236.
Lewis, D. (1986). Philosophical Papers (vol. II). Oxford: Oxford University Press.
Lewis, D. (2004a). Causation as Influence. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 75–106). Cambridge, Massachusetts: MIT Press.
Lewis, D. (2004b). Void and Object. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 277–290). Cambridge, Massachusetts: MIT Press.
Lipton, P. (2000). Tracking Track Records. Proceedings of the Aristotelian Society ― Supplementary Volume, 74(1), 179–205.
Mackie, J. (1974). The Cement of the Universe. Oxford: Oxford University Press.
Mellor, D. H. (1995). The Facts of Causation. Abingdon: Routledge.
Mellor, D. H. (2004). For Facts As Causes and Effects. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 309–324). Cambridge, Massachusetts: MIT Press.
Menzies, P., & Price, H. (1993). Causation as a Secondary Quality. The British Journal for the Philosophy of Science, 44(2), 187–203.
Mill, J. S. (1882). A System of Logic, Ratiocinative and Inductive (8th ed.). New York and Bombay: Longmans, Green, and Co.
Mumford, S. (1998). Dispositions. Oxford: Oxford University Press.
Mumford, S., & Anjum, R. L. (2011). Getting Causes from Powers. London: Oxford University Press.
Paul, L. A. (2004). Aspect Causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 205–223). Cambridge, Massachusetts: MIT Press.
Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). Cambridge: Cambridge University Press.
Pearl, J., & Mackenzie, D. (2018). The Book of Why. New York: Basic Books.
Price, H. (1991). Agency and Probabilistic Causality. The British Journal for the Philosophy of Science, 42(2), 157–176.
Quine, W. V. (1969). Ontological Relativity and Other Essays. New York: Columbia University Press.
Rips, L. J. (2010). Two Causal Theories of Counterfactual Conditionals. Cognitive Science, 34(2), 175–221. https://doi.org/10.1111/j.1551-6709.2009.01080.x
Rubin, D. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology, 66(5), 688–701.
Russell, B. (1918). On the Notion of Cause. London: Allen and Unwin.
Salmon, W. C. (1993). Probabilistic Causality. In E. Sosa, & M. Tooley (Eds.), Causation (pp. 137-153). Oxford: Oxford University Press.
Salmon, W. C. (1998). Causality and Explanation. Oxford: Oxford University Press.
Schaffer, J. (2004). Trumping Preemption. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 59–74). Cambridge, Massachusetts: MIT Press.
Schaffer, J. (2007). The Metaphysics of Causation. Stanford Encyclopedia of Philosophy. Retrieved from https://plato.stanford.edu/archives/win2007/entries/causation-metaphysics/
Stapleton, J. (2008). Choosing What We Mean by “Causation” in the Law. Missouri Law Review, 73(2), 433–480. Retrieved from https://scholarship.law.missouri.edu/mlr/vol73/iss2/6
Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North-Holland.
Tooley, M. (1987). Causation: A Realist Approach. Oxford: Clarendon Press.
Vandenbroucke, J. P., Broadbent, A., & Pearce, N. (2016). Causality and causal inference in epidemiology: the need for a pluralistic approach. International Journal of Epidemiology, 45(6), 1776–1786. https://doi.org/10.1093/ije/dyv341
VanderWeele, T. J. (2016). Commentary: On Causes, Causal Inference, and Potential Outcomes. International Journal of Epidemiology, 45(6), 1809–1816.
Woodward, J. (2003). Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
Woodward, J. (2006). Sensitive and Insensitive Causation. The Philosophical Review, 115(1), 1–50.
Author Information
Alex Broadbent
Email: abbroadbent@uj.ac.za
University of Johannesburg
Republic of South Africa
Kit Fine (1946—)
Kit Fine is an English philosopher, among the most important of the turn of the millennium. He is perhaps most influential for reinvigorating a neo-Aristotelian turn within contemporary analytic philosophy. Fine’s prolific work is characterized by a unique blend of logical acumen, respect for appearances, ingenious creativity, and originality. His vast corpus is filled with numerous significant contributions to metaphysics, philosophy of language, logic, philosophy of mathematics, and the history of philosophy.
Although Fine is well-known for favoring ideas familiar from the neo-Aristotelian tradition (such as dependence, essence, and hylomorphism), his work is most distinctive for its methodology. Fine’s general view is that metaphysics is not best approached through the study of language. Roughly put, his approach focuses on providing a rigorous account of the apparent phenomena themselves, and not just of how we represent them in language or thought, prior to any attempt to discern the reality underlying them. Furthermore, a strong and ecumenical respect for the intelligible options demands patience with the messy details, even when they resist tidying or systematization. All this leads to a steadfastness in refusing to allow epistemic qualms about how we know what we seem to know to interfere with our attempts to clarify just what it is that we seem to know.
This article surveys the wide variety of Fine’s rich and creative contributions to philosophy, and it conveys what Fine’s distinctive methodology is and how it informs his contributions to philosophy.
1. Life
Fine was born in England on March 26, 1946. He earned a B.A. in Philosophy, Politics, and Economics at the University of Oxford in 1967 and was then appointed to a position at the University of Warwick, where he was mentored by Arthur Prior. Although Fine was never enrolled in a graduate program, his Ph.D. thesis, For Some Proposition and So Many Possible Worlds, was examined and accepted by William Kneale and Dana Scott just two years later.
Since then, Fine has held numerous academic appointments, including at: University of Warwick; St John’s College, University of Oxford; University of Edinburgh; University of California, Irvine; University of Michigan, Ann Arbor; and University of California, Los Angeles. Fine joined New York University’s philosophy department in 1997, where he is now Silver Professor and University Professor of Philosophy and Mathematics. He is currently also a Distinguished Research Professor at the University of Birmingham. Fine has also held visiting positions at: Stanford University; University of Toronto; University of Arizona; Australian National University; University of Melbourne; Princeton University; Harvard University; New York University Abu Dhabi; University of Aberdeen; and All Souls College, University of Oxford.
He has served the profession as an editor or an editorial board member of Synthese; The Journal of Symbolic Logic; Notre Dame Journal of Formal Logic; and Philosophers’ Imprint.
Fine’s contributions to philosophy have been recognized by numerous awards, including a Guggenheim Foundation Fellowship, American Council of Learned Societies Fellowship, Fellow of the American Academy of Arts and Sciences, Fellow at the National Center for the Humanities, Corresponding Fellow at the British Academy, an Anneliese Maier Research Award from the Alexander von Humboldt Foundation, and a Leibowitz Award (with Stephen Yablo).
Fine’s corpus is enormous. By mid-2020 he had published over 130 journal articles and 5 books, with at least half a dozen articles and 8 monographs forthcoming. His work is at once of great breadth and depth, spanning many core areas of philosophy and engaging its topics with great erudition and technical sophistication. His trailblazing work is highly original, rarely concerned with wedging into topical or parochial debates but rather with making novel advances to the field in creative and unexpected ways. This article de-emphasizes his technical contributions and focuses upon his more distinctive or influential work.
2. Fine Philosophy
When engaging with the work of any prolific philosopher exhibiting great breadth and originality, it is tempting to look for some core “philosophical attractors” that animate, unify, or systematize their work. These attractors may then serve as useful aids to understanding their work and highlighting its most distinctive features.
Perhaps the most familiar form a philosophical attractor might take is that of a doctrine. These “doctrinal attractors” are polarized, pulling in some views while repelling others. Their “magnetic” tendencies are what systematize a thinker’s thought. In the history of modern philosophy, two obvious examples are Spinoza and Leibniz. Their commitment to the principle of sufficient reason, the doctrine that everything has a reason or cause, underwrites vast swaths of their respective philosophies (Spinoza 1677; Leibniz 1714). A good example in the twentieth century is David Lewis. One can scarcely imagine understanding Lewis’s philosophy without placing at its core the doctrines of Humean supervenience and modal realism (Lewis 1986).
Another form a philosophical attractor might take is that of a methodology. These methodological attractors are also polarized, but they exert their force less on views and more on which data to respect and which to discard, which distinctions to draw and which to ignore, how weighty certain considerations should be or not, and the like. Hume is an example in the history of modern philosophy. His commitment to respecting only that which makes an observable difference guides much of his philosophy (Hume 1739). Saul Kripke is an example in the twentieth century. One can scarcely imagine understanding his philosophy without placing at its core a respect for common sense and intuitions about what we should say of actual and counterfactual situations (Kripke 1972).
There is no question that Fine is well-known for his association with certain doctrines or topics. These include: actualism, arbitrary objects, essentialism, ground, hylomorphism, modalism, procedural postulationism, semantic relationism, (formerly) supervaluationism, three-dimensionalism, and truthmaker semantics. But as important as these may be to understanding Fine’s work, they do not serve individually or jointly as doctrinal attractors in the way that, for example, Humean supervenience or modal realism did so vividly for Lewis.
Instead, Fine’s work is better understood in terms of a distinctive “Finean” cluster of methodological attractors. Fine himself has not spelled out the details of the cluster explicitly. But some explicit discussion of it can be found in his early work (1982c: §A2). There are also discussions suggestive of the cluster scattered across many of his later works. But perhaps the strongest impression emerges by osmosis from sustained engagement with a range of his work.
The Finean cluster may be roughly summarized by the following methodological “directives”:
Provide a rigorous account of the appearances first before trying to discern the reality underlying them.
Focus on the phenomenon itself and not just how we represent or express it in language or thought.
Respect what’s at issue by not allowing worries about what we can mean to prevent us from accepting the intelligibility of notions that strike us as intelligible.
Be patient with the messy details even when they resist tidying or systematization.
Don’t allow epistemic worries about how we know what we seem to know to interfere with or distract us from clarifying what it is that we seem to know.
Some of these directives interact or overlap. Even so, separating them helps highlight their different emphases. Bearing them in mind both individually and jointly is crucial to understanding Fine’s distinctive approach to the vast array of topics covered in his work.
Sometimes the influence of the directives is rather explicit. For example, the first directive clearly influences Fine’s views on realism and the nature of metaphysics. Implicit in this directive is a distinction between appearance and reality. Fine suggests that each is the focus of its own branch of metaphysics. Naïve metaphysics studies the appearances whereas foundational metaphysics studies their underlying reality. Because we have not yet achieved rigorous clarification of the appearances, Fine believes it would be premature to investigate the reality underlying them.
Other times, however, the directives exert their influence in more implicit ways. To illustrate, consider the first directive’s emphasis on providing a rigorous account of the appearances. Although Fine’s tremendous technical skill is clear in his work in formal logic, it also suffuses his philosophical work. Claims or ideas are often rigorously formalized in appendices or sometimes in the main text. Even when Fine’s prose is informal at the surface, it is evident that his technical acuity and logical rigor support it from beneath.
The second directive is perhaps most evident in Fine’s focus on the phenomena. Even in our post-positivistic times, some philosophers still lose their nerve when attempting to do metaphysics and, instead, retreat to our language or thought about it. An aversion to this is implicit throughout Fine’s work. Sometimes Fine makes his aversion explicit (2003a: 197):
…in this paper…I have been concerned, not with material things themselves, but with our language for talking about material things. I feel somewhat embarrassed about writing such a strongly oriented linguistic paper in connection with a metaphysical topic, since it is my general view that metaphysics is not best approached through the study of language.
Behind Fine’s remarks is a view that the considerations relevant to language often differ from those relevant to its subject matter. Only confusion can result from this sort of mismatch. So Fine’s apology is perhaps best explained by his unapologetic insistence that our interest is in the phenomena. However esoteric or unruly they may be, we should boldly resist swapping them out for the pale shadows they cast in language or thought.
The third directive is implicit in Fine’s frequent objections to various doctrines for not properly respecting the substantiveness, or even intelligibility, of certain positions. To illustrate, Fine defends his famous counterexamples against modal conceptions of essence by applying the third directive (1994b: 5):
Nor is it critical to the example that the reader actually endorse the particular modal and essentialist claims to which I have made appeal. All that is necessary is that he should recognize the intelligibility of a position which makes such claims.
Even if the claims are incorrect, their intelligibility is still enough to establish that there is a genuine non-modal conception of essence. Considerations like these illustrate Fine’s ecumenical approach. But this ecumenicity does not imply that anything goes, as Fine makes clear elsewhere when discussing fundamentality (2013a: 728):
Of course, we do not want to be able to accommodate any old position on what is and is not fundamental. The position should be coherent and it should perhaps have some plausibility. It is hard to say what else might be involved, but what seems clear is that we should not exclude a position simply on the grounds that it does not conform to our theory…
There appears to be a sort of humility driving Fine’s applications of the third directive. Philosophy aspires to the highest standards of clarity, precision, and rigor. This is why philosophical progress is so hard to achieve, and so modest when it is achieved. Thus, at least at this early stage of inquiry, there is a sort of arrogance in justifying one’s disregard for certain positions by appealing to one’s doctrinal commitments. Perhaps this also explains the scarcity of doctrinal attractors in Fine’s work.
The fourth directive often manifests in Fine’s work as an openness—perhaps even a fondness—for drawing many subtle distinctions. To some extent, this is explained by Fine’s keen eye for detail and his respect for nuance. But a deeper rationale derives from an interaction between the first two directives. For if these subtle distinctions belong to the appearances, then we must ultimately expect a rigorous account of the latter to include the former. This helps explain Fine’s patient and sustained interest in these distinctions, even when they resist analysis, raise difficulties of their own, or are just unpopular.
The fifth directive helps explain what might otherwise seem like a curious gap in Fine’s otherwise broad corpus. With only a few exceptions (2005d; 2018a), Fine has written little directly on epistemology. When Fine’s work indirectly engages epistemology, it is often with ambivalence. And epistemic considerations rarely play any serious argumentative role. For example, one scarcely finds him ever justifying a claim by arguing that it would be easier to know than its competitors. Fine’s distance from epistemic concerns does not stem from any disdain for them. It rather stems from the influence of the other directives. It would be premature to attempt to account for our knowledge of the appearances prior to providing a rigorous account of what they are. As Fine has sometimes quipped in conversation, “Metaphysics first, epistemology last”.
3. Metaphysics
Fine is widely regarded as having played a pivotal role in the recent surge of interest in broadly neo-Aristotelian metaphysics. It is, however, not easy to say just what neo-Aristotelian metaphysics is. One might characterize it as a kind of resistance to the “naturalistic” approaches familiar in much of late twentieth-century metaphysics. Granted, it is not straightforward how those approaches fit within the Aristotelian tradition. But the complexities of Aristotle’s own approach to metaphysics and the natural world suggest that any such characterization is, at best, clumsy and oversimplified. Another characterization of neo-Aristotelian metaphysics might associate it with certain distinctive topics, including essence, substance, change, priority, hylomorphism, and the like. Granted, these topics do animate typical examples of neo-Aristotelian metaphysics. But it is also clear that these topics are not its exclusive property. Perhaps the best way to characterize neo-Aristotelian metaphysics is to engage with the metaphysics of one of its most influential popularizers and practitioners in contemporary times: Kit Fine.
What is metaphysics? Fine believes it is the confluence of five features (2011b). First, the subject of metaphysics is the nature of reality. But physics, mathematics, aesthetics, epistemology, and many other areas of inquiry are also concerned with the nature of reality. What distinguishes metaphysics from them is its aim, its methods, its scope, and its concepts. The aim of metaphysics is to provide a foundation for what there is. The method of metaphysics is characteristically a priori. The scope of metaphysics is as general as can be. And the concepts of metaphysics are transparent in the sense that there is no “gap” between the concept itself and what it is about.
The distinction between appearance and reality plays a prominent role in Fine’s conception of metaphysics (1982c: §A2; 2017b). Given such a distinction, one aim of metaphysics is to characterize how things are in reality. In Aristotelian fashion, this begins with the appearances. We start with how things appear, and the task is then to vindicate the appearances as revelatory of the underlying reality, or else to explain away the appearances in terms of some altogether different underlying reality. Both the revelatory and the reductionist projects presuppose the appearances, and so it is vital to get straight on what they are first. Fine calls this project naïve metaphysics. Only once adequate progress has been made on the naïve metaphysics of a subject will we be in a position to consider how it relates to fundamental reality. Fine calls this second project foundational metaphysics. Much of Fine’s work in metaphysics is best regarded as contributing to the naïve metaphysics of various topics (modality, part/whole, persistence) or to clarifying what conceptual tools (essence, reality, ground) will be needed to relate naïve metaphysics to foundational metaphysics. As Fine puts it (2017b: 108):
In my own view, the deliverances of foundational metaphysics should represent the terminus of philosophical enquiry; and it is only once we have a good handle on the corresponding questions within naïve metaphysics, with how things appear, that we are in any position to form an opinion on their reality.
Fine often expresses doubts about whether we have made anywhere near enough progress in naïve metaphysics to embark yet on foundational metaphysics. Because Fine suspects it would be premature to pursue foundational metaphysics at this early (for philosophy!) stage of inquiry, one should resist interpreting his work as pronouncing upon the ultimate nature of reality or the like. These sentiments are microcosmic embodiments of the five directives characterizing Fine’s philosophical approach.
a. Modality
Much of Fine’s earliest work focused on technical questions within formal logic, especially modal logic. But in the late 1970s, Fine’s work began increasingly to consider applications of formal methods—especially the tools of modal logic—to the philosophy of modality. This shift produced a variety of influential contributions.
One of Fine’s earliest contributions to modality was to develop an ontological theory of extensional and intensional entities (1977b). The theory assumes a familiar possible worlds account of its intensional entities: properties are sets of world-individual pairs, propositions are sets of worlds, and so on. This approach is often taken to disregard any internal “structure” in the entities for which it accounts. But Fine resourcefully argues that a great deal of “structure” may still be discerned, including existence, being qualitative, being logical, having individual constituents, and being essentially modal. This work, together with Fine’s developments of Prior’s form of actualism (1977a), prefigured the recent debate between necessitists who assert that necessarily everything is necessarily something and contingentists who deny this (Williamson 2013b).
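To give a rough sense of how such an account runs (a simplified rendering for illustration, not Fine’s own formulation), let W be the set of worlds and I the set of individuals. Then:
A proposition is a set p ⊆ W, and p is true at a world w just in case w ∈ p.
A property is a set P ⊆ W × I, and an individual i has P at a world w just in case ⟨w, i⟩ ∈ P.
Even on this coarse-grained account, some “structure” can be discerned. For example, a property P might be counted qualitative just in case it is invariant under permutations of the individuals; this is one standard gloss in the vicinity of the distinctions Fine makes precise.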
Fine continued exploring the applications of modal logic in the work that followed. The technical development of first-order modal theories is explored in one trio of papers (1978a; 1978b; 1981b). A second trio of papers explores applications of first-order modal theories to the formalization of various metaphysical theories of sets (1981a), propositions (1980), and facts (1982b). The second trio contains a wealth of distinctions and arguments. Some of them, with the benefit of hindsight, prefigure what would later become some of Fine’s more influential ideas.
For one example, the formalizations in 1981a are explicitly intended to capture plausible essentialist views about the identity or nature of sets. It is not difficult to view some of Fine’s remarks in this paper as anticipating his later celebrated set-theoretic counterexamples to the modal theory of essence (1994b).
For another example, 1982b argues against the still-common view that identifies facts with true propositions. The proposition that dogs bark exists regardless of whether dogs bark, whereas the fact that dogs bark exists only if they do.
In discussing these and related topics, Fine also introduced a general argumentative strategy against a variety of controversial metaphysical views. To illustrate, consider a modal variant of the preceding view that identifies possible facts with possibly true propositions. Suppose possible objects are abstracta. If a possible object is thus-and-so, then possibly it is actually thus-and-so. In particular, a possible donkey is possibly an actual donkey. Now, an actual donkey is a concrete object. So, we then have an abstract object—a possible donkey—that is possibly concrete. But no abstract object is possibly concrete. And so not all possible objects are abstracta. This sort of argument can also be used to show that possible facts are not propositions and that possible worlds are not abstract.
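The strategy may be set out semi-formally (a reconstruction for illustration, with ◇ read as metaphysical possibility):
(1) Suppose all possible objects are abstracta, and let d be a merely possible donkey.
(2) If a possible object is thus-and-so, then possibly it is actually thus-and-so; so ◇(d is an actual donkey).
(3) Necessarily, any actual donkey is concrete; so ◇(d is concrete).
(4) But d is abstract by the supposition in (1), and no abstract object is possibly concrete; so ¬◇(d is concrete).
(5) Contradiction; so not all possible objects are abstracta.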
Fine’s work on modality is animated by a commitment to modal actualism (see his introduction to 2005b). This combines two theses. The first, modalism, is that modal notions are intelligible and irreducible to non-modal notions. The second, actualism, is that actuality is prior to mere possibility.
One of modalism’s most infamous detractors was Quine. Fine provides detailed reconstructions of Quine’s arguments against the intelligibility of de re modality and equally detailed criticisms of them (1989c; 1990). Quine’s arguments and Fine’s criticisms involve disentangling delicate issues concerning the modal question of de re modality, the semantic question of singular (or direct) reference, and the metaphysical question of transworld identity. These issues, according to Fine, have often been conflated in the literature (2005e).
One of the main problems facing actualism is to explain how to make sense of discourse about the merely possible, or “possibilist discourse”, given that mere possibilia are ultimately unreal. Fine takes up the challenge of reducing possibilist discourse to actualist discourse in a series of articles (1977a; 1985c; 2002b). A notable theme of Fine’s reductive strategy is a resistance to “proxy reduction”. Roughly, a proxy reduction attempts to reduce items of a target domain by associating them one-by-one with items from a more basic domain. In this case, a proxy reduction of possibilist discourse would reduce a merely possible object by associating it with an actual object. Although it is often assumed that reduction must proceed in this way by “proxy”, Fine argues that it needn’t. Instead, Fine pursues a different approach. The idea is to reduce the claim that a possible object has a feature to the claim that possibly some object (actually) has that feature. Thus, the claim that Wittgenstein’s possible daughter loathed philosophy is reduced to the claim that possibly Wittgenstein’s daughter (actually) loathed philosophy. This is not a proxy reduction because it does not associate Wittgenstein’s possible daughter with any actual object. Criticisms of the approach from Williamson 2013b and others recently prompted Fine to develop a new “suppositional” approach (2016c).
Although modalists often distinguish between various kinds of modality, they have often thought that the varieties can ultimately be understood in terms of a single kind of modality. Fine, however, argues against this sort of “monism” about modality (2002c). Modality is, instead, fundamentally diverse. There are, argues Fine, at least three diverse and irreducible modal domains: the metaphysical, the normative, and the nomological.
In addition to this diversity in the modal domains, Fine also argues that there is diversity within a given modal domain (2005c). This emerges in considering a puzzle of how it is possible that Socrates is a man but does not exist, given that it is necessary that Socrates is a man but possible that Socrates does not exist. Just as there is a distinction between sempiternal truths that hold at each time (for example, ‘Trump lies or not’) and eternal truths that hold regardless of the time (for example, ‘2+2=4’), so too there are worldly necessities that hold at each world (for example, ‘Trump lies or not’) and unworldly or transcendent necessities that hold regardless of the world (for example, ‘2+2=4’). The puzzle can then be resolved by taking ‘Socrates is a man’ to be an unworldly necessity while taking ‘Socrates does not exist’ to be a worldly (contingent) possibility. The distinction between worldly and unworldly necessities provides for three grades of modality. The unextended grade concerns the purely worldly necessities, the extended grade concerns the purely worldly necessities and the purely unworldly necessities, and the superextended grade concerns “hybrids” of the first two grades. Fine argues that the puzzle’s initial appeal depends upon confusing these three grades of modality.
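The underlying distinction may be glossed semi-formally (a rough rendering): a statement A is a worldly necessity just in case A is true at every world, its truth being assessed at, and turning on, the worlds; whereas A is an unworldly or transcendent necessity just in case A is true regardless of the worlds, its truth not being assessed at a world at all. On this gloss, ‘Socrates is a man’ holds irrespective of whether Socrates exists at a given world, while ‘Socrates does not exist’ holds at just those worlds from which Socrates is absent.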
b. Essence
Perhaps one of Fine’s best-known contributions to metaphysics is to rehabilitate the notion of essence. A notable antecedent was Kripke 1972. Positivism’s antipathy to metaphysics was still exerting much influence on philosophy when Kripke powerfully advocated for the legitimacy of a distinctively metaphysical notion of modality. Kripke used this notion to suggest various essentialist theses. Among them were that a person’s procreative origin was essential to them and that an artifact’s material origin was essential to it. These essentialist theses, however, were usually taken to be theses of metaphysical necessity. The implicit background conception of essence was accordingly modal. On one formulation of it, an item has some feature essentially just in case it is necessary that it has that feature. Thus, Queen Elizabeth’s procreative origin is essential to her just in case it is necessary that she have that origin.
One of Fine’s distinctive contributions to rehabilitating essence was to argue against the modal conception of it (1994b). To do so, Fine introduced what is now a famous example. Consider the singleton set {Socrates} (the set whose sole member is Socrates). It is necessary that, if this set exists, then it has Socrates as a member. And so, by the modal conception, the set essentially has Socrates as a member. But, Fine argues, on plausible assumptions, it is also necessary that Socrates is a member of {Socrates}. And so, by the modal conception, it follows that Socrates is essentially a member of {Socrates}. This, however, is highly implausible: it is no part of what Socrates is that he should be a member of any set whatsoever. Fine raises a battery of similar counterexamples to the modal conception. His diagnosis of where it goes awry is that it is insensitive to the source of necessity. It lies in the nature of the singleton {Socrates}, not Socrates, that it has Socrates as a member. This induces an asymmetry in essentialist claims: {Socrates} essentially contains Socrates, but it is not the case that Socrates is essentially contained by {Socrates}. No modal conception of essence can capture this asymmetry because the two claims are both equally necessary.
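The counterexample may be put semi-formally (a standard rendering of the conditional form of the modal conception, with □ for metaphysical necessity): x is essentially F =df □(x exists → x is F). Then:
□({Socrates} exists → Socrates ∈ {Socrates}); so, by the definition, {Socrates} essentially has Socrates as a member.
□(Socrates exists → Socrates ∈ {Socrates}), granting that, necessarily, {Socrates} exists if Socrates does; so, by the definition, Socrates is essentially a member of {Socrates}.
Since both conditionals are equally necessary, the definition cannot register the asymmetry between the two essentialist claims.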
Even if the modal conception of essence fails, it is not as if essence and modality are unconnected. Indeed, Fine provocatively suggests a reversal of the traditional connection. Whereas the modal approach attempted to characterize essence in terms of modality, Fine suggests instead that metaphysical necessities hold in virtue of the essences of things (1994b).
Whether or not this suggestion is correct, separating essence from modality already implies that the study of essence cannot be subsumed under the study of modality. Instead, it would seem essence must be studied as a subject in its own right. Toward this end, Fine discusses a wealth of distinctions involving essence including the distinctions between constitutive and consequential essence, immediate and mediate essence, and more (1994d).
An especially important application of essence is to the notion of ontological dependence. What something is may depend upon what another thing is. In this ontological sense of dependence, a set may depend on its members, or an instance of a feature may depend upon the individual bearing it. Fine has explored this notion of ontological dependence in detail and used it to provide a characterization of substance (1995b). Additionally, he has also developed the formal logic and semantics of essence (1995a; 2000c).
c. Ontology
Ontology is often taken to concern what there is, or what exists. Some, however, have argued that there is a significant difference between being (what there is) and what exists. When being and existence are distinguished, it is often to claim that some things that have being nevertheless do not exist.
A recurring theme in Fine’s work is an openness to consider the being or nature of items regardless of whether they exist (1982b: §1; 1982c: §E1). This is most evident in the case of items that we are convinced do not exist. Like many others, Fine believes that, ultimately, there are no non-existents. But, perhaps unlike many others, Fine also believes that this is no obstacle to exploring their status or their nature (1982c). Fine’s explorations of this are rich in distinctions. The three most prominent are between Platonism and empiricism, literalism and contextualism, and internalism and externalism. The Platonist says non-existents do not depend on us or our activities, whereas the empiricist says they do. The literalist says non-existents literally have the properties they are said to have (for example, Sherlock Holmes literally lives in London), whereas the contextualist says instead that these properties are at most only had in a relevant context (namely, the Holmes stories). The internalist individuates non-existents solely in terms of the properties they have “internally” to the contexts in which they occur, whereas the externalist does not. Fine believes that all eight combinations of views are possible. But he focuses on developing and arguing against the four internalist views. A notable counterexample Fine gives to internalism is a story in which we imagine twins Dum and Dee who are indiscernible internally to the story but are nevertheless distinct. Two follow-up papers developing and defending externalism (Fine’s own favored combination conjoins empiricism, contextualism, and externalism) and comparing it to alternatives were planned but have not yet appeared (although 1984a further discusses related issues in the context of a critical review).
Behind Fine’s openness to considering the being or nature of items regardless of whether they exist is a general conception of ontology (2009). At least since Quine 1948, the dominant view has been that ontology’s central question, “What exists?”, should be understood as the question “What is there?”, and that this in turn should be understood as a quantificational question. Thus, to ask “Do numbers exist?” is to ask “Is there an x such that x is a number?”. Fine argues against this approach. One difficulty is that it seems to mischaracterize the logical form of ontological claims. Suppose we wish to answer “Yes, numbers exist”. It does not seem adequate to answer that merely some number, say 13, exists. But that is all that is required for the quantificational answer to be correct. Instead, it seems our answer must be that all the numbers exist. This answer has the form “For every x, if x is a number, then x exists”. If ‘x exists’ is understood in the Quinean way in terms of a quantifier (namely: x exists =df. ∃y(x = y)), then it expresses a triviality that fails to capture the intended significance of the ontological question. Fine suggests that the intended significance can be restored by appealing to the notion of reality. The ontological, as opposed to quantificational, question “Do numbers exist?” asks whether it is true that “For every x, if x is a number, then there is some feature that, in reality, x has”. This question is not answered by basic mathematical facts, but instead by whether numbers are part of the facts constituting reality.
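The contrast may be schematized as follows (a rough summary of the discussion above, writing ‘Nx’ for ‘x is a number’):
Quantificational question: is it true that ∃x Nx? This is settled trivially by basic mathematics (for example, by the existence of 13).
Universalized Quinean reading: is it true that ∀x(Nx → ∃y(x = y))? Given the quantificational definition of existence, this too is a triviality.
Ontological question: is it true that ∀x(Nx → there is some feature that, in reality, x has)? This is settled not by mathematics but by whether numbers figure in the facts constituting reality.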
Many ontologies are “constructional”. Some of their objects are accepted for being constructs of other accepted objects (perhaps with some objects as “given”: accepted but not on the basis of anything else). For example, we may accept subatomic particles into our ontology because they “construct” atoms, and we may also accept forests into our ontology because they are “constructed by” trees. Fine pursues an abstract study of constructional ontologies (1994e). The theory Fine develops can distinguish between actual and possible ontologies, as well as between absolute and relativist ontologies.
Relations have long puzzled philosophers. An especially difficult class of relations are those that appear to be non-symmetric. Unrequited love provides an example: although Scarlett loves Rhett, Rhett does not love Scarlett. It may seem that the relation loves is “biased” in that its first relatum is the lover and the second relatum the beloved. But it seems we must also recognize a converse is loved by relation “biased” in that its first relatum is the beloved and the second relatum the lover. Now, when Scarlett loves Rhett, is this because Scarlett and Rhett in this order stand in the loves relation, or because Rhett and Scarlett in that order stand in the is loved by relation? It seems we must say at least one, but either alone is arbitrary and both together is profligate. Fine develops a solution in terms of unbiased or “neutral” relations (2000b).
d. Mereology
Fine has made a variety of important contributions to abstract mereology (the theory of part and whole) as well as to its applications to various sorts of objects. Sometimes the term ‘mereology’ is used for a specific theory of mereology, namely classical extensional mereology. But an important theme in Fine’s work on mereology is to argue that this theory, and indeed much other thinking on mereology, is unduly narrow. Instead, Fine believes there is a highly general mereological framework that may accommodate a plurality of notions of part-whole (2010c). Different notions of part-whole correspond to different operations that may compose wholes from their parts. The notion of fusion from classical extensional mereology is but one of these compositional operations (and not a uniquely interesting one, he thinks). But there are other compositional operations that may apply even to abstract objects outside space and time. For example, the set-builder operation may be regarded as building a whole (the set) from its parts (its members). (Unlike Lewis 1991’s similar suggestion, Fine does not take the set-builder operation to be the fusion operation.) Fine contends that the general mereological framework for the plurality can be developed in abstraction from any of these particular applications of it.
Much of Fine’s work on mereology, however, has concerned its application to the objects of ordinary life and, in particular, to material things. Many have wanted to regard a material thing as identical with its matter. Perhaps the main objection to this view is the sheer wealth of counterexamples. A statue may be well-made although its matter is not. Fine has defended counterexamples like these at length (2003a). Even if a material thing and its matter are not identical, it may still seem as if they can occupy the same place at the same time. After all, the statue is now where its matter is. And some, including Locke (1689), have claimed that it is impossible for any two things (at least of the same sort) to occupy the same place at the same time. But Fine presents counterexamples even to this Lockean thesis (2000a). One can imagine, for instance, two letters being written on two sides of the same sheet of paper (or even two letters written with the very same words but bearing different meanings). The two letters then coincide but are distinct.
Even if material things are not identical to their matter, it may still be maintained that they are somehow aggregates of their matter. An aggregate of objects exists at a place or at a time exactly whenever or wherever some of those objects do too. If a quantity of gold, for example, is an aggregate of its left and right parts, then the quantity will exist whenever its left or right parts exist and wherever its left or right parts exist. But, Fine argues, if the left part is destroyed, the quantity will cease to exist although the aggregate will not. In general, then, ordinary material things are not aggregates but are instead compounds (1994a).
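The differing existence conditions may be put schematically (a rough rendering; the notation ‘a + b’ for the aggregate and ‘a · b’ for the compound of a and b is illustrative, not Fine’s own):
a + b exists at a time t just in case a exists at t or b exists at t.
a · b exists at a time t just in case a exists at t and b exists at t.
Construed as a compound of its left and right parts, the quantity of gold thus ceases to exist when its left part is destroyed, while the aggregate of those parts survives so long as either part does.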
These considerations extend to how material things persist through time. A common view is that they persist by having (material) temporal parts. This view takes the existence of objects in time to be essentially like their extension in space: aggregative. Objects persist through time in much the same way as events unfold. But Fine argues, partly on the basis of mereological considerations, that this delivers highly implausible results, and suggests that instead we must recognize that the existence of objects in time is fundamentally different from their extension in space (1994a; 2006a).
The lesson Fine draws from the preceding considerations is that a material thing neither is identical with, nor a mere aggregation of, its matter. Instead, Fine believes that the correct mereology of material things will be a version of hylomorphism: a material thing will be a compound of matter and form (2008a). Fine’s first applications of hylomorphism to acts, objects, and events provide an early glimpse of its broad scope (1982a). But the full breadth of its scope only emerged with Fine’s development of a general hylomorphic theory (1999). Its key notion is that of an embodiment. An embodiment may be either timeless (rigid) or temporal (variable). A rigid embodiment r = a,b,c,…/R is the object resulting from the objects a,b,c,… being in the relation R. A rigid embodiment is a hylomorphic compound that exists timelessly just in case its “matter” (the objects a,b,c,…) is in the requisite “form” (the relation R). So, for example, the statue (r) is identical with the hylomorphic compound of its clay parts (a,b,c,…) in the form of a statue (R). By contrast, a variable embodiment corresponds to a principle uniting its manifestations across times. Thus, a variable embodiment v = /V/ is given by a function V from times to things (which may themselves be rigid embodiments). For example, the statue over time (v) is given by a principle (V) taking each time to the statue’s manifestation at that time.
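Some of the theory’s core postulates may be sketched as follows (a partial paraphrase of the 1999 theory; the full theory includes further postulates governing location, parthood, and identity):
A rigid embodiment a,b,c,…/R exists at a time t just in case R holds of a,b,c,… at t.
The objects a,b,c,… are timeless parts of the rigid embodiment a,b,c,…/R.
A variable embodiment /V/ exists at a time t just in case it has a manifestation V(t) at t.
The manifestation V(t), when it exists, is a temporary part of /V/ at t.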
e. Realism
Fine has made influential contributions to debates about realism (2001). In general, the realist claims that some domain (for example, the mental or the moral) is real, whereas the antirealist claims that it is unreal. Although debates between realists and antirealists are common throughout philosophy, a precise and general characterization of their debate has been elusive. Fine argues against a variety of approaches familiar from the literature before settling on a distinctively metaphysical approach. What makes it distinctively metaphysical is its essential appeal to a metaphysical (as opposed to epistemic, conceptual, or semantic) notion of reality as well as to relatedly metaphysical notions of factuality and ground.
We may illustrate Fine’s approach by example. Set aside the moral error-theorist who believes that there are no moral facts whatsoever. Suppose, instead, that there are moral facts. One of them might be, we may suppose, that pointless torture is morally wrong. Moral realists and antirealists alike may agree that this fact is moral for containing some moral constituents (such as the property moral wrongness). And, unlike the error-theorist, they may agree that this fact obtains. What they dispute, however, is the fact’s status as real or unreal. Antirealism may come in either of two forms. The antirealist reductionist may, for example, accept the moral fact but insist that it is grounded in non-moral, naturalist facts that do not contain any moral constituents. The moral fact is unreal because it is grounded in non-moral facts. And the antirealist nonfactualist may, for example, accept the moral fact but insist that it is “nonfactual” in the sense that it does not represent reality but is rather a sort of “projection” of our attitudes, expressions, activities, or practices. The moral fact is unreal because it is neither real nor grounded in what is real. By contrast, the realist position consists in taking the moral fact as neither reducible nor nonfactual. The dispute between the realist, the antirealist reductionist, and the antirealist nonfactualist therefore turns on considerations of what grounds the moral facts. And, in general, debates over realism are, in effect, debates over what grounds what and therefore may be settled by determining what grounds what.
The framework Fine devised for debates over realism has proven rich in its implications. For one illustration, the metaphysical notion of reality figures prominently in other parts of Fine’s philosophy. Fine believes that the notion of reality plays a prominent role in ontological questions. And Fine uses the notion of reality to characterize the debate in the philosophy of time over the reality or unreality of tense. But the notion of ground provides an even more vivid illustration. In addition to ground’s central role in realist debates, it has itself become a topic of intense interest of its own.
f. Ground
Ground, as Fine conceives of it, is a determinatively explanatory notion. To say that Aristotle’s being rational and his being animal grounds his being a rational animal is to say that Aristotle is a rational animal because of, or in virtue of, his being rational and his being animal. Questions of ground enjoy a prominent place not only in realist debates but also within philosophy as a whole. Are moral facts grounded in naturalist facts? Are mental facts grounded in physical facts? Are facts of personal identity grounded in facts of psychological continuity? These and other questions of ground are among the biggest and most venerable questions in philosophy.
It is therefore a curiosity of recent times that ground has become a “hot topic” with a rapidly expanding literature (Raven 2020). This is perhaps partly explained by the anti-metaphysical sentiments that swept over twentieth-century analytic philosophy. Although philosophers did not entirely turn their backs on questions of ground, the anti-metaphysical sentiments created a climate in which many felt the need to reinterpret them as questions of another sort (such as conceptual analysis, supervenience, or truthmaking). Fine, however, played a highly influential role in changing this climate. This is partly because Fine’s work not only discussed ground in its application to other topics (such as realism), but also treated ground as a topic worthy of study in its own right (see Raven 2019 for further discussion). Fine provided a detailed exploration of ground, introducing many now familiar distinctions of ground and its connections to related topics, such as essence (2012c). Additionally, Fine has developed the so-called “pure logic” of ground (2012d). He also problematized ground by discovering some puzzles involving ground and its relation to classical logic (2010b).
Although Fine had recognized certain similarities between essence and ground, he was initially inclined to separate them (2012c: 80):
The two concepts [essence and ground] work together in holding up the edifice of metaphysics; and it is only by keeping them separate that we can properly appreciate what each is on its own and what they are capable of doing together.
But not long after, Fine changed his view (2015b: 297):
I had previously referred to essence and ground as the pillars upon which the edifice of metaphysics rests…, but we can now see more clearly how the two notions complement one another in providing support for the very same structure.
The unification appeals to a conception of constitutively necessary and sufficient conditions on arbitrary objects (1985d). For example, for true belief to be essential to knowledge is for it to be a constitutively necessary condition on an arbitrary person’s knowing something that they truly believe it. And, for another example, for a set’s having no members to ground its being identical with the null set is for it to be a constitutively sufficient condition on an arbitrary set’s having no members that it is identical with the null set.
This previous example illustrates an identity criterion: a statement of the conditions in virtue of which two items are the same. Many philosophers have been tempted to reject identity criteria for being pointless, trivial, or unintelligible. But Fine argues against such rejections and, instead, defends the intelligibility and, indeed, the potential substantivity of identity criteria by appealing to ground and arbitrary objects (2016b). Roughly, an identity criterion states that, given two arbitrary objects, they are the same when the fact that they are identical is grounded in the fact that they satisfy a specified condition. For example, given two arbitrary sets, they are the same when their identity is grounded in their having the same members.
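Schematically (a rough rendering of the examples just given): for arbitrary sets x and y, the criterion of extensionality states that the fact that x = y is grounded in the fact that ∀z(z ∈ x ↔ z ∈ y); and, for the null set example, the fact that x = ∅ is grounded in the fact that ¬∃z(z ∈ x).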
g. Tense
One striking application of Fine’s work on realism and ground is to the philosophy of time. McTaggart 1908 notoriously argued for the unreality of time. Although McTaggart’s argument generated considerable discussion, the general impression has been that whatever challenge it posed to the reality of time can somehow be met. Fine argues that the challenge lurking within McTaggart’s argument is more formidable than usually thought (2005f, of which 2006b is an abridgement). Taking inspiration from McTaggart, Fine formulates his own argument against the reality of tense. The argument relies on four assumptions that each make essential appeal to the notion of reality:
Realism
Reality is constituted (at least, in part) by tensed facts.
Neutrality
No time is privileged: the tensed facts that constitute reality are not oriented towards one time as opposed to another.
Absolutism
The constitution of reality is an absolute matter, not relative to a time or other form of temporal standpoint.
Coherence
Reality is not contradictory; it is not constituted by facts with incompatible content.
Reality contains some tensed facts (Realism). Because things change, these will be diverse. Although you are reading, you aren’t always reading. So, one of these tensed facts is that you are reading whereas another of them is that you are not reading. None of these tensed facts are oriented toward any particular time (Neutrality). Nor do they obtain relative to any particular time (Absolutism). So reality is constituted by incompatible facts. But reality cannot be incoherent like that (Coherence). And so the four assumptions conflict. The antirealist reaction is to reject Realism, and so the reality of time. The realist accepts Realism, and so must reject another assumption. The challenge is to explain which. The “standard” realist denies Neutrality by privileging the present time. But Fine argues that there are two overlooked “nonstandard” responses. The relativist denies Absolutism, and so takes the constitution of reality to be irreducibly relative to a time. The fragmentalist denies Coherence, and so takes reality to divide into incompatible temporal “fragments”. Fine argues that the nonstandard realisms (and, in particular, fragmentalism) are, despite their obscurity, more defensible than standard realism.
Fine relates these considerations to the vexing case of first-personal realism. Standard realism about first-personal facts implausibly privileges a first-personal perspective. Overlooking nonstandard realisms, one may then draw the antirealist conclusion that there are no first-personal facts. But Fine’s apparatus reveals two nonstandard realist options: relativism and fragmentalism. According to Fine, these options (and, in particular, fragmentalism) are especially intuitive in the first-personal case. Indeed, Fine suggests that the question of the reality of tense might have more in common with the question of the reality of the first-personal, despite its more familiar association with the question of the reality of the modal.
4. Philosophy of Language
Fine has made four main contributions to the philosophy of language. The first two are in support of the referentialist tradition. One is to bolster arguments against the competing Fregean tradition. The other is to develop a novel version of referentialism, semantic relationism, that is superior to its referentialist competitors. The third contribution is to the nature of vagueness. And the fourth contribution is the development of an original approach to semantics, truthmaker semantics.
a. Referentialism
The referentialist tradition takes certain terms, especially names, to refer without the mediation of any Fregean sense or other descriptive information. Fine has made two main contributions in support of referentialism.
Fine’s first contribution to referentialism is to bolster arguments against Fregeanism. This includes a variety of supporting arguments scattered throughout his book Semantic Relationism (2007b). Perhaps the most notable of these is a thought experiment against the existence of the senses the Fregean posits (2007b: 36). The scenario involves a person in a universe that is perfectly symmetrically arranged around her center of vision. Her visual field therefore perfectly duplicates whatever is visible on the left to the right, and on the right to the left. When she is approached by two identical twins, she may name each ‘Bruce’. It seems she may refer by name to each. The Fregean can agree only if there is a pair of senses, one for the left ‘Bruce’ and the other for the right ‘Bruce’. But given the symmetry of the scenario, it seems there is no possible basis for thinking that the pair exists.
b. Semantic Relationism
Fine’s second contribution to referentialism is to introduce and develop what he argues is its most viable form: semantic relationism. The view is developed in his book Semantic Relationism, which expands on his John Locke Lectures delivered at the University of Oxford in 2003 (2007b).
Semantic relationism is representational in that it aims to account for the meanings of expressions in terms of what they represent (objects, properties, states of affairs, and so on). But it differs significantly from other representational approaches. These have typically (and implicitly) assumed that the meaning of an expression is intrinsic to it, so that one is never required to consider any other expressions in accounting for the meaning of a given expression. Semantic relationism denies this. Instead, the meaning of (at least some) expressions at least partly consists in their “coordinative” relations to other meaningful expressions. This is different from typical kinds of semantic holism, which usually characterize an expression’s meaning not in representational terms but in terms of its inferential role.
One of the main benefits of semantic relationism is that it provides solutions to a variety of vexing puzzles, including the antinomy of the variable (2003b), Frege’s puzzle (Frege 1892), and Kripke’s puzzle about belief (Kripke 2011). To illustrate, Frege observed that an identity statement, like ‘Cicero is Cicero’, could be uninformative whereas another, like ‘Cicero is Tully’, could be informative despite the names ‘Cicero’ and ‘Tully’ being coreferential. Frege’s own solution was to bifurcate semantics into a level of sense and a level of reference. This enabled him to claim that the names ‘Cicero’ and ‘Tully’ differ in sense but not in reference. But powerful arguments from Kripke 1972 and others convinced many that the semantics of names only involves reference, not sense. How could one reconcile this referentialism about the semantics of names with Frege’s observation? Semantic relationism offers a novel answer. The pair ‘Cicero’, ‘Cicero’ in ‘Cicero is Cicero’ is coordinated: it is a semantic requirement that they co-refer. By contrast, the pair ‘Cicero’, ‘Tully’ in ‘Cicero is Tully’ is uncoordinated: it is not a semantic requirement that they co-refer. This difference in coordination among the pairs of expressions explains the difference in their informativeness. But it is only by considering the pairs in relation to one another that this difference can even be recognized. The notion of semantic requirement involves a distinctive kind of semantic modality that Fine argues should play a significant role in semantic theorizing (2010a).
c. Vagueness
Fine provided what is widely considered to be the locus classicus for the so-called supervaluationist approach to vagueness (1975d). On this approach, vagueness is a kind of deficiency in meaning. What makes the deficiency specific to vagueness is that it gives rise to “borderline cases”. For example, the vague predicate ‘is bald’ admits of borderline cases. These are cases in which the predicate’s meaning does not settle whether it applies or does not apply to, say, a man with a receding hairline and thinning hair. Borderline cases pose an initial problem for classical logic. For if the predicate ‘is bald’ neither truly applies nor falsely applies in such cases, how could it be true to say ‘That man is bald or is not bald’? Supervaluationism answers by considering the admissible ways in which a vague predicate can be completed or made more precise. The sentence ‘That man is bald’ is “super-true” if true under every such “precisification”, “super-false” if false under every “precisification”, and neither otherwise. It can then be argued that ‘That man is bald or is not bald’ will be super-true because it will be true under every precisification, despite neither disjunct being super-true. This in turn helps supervaluationism provide a response to the Sorites Paradox.
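The core definitions can be stated compactly (a standard formulation, with ‘precisification’ abbreviating ‘admissible precisification’):
A sentence S is super-true just in case S is true on every precisification; S is super-false just in case S is false on every precisification; otherwise S is indeterminate.
If ‘b’ names a borderline case, then ‘b is bald’ is true on some precisifications and false on others, and so is neither super-true nor super-false.
Yet ‘b is bald or b is not bald’ is true on every precisification, since each precisification is classical; the disjunction is therefore super-true although neither disjunct is.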
In more recent work, Fine has given up on supervaluationism and instead developed an alternative approach. Fine’s reasons for rejecting supervaluationism are not specific to it but rather derive from a more far-reaching argument. Fine presents an apparent proof of the impossibility of vagueness (2008b). The challenge is to explain where the proof goes awry, since there is no question that vagueness is possible. But, Fine argues, standard accounts of vagueness, including especially supervaluationism, cannot satisfactorily meet this challenge. So, an alternative account is needed.
Fine develops such an alternative account that relies on a distinction between global and local vagueness (2015a). Global vagueness is vagueness over a range of cases, such as a series of indiscernible but distinct color tiles arranged incrementally from orange to red. Local vagueness is vagueness in a single case, such as in a single vermilion tile midway between the orange and red tiles. Given the distinction, there is a strong temptation to reduce global vagueness to local vagueness. But Fine argues against this. His own “globalist” approach, he argues, is able not only to meet the challenge of explaining the possibility of vagueness but also to explain why vagueness does not succumb to the Sorites Paradox.
d. Truthmaker Semantics
In a series of articles, Fine develops a novel semantic approach he calls truthmaker semantics. The approach is in some ways like the more familiar possible-worlds semantics and, especially, situation semantics. But truthmaker semantics diverges from both. The contrast with possible-worlds semantics is especially vivid. On the latter approach, the truth-value of a sentence is evaluated with respect to a possible world in its entirety, no matter how irrelevant parts of that world might be to making the sentence true. Thus, ‘Fido barks’ will be true with respect to an entire possible world just in case it is a world in which Fido barks. Such a world includes much that is irrelevant to Fido’s barking, including sea turtle migration, weather patterns in sub-Saharan Africa, and distant galaxies. Truthmaker semantics departs in two ways from this. First, and like situation semantics, it replaces worlds with states which may, to a first approximation, be regarded as parts of worlds. So, for example, it is not the entire world—sea turtle migration, sub-Saharan weather, and distant galaxies included—that verifies or makes ‘Fido barks’ true, but rather instead just the state of Fido’s barking. What’s more, this state, unlike the entire world itself, does not verify any truths about sea turtles, sub-Saharan weather, and distant galaxies. Second, and unlike situation semantics, it is required that a state verifying a sentence must be wholly or exactly relevant to its truth. So, for example, the state that Fido barks and it’s raining in Djibouti will not verify ‘Fido barks’ because it includes an irrelevant part about Djibouti’s weather.
The general framework of truthmaker semantics is developed over the course of numerous articles (but see 2017c for an overview). An important feature of it is its abstractness. The semantics is specified in terms of a space of states, or a state space. The state space is assumed to have some mereological structure. But the assumptions are minimal and, in particular, no assumptions are made about the nature of the states themselves. This makes the framework highly abstract. This in turn grants the framework enormous flexibility in its potential range of applications. Indeed, Fine believes the main benefits of the general framework emerge from its wealth of applications to a wide variety of topics. These include: analytic entailment (2016a), counterfactuals (2012a; 2012b), ground (2020a), intuitionistic logic (2014b), semantic content (2017a; 2017b), the is-ought gap (2018b), verisimilitude (2019d; 2020b), impossible worlds (2019c), deontic and imperative statements (2014a; 2019a; 2019b), and more. This is not the place for a comprehensive survey of these applications. Still, one may get a sense of them by considering three applications in more detail.
First, consider counterfactuals. The standard semantics for counterfactuals derives from Stalnaker 1968 and Lewis 1973. According to Lewis’ version of it, the counterfactual ‘If A then it would be that C’ is true just in case no possible world in which A but not C is true is closer to actuality than any in which both A and C are true. Fine’s opposition to this semantics is evident from his critical notice (1975a) of Lewis’s book. There Fine introduced the so-called “future similarity objection”. It takes the form of a counterexample showing that small changes can make for great dissimilarities. Fine’s celebrated case was the counterfactual ‘If Nixon had pressed the button, then there would have been a nuclear holocaust’. Although it seems true, the standard semantics struggles to validate it. The great dissimilarities of a world where Nixon pressed the button, causing nuclear holocaust, ensure that it is further from actuality than a world where Nixon pressed the button without nuclear holocaust. Fine’s critical notice also contained the seeds of ideas that later emerged in his work on truthmaker semantics. There he also objects that the standard semantics is committed to unsound implications because it permits the substitution of tautologically equivalent statements. This objection was prescient in anticipating a similar difficulty later developed in greater detail against the standard semantics (2012a; 2012b). Fine argues that the difficulty can be avoided by providing a truthmaker semantics for counterfactuals. Roughly, ‘If A then it would be that C’ is true just in case any possible outcome of a state verifying A also contains a state verifying C.
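Rendered schematically (the notation below is our own gloss on the two clauses just described, not Lewis's or Fine's official formulation):

```latex
% Lewis-style clause, as paraphrased above: no (A and not-C)-world is
% closer to actuality w than every (A and C)-world is.
\neg \exists w' \big( w' \Vdash A \wedge \neg C \;\wedge\;
    \forall w''\, ( w'' \Vdash A \wedge C \to w' \prec_w w'' ) \big)

% Fine's truthmaker-style clause, roughly: every possible outcome u of a
% state s verifying A contains a state t verifying C.
\forall s\, \forall u \big( s \Vdash A \wedge \mathrm{Outcome}(u, s)
    \to \exists t\, ( t \sqsubseteq u \wedge t \Vdash C ) \big)
```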
Second, consider intuitionistic logic. Realists and antirealists alike tend to agree that certain technical aspects of intuitionistic logic provide a natural home for antirealism. This would be a mistake, however, if intuitionistic logic could be given a realist semantic foundation. Fine shows how truthmaker semantics can be used to provide just such a realist semantics for intuitionistic logic (2014b).
Third, consider the is-ought gap. Hume 1739 famously argued for a gap between ‘is’ and ‘ought’ statements: one cannot validly derive any statement about what ought to be from any statements about what is. Despite the appeal of such a gap, it has not been easy to formulate it clearly. What’s more, standard formulations are vulnerable to superficial but resilient counterexamples (Prior 1960); for example, from the ‘is’ statement ‘Tea-drinking is common in England’ one may validly infer the disjunction ‘Either tea-drinking is common in England or all New Zealanders ought to be shot’, which contains an ‘ought’. Fine shows how truthmaker semantics can be used to formulate the gap in a way that avoids such superficial counterexamples (2018b).
5. Logics and Mathematics
Fine has made a variety of seminal technical contributions to formal logic as well as to philosophical logic and the philosophy of mathematics. These contributions may be organized into three major groups: formal logic (especially modal logics), arbitrary objects, and the foundations of mathematics (broadly construed so as to include the theory of sets and classes).
a. Logics
Most of Fine’s earliest work focused on technical questions within formal logic, especially on modal logics. A detailed synopsis of Fine’s technical work is beyond the scope of this article. But a very brief summary of it can be given here:
various results about modal logics with propositional quantifiers (1970 which presents results from Fine’s Ph.D. dissertation 1969);
a completeness proof for a predicate logic without identity but with primitive numerical quantifiers (1972a);
early developments of graded modal logic (1972b);
various results about S4 logics (those with reflexive and transitive Kripke frames; the standard frame correspondences are displayed in the note following this list) and certain extensions of them (1971; 1972c; 1974a; 1974b);
the application of normal forms to a general completeness proof for “uniform” modal logics (1975b);
a seminal “canonicity theorem” for modal logics (1975c);
completeness results for logics containing K4 (those with transitive Kripke frames) (1974c; 1985a);
failure of Craig’s interpolation lemma for various quantified modal logics (1979);
the underivability of a quantifier permutation principle in certain modal systems without identity (1983b);
an exploration into whether truth can be defined without the notion of satisfaction (joint work with McCarthy 1984b);
incompleteness results for standard semantics for quantified relevance logic and an alternative semantics for it that is complete (1988; 1989a);
the development of stability (or “felicitous”) semantics for the conception of “negation as failure” in logic programming and computer science (1989b); and
general results about how properties of “monomodal” logics containing a single modal operator may transfer to a “multimodal” logic joining them (joint work with Schurz 1996).
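For readers unfamiliar with the frame conditions mentioned in this list, the relevant correspondences are the standard textbook ones (general modal-logic background, not results of Fine's):

```latex
% S4 = K + T + 4; K4 = K + 4.
\textbf{T:} \quad \Box p \to p \qquad \text{(corresponds to reflexive frames)}
\textbf{4:} \quad \Box p \to \Box\Box p \qquad \text{(corresponds to transitive frames)}
```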
In addition, Fine also wrote several articles in economic theory (1973a; 1972d), including two with his brother, economist Ben Fine (1974d; 1974e).
b. Arbitrary Objects
We often speak of arbitrary objects—an arbitrary integer, an arbitrary American, and so on. But at least since Berkeley 1710, the notion of an arbitrary object has been thought to be dispensable, if not outright incoherent. In his book Reasoning with Arbitrary Objects, however, Fine argued that the familiar opposition to arbitrary objects is misplaced and that they can, contrary to received wisdom, be given a rigorous theoretical foundation (1985d and its abridgements 1983a; 1985b).
The matter is not a mere intellectual curiosity. For it turns out, according to Fine, that arbitrary objects have various important applications. One salient application is to natural deduction and, especially, the logic of generality (1985d; 1985b). To illustrate, consider how one might explain the rule of universal generalization to students in a first course on formal logic. One might say that if one can show that an arbitrary item a satisfies some condition φ, then one may deduce that every item whatsoever satisfies that condition: ∀xφ(x). Standard glosses on the rule ultimately attempt to avoid any appeal to the arbitrary item in favor of some alternative construal. But given Fine’s defense of arbitrary objects, there is no need to avoid appealing to them, and, in fact, it may be argued that they provide a more direct and satisfying account of the rule than alternative accounts do. Other applications to mathematical logic, the philosophy of language, and the history of ideas are also explored (1985d).
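In standard natural-deduction notation (our schematic rendering, not Fine's own formalism), the rule reads:

```latex
% Universal generalization: from phi(a), where a is 'arbitrary' in the sense
% that it occurs in no undischarged assumption, infer the universal claim.
\frac{\varphi(a) \quad (a \ \text{arbitrary})}{\forall x\, \varphi(x)}
```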
More recently, Fine has found new applications for arbitrary objects. One is to Cantor’s abstractionist constructions of cardinal numbers and order types. The constructions have faced formidable objections. But, according to Fine, the objections can be overcome by appealing to the theory of arbitrary objects (1998). In a belated companion article, Fine argues that his theory of arbitrary objects combined with the Cantorian approach can be extended to provide a general theory of types or forms, of which structural universals end up being a special case (2017a). Fine also puts arbitrary objects to use in attempting to provide a paradox-free construction of sets or classes that allows for the existence of a universal class and for the Frege-Russell cardinal numbers (2005a), in characterizing identity criteria (2016b), and in providing unified foundations for essence and ground (2015b). Fine is currently preparing a revised version of Reasoning with Arbitrary Objects.
c. Philosophy of Mathematics
Most of Fine’s contributions to the philosophy of mathematics concern various foundational issues. Much recent interest in these issues derives from Frege’s ill-fated attempt to secure the foundations of mathematics by deriving it from logic alone. Frege’s attempt foundered in the early 1900s with the discovery of the set-theoretic paradoxes. Much of Fine’s work in the philosophy of mathematics concerns the prospects for reviving Frege’s project without paradox.
At the heart of Frege’s own attempt was the notion of abstraction. Just as we may abstract the direction of two lines from their being parallel, so too we may abstract the number of two classes from their equinumerosity. Frege’s own use of abstraction ultimately led to paradox. But since then, neo-Fregeans (such as Fine’s colleague Crispin Wright and Bob Hale) have attempted to salvage much of Frege’s project by refining the use of abstraction in various ways. Fine has provided a detailed exploration of a general theory of abstraction as well as its prospects for sustaining neo-Fregean ambitions (2002a).
The discovery of the set-theoretic paradoxes generated turmoil within the foundations of mathematics and for associated philosophical programs. Since then, there have been a variety of attempts to provide a paradox-free construction of sets or classes. These attempts usually assume a notion of membership in their construction of the ontology. But Fine reverses the direction and constructs notions of membership in terms of the assumed ontology. This, Fine argues, has various advantages over standard constructions (2005a).
Many have thought that a central lesson of the aforementioned set-theoretic paradoxes is that quantification is inevitably restricted. Were it possible to quantify unrestrictedly over absolutely everything, then paradox would result. Instead, we may indefinitely extend the range of quantification without ever paradoxically quantifying over absolutely everything. So, it seems, quantification is always restricted, albeit indefinitely extendible. A persistent difficulty in sustaining this point of view, however, is the apparent arbitrariness of any restriction. Fine argues that the difficulty can be avoided (2006c). Quantification’s being absolute and its being unrestricted are often conflated. But Fine argues that they are distinct. Distinguishing them allows us to conceive of the possibility of quantification that is unrestricted but not absolute.
A recurring theme in some of the preceding papers is an approach to mathematics that Fine calls procedural postulationism. Traditional versions of postulationism take the existence of mathematical items and the truths about them to derive from certain propositions we postulate. But Fine’s procedural postulationism takes these postulates to be imperatival instead (for example, “For each item in the domain that is a number, introduce another number that is its successor”). Fine believes this one difference helps postulationism provide a more satisfactory metaphysics, semantics, and epistemology of mathematics. Although procedural postulationism is only hinted at in the previous articles, it is discussed in more detail in connection with our knowledge of mathematical items (2005d). Fine has indicated that he believes the core ideas of procedural postulationism may extend more generally, and he briefly discusses their application to the metaphysics of material things (2007a).
6. History
It is not hard to find Aristotle’s influence in much of Fine’s work. But in addition to developing various Aristotelian themes, Fine has also directly contributed to more exegetical scholarship on Aristotle’s own work. These contributions have primarily focused on developing an account of Aristotle’s views on substance and what we may still learn from them. This begins with an attempt to formalize Aristotle’s views on matter (1992). Fine later raises a puzzle for Aristotle (and other neo-Aristotelians) concerning how the matter now composing one hylomorphic compound, say Callias, could later come to compose another hylomorphic compound, say Socrates (1994c). According to Aristotle, the world contains elements that may compose mixtures, and these mixtures in turn compose substances. Fine argues against conceptions of mixtures that take them to be at the same level as the elements composing them and, instead, defends a conception on which they are at a higher level (1995d). Finally, Fine argues that the best interpretation of a vexing discussion in Metaphysics Theta.4 is that Aristotle was attempting to introduce a novel conception of modality (2011a).
Additionally, Fine has written on Husserl’s discussions from the Logical Investigations on part and whole and the related topics of dependence, necessity, and unity (1995c). Fine also has work in preparation on Bolzano’s conception of ground.
7. References and Further Reading
Berkeley, George. 1710. A Treatise Concerning the Principles of Human Knowledge.
Fine, Kit. 1969. For Some Proposition and So Many Possible Worlds. Ph.D. dissertation, University of Warwick.
Fine, Kit. 1970. “Propositional Quantifiers in Modal Logic.” Theoria 36 (3): 336-46.
Fine, Kit. 1971. “The Logics Containing S4.3.” Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 17 (1): 371-76.
Fine, Kit. 1972a. “For So Many Individuals.” Notre Dame Journal of Formal Logic 13 (4): 569-72.
Fine, Kit. 1972b. “In So Many Possible Worlds.” Notre Dame Journal of Formal Logic 13 (4): 516-20.
Fine, Kit. 1972c. “Logics Containing S4 without the Finite Model Property.” In Conference in Mathematical Logic–London ’70, edited by W. Hodges. New York: Springer-Verlag.
Fine, Kit. 1972d. “Some Necessary and Sufficient Conditions for Representative Decision on Two Alternatives.” Econometrica 40 (6): 1083-90.
Fine, Kit. 1973a. “Conditions for the Existence of Cycles under Majority and Non-minority Rules.” Econometrica 41 (5): 889-99.
Fine, Kit. 1974a. “An Ascending Chain of S4 Logics.” Theoria 40 (2): 110-16.
Fine, Kit. 1974c. “Logics Containing K4 – Part I.” The Journal of Symbolic Logic 39 (1): 31-42.
Fine, Kit. 1975a. “Critical Notice: Counterfactuals, by David Lewis.” Mind 84 (335): 451-58. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1975b. “Normal Forms in Modal Logic.” Notre Dame Journal of Formal Logic 16 (2): 229-34.
Fine, Kit. 1975c. “Some Connections Between Elementary and Modal Logic.” In Proceedings of the Third Scandinavian Logic Symposium, edited by S. Kanger. Amsterdam: North-Holland.
Fine, Kit. 1975d. “Vagueness, Truth and Logic.” Synthese 30: 265-300.
Fine, Kit. 1977a. “Prior on the Construction of Possible Worlds and Instants.” In Worlds, Times and Selves, edited by A. N. Prior and K. Fine. London: Duckworth. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1977b. “Properties, Propositions and Sets.” Journal of Philosophical Logic 6: 135-91.
Fine, Kit. 1978a. “Model Theory for Modal Logic – Part I: The De Re/De Dicto Distinction.” Journal of Philosophical Logic 7 (1): 125-56.
Fine, Kit. 1978b. “Model Theory for Modal Logic – Part II: The Elimination of De Re Modality.” Journal of Philosophical Logic 7 (1): 277-306.
Fine, Kit. 1979. “Failures of the Interpolation Lemma in Quantified Modal Logic.” The Journal of Symbolic Logic 44 (2): 201-06.
Fine, Kit. 1980. “First-order Modal Theories II – Propositions.” Studia Logica 39 (2/3): 159-202.
Fine, Kit. 1981b. “Model Theory for Modal Logic – Part III: Existence and Predication.” Journal of Philosophical Logic 10 (3): 293-307.
Fine, Kit. 1982a. “Acts, Events and Things.” In Language and Ontology, edited by W. Leinfellner, E. Kraemer and J. Schank. Wien: Hölder-Pichler-Tempsky, as part of the proceedings of the Sixth International Wittgenstein Symposium 23rd to 30th August 1981, Kirchberg/Wechsel (Austria).
Fine, Kit. 1985a. “Logics Containing K4 – Part II.” The Journal of Symbolic Logic 50 (3): 619-51.
Fine, Kit. 1985b. “Natural Deduction and Arbitrary Objects.” Journal of Philosophical Logic 14: 57-107.
Fine, Kit. 1985c. “Plantinga on the Reduction of Possibilist Discourse.” In Alvin Plantinga, edited by J. E. Tomberlin and P. van Inwagen. Dordrecht: Reidel. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1985d. Reasoning with Arbitrary Objects. Oxford: Blackwell.
Fine, Kit. 1988. “Semantics for Quantified Relevance Logic.” Journal of Philosophical Logic 17 (1): 27-59.
Fine, Kit. 1989a. “Incompleteness for Quantified Relevance Logics.” In Directions in Relevant Logics, edited by R. Sylvan and J. Norman. Dordrecht: Kluwer.
Fine, Kit. 1989b. “The Justification of Negation as Failure.” In Proceedings of the Congress on Logic, Methodology and the Philosophy of Science VIII, edited by J. Fenstad, T. Frolov and R. Hilpinen. Amsterdam: Elsevier Science Publishers B. V.
Fine, Kit. 1989c. “The Problem of De Re Modality.” In Themes from Kaplan, edited by J. Almog, J. Perry and H. Wettstein. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1990. “Quine on Quantifying In.” In Proceedings of the Conference on Propositional Attitudes, edited by C. A. Anderson and J. Owens. Stanford: CSLI. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1992. “Aristotle on Matter.” Mind 101 (401): 35-57.
Fine, Kit. 1994a. “Compounds and Aggregates.” Noûs 28 (2): 137-58.
Fine, Kit. 1994b. “Essence and Modality.” Philosophical Perspectives 8: 1-16.
Fine, Kit. 1994c. “A Puzzle Concerning Matter and Form.” In Unity, Identity, and Explanation in Aristotle’s Metaphysics, edited by T. Scaltsas, D. Charles and M. L. Gill. Oxford: Oxford University Press.
Fine, Kit. 1994d. “Senses of Essence.” In Modality, Morality and Belief: Essays in Honor of Ruth Barcan Marcus, edited by W. Sinnott-Armstrong. Cambridge: Cambridge University Press.
Fine, Kit. 1994e. “The Study of Ontology.” Noûs 25 (3): 263-94.
Fine, Kit. 1995a. “The Logic of Essence.” Journal of Philosophical Logic 24: 241-73.
Fine, Kit. 1995b. “Ontological Dependence.” Proceedings of the Aristotelian Society 95: 269-90.
Fine, Kit. 1995c. “Part-Whole.” In The Cambridge Companion to Husserl, edited by B. Smith and D. Woodruff. Cambridge: Cambridge University Press.
Fine, Kit. 1995d. “The Problem of Mixture.” Pacific Philosophical Quarterly 76 (3-4): 266-369.
Fine, Kit. 1998. “Cantorian Abstraction: A Reconstruction and Defense.” The Journal of Philosophy 95 (12): 599-634.
Fine, Kit. 1999. “Things and Their Parts.” Midwest Studies in Philosophy 23: 61-74.
Fine, Kit. 2000a. “A Counter-example to Locke’s Thesis.” The Monist 83 (3): 357-61.
Fine, Kit. 2000c. “Semantics for the Logic of Essence.” Journal of Philosophical Logic 29 (6): 543-84.
Fine, Kit. 2001. “The Question of Realism.” Philosophers’ Imprint 1 (2): 1-30.
Fine, Kit. 2002a. The Limits of Abstraction. Oxford: Clarendon Press.
Fine, Kit. 2002b. “The Problem of Possibilia.” In Handbook of Metaphysics, edited by D. Zimmerman. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 2002c. “The Varieties of Necessity.” In Conceivability and Possibility, edited by T. S. Gendler and J. Hawthorne. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 2003a. “The Non-Identity of a Material Thing and Its Matter.” Mind 112 (446): 195-234.
Fine, Kit. 2003b. “The Role of Variables.” The Journal of Philosophy 100 (12): 605-31.
Fine, Kit. 2005a. “Class and Membership.” The Journal of Philosophy 102 (11): 547-72.
Fine, Kit. 2005c. “Necessity and Non-existence.” In Modality and Tense: Philosophical Papers.
Fine, Kit. 2005d. “Our Knowledge of Mathematical Objects.” In Oxford Studies in Epistemology, edited by T. S. Gendler and J. Hawthorne. Oxford: Clarendon Press.
Fine, Kit. 2005e. “Reference, Essence, and Identity.” In Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
Fine, Kit. 2005f. “Tense and Reality.” In Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
Fine, Kit. 2006a. “In Defense of Three-Dimensionalism.” The Journal of Philosophy 103 (12): 699-714.
Fine, Kit. 2006b. “The Reality of Tense.” Synthese 150 (3): 399-414.
Fine, Kit. 2006c. “Relatively Unrestricted Quantification.” In Absolute Generality, edited by A. Rayo and G. Uzquiano. Oxford: Clarendon Press.
Fine, Kit. 2008a. “Coincidence and Form.” Proceedings of the Aristotelian Society, Supplementary Volume 82 (1): 101-18.
Fine, Kit. 2008b. “The Impossibility of Vagueness.” Philosophical Perspectives 22 (Philosophy of Language): 111-36.
Fine, Kit. 2009. “The Question of Ontology.” In Metametaphysics: New Essays on the Foundations of Ontology, edited by D. Chalmers, D. Manley, and R. Wasserman. Oxford: Oxford University Press.
Fine, Kit. 2010a. “Semantic Necessity.” In Modality: Metaphysics, Logic, and Epistemology, edited by B. Hale and A. Hoffmann. Oxford: Oxford University Press.
Fine, Kit. 2010b. “Some Puzzles of Ground.” Notre Dame Journal of Formal Logic 51 (1): 97-118.
Fine, Kit. 2010c. “Towards a Theory of Part.” The Journal of Philosophy 107.
Fine, Kit. 2011a. “Aristotle’s Megarian Manoeuvres.” Mind 120 (480): 993-1034.
Fine, Kit. 2011b. “What is Metaphysics?” In Contemporary Aristotelian Metaphysics, edited by T. E. Tahko. Cambridge: Cambridge University Press.
Fine, Kit. 2012a. “Counterfactuals without Possible Worlds.” The Journal of Philosophy 109 (3): 221-46.
Fine, Kit. 2012b. “A Difficulty for the Possible Worlds Analysis of Counterfactuals.” Synthese 189 (1): 29-57.
Fine, Kit. 2012c. “Guide to Ground.” In Metaphysical Grounding: Understanding the Structure of Reality, edited by F. Correia and B. Schnieder. Cambridge: Cambridge University Press.
Fine, Kit. 2012d. “The Pure Logic of Ground.” The Review of Symbolic Logic 5 (1): 1-25.
Fine, Kit. 2013a. “Fundamental Truths and Fundamental Terms.” Philosophy and Phenomenological Research 87 (3): 725-32.
Fine, Kit. 2014a. “Permission and Possible Worlds.” dialectica 68 (3): 317-36.
Fine, Kit. 2014b. “Truth-Maker Semantics for Intuitionistic Logic.” Journal of Philosophical Logic 43: 549-77.
Fine, Kit. 2015a. “The Possibility of Vagueness.” Synthese 194 (10): 3699-725.
Fine, Kit. 2015b. “Unified Foundations for Essence and Ground.” Journal of the American Philosophical Association 1 (2): 296-311.
Fine, Kit. 2017c. “Truthmaker Semantics.” In A Companion to the Philosophy of Language, edited by B. Hale, C. Wright and A. Miller. West Sussex: Wiley-Blackwell.
Fine, Kit. 2018a. “Ignorance of Ignorance.” Synthese 195 (9): 4031-45.
Fine, Kit. 2019c. “Constructing the Impossible.” To appear in a collection of papers for Dorothy Edgington.
Fine, Kit. 2020a. “Semantics.” In The Routledge Handbook of Metaphysical Grounding, edited by M. J. Raven. New York: Routledge.
Fine, Kit, and Ben Fine. 1974d. “Social Choice and Individual Rankings I.” Review of Economic Studies 41: 303-22.
Fine, Kit, and Ben Fine. 1974e. “Social Choice and Individual Rankings II.” Review of Economic Studies 41: 459-75.
Fine, Kit, and Timothy McCarthy. 1984b. “Truth without Satisfaction.” Journal of Philosophical Logic 13 (4): 397-421.
Fine, Kit, and Gerhard Schurz. 1996. “Transfer Theorems for Multimodal Logics.” In Logic and Reality: Essays on the Legacy of Arthur Prior, edited by J. Copeland. Oxford: Clarendon.
Frege, Gottlob. 1892. “On Sense and Reference.” In Translations from the Philosophical Writings of Gottlob Frege, edited by P. T. Geach and M. Black. Oxford: Blackwell.
Hume, David. 1739. A Treatise of Human Nature, edited by L. A. Selby-Bigge and P. H. Nidditch. Oxford: Clarendon Press.
Kripke, Saul. 1972. Naming and Necessity. Cambridge, MA: Harvard University Press.
Kripke, Saul. 2011. “A Puzzle about Belief.” In Philosophical Troubles: Collected Papers, Volume I. Oxford: Oxford University Press.
Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell Publishers.
Lewis, David. 1991. Parts of Classes. Oxford: Blackwell.
Locke, John. 1689. An Essay Concerning Human Understanding.
McTaggart, J. M. E. 1908. “The Unreality of Time.” Mind 17: 457-74.
Prior, A. N. 1960. “The Autonomy of Ethics.” Australasian Journal of Philosophy 38 (3): 199-206.
Quine, Willard Van Orman. 1948. “On What There is.” Review of Metaphysics 2: 21-38. Reprinted in From a Logical Point of View, 2nd ed., Harvard: Harvard University Press, 1980, 1-19.
Raven, Michael J. 2019. “(Re)discovering Ground.” In Cambridge History of Philosophy, 1945 to 2015, edited by K. M. Becker and I. Thomson. Cambridge: Cambridge University Press.
Raven, Michael J., ed. 2020. The Routledge Handbook of Metaphysical Grounding. New York: Routledge.
Spinoza, Baruch. 1677. Ethics, Demonstrated in Geometrical Order.
Stalnaker, Robert. 1968. “A Theory of Conditionals.” In Studies in Logical Theory, edited by N. Rescher. Oxford: Blackwell.
Williamson, Timothy. 2013b. Modal Logic as Metaphysics. Oxford: Oxford University Press.
For Immanuel Kant (1724–1804), formal logic is one of three paradigms for the methodology of science, along with mathematics and modern-age physics. Formal logic owes this role to its stability and relatively finished state, which Kant claims it has possessed since Aristotle. Kant’s key contribution lies in his focus on the formal and systematic character of logic as a “strongly proven” (apodictic) doctrine. He insists that formal logic should abstract from all content of knowledge and deal only with our faculty of understanding (intellect, Verstand) and our forms of thought. Accordingly, Kant considers logic to be short and very general but, on the other hand, apodictically certain. In distinction to his contemporaries, Kant proposed excluding from formal logic all topics that do not properly belong to it (for example, psychological, anthropological, and metaphysical problems). At the same time, he distinguished the abstract certainty (that is, certainty “through concepts”) of logic (and philosophy in general) from the constructive evidence of mathematical knowledge. The idea of formal logic as a system led Kant to fundamental questions, including questions about the first principles of formal logic, redefinitions of logical forms with respect to those first principles, and the completeness of formal logic as a system. Through this approach, Kant raised some essential problems that later motivated the rise of modern logic. Kant’s remarks and arguments on a system of formal logic are spread throughout his works (including his lectures on logic). Nonetheless, he never published an integral, self-contained presentation of formal logic as a strongly proven doctrine. A lively dispute has thus developed among scholars about how to reconstruct his formal logic as an apodictic system, in particular concerning his justification of the completeness of his table of judgments.
One of Kant’s main results is his establishment of transcendental logic, a foundational part of philosophical logic that concerns the possibility of the strictly universal and necessary character of our knowledge of objects. Formal logic provides transcendental logic with a basis (“clue”) for establishing its fundamental concepts (categories), which can be obtained by reinterpreting the logical forms of judgment as the forms of intuitively given objects. Similarly, forms of inference provide a “clue” for transcendental ideas, which lead to higher-order and meta-logical perspectives. Transcendental logic is crucial to and forms the largest part of Kant’s foundations of metaphysics, as they are critically investigated and presented in his main work, the Critique of Pure Reason.
This article focuses on Kant’s formal logic in the systematic order of logical forms and outlines Kant’s approach to the foundations of formal logic. The main characteristics of Kant’s transcendental logic are presented, including his system of categories and transcendental ideas. Finally, a short overview is given of the subsequent role of Kant’s logical views.
Presentations of the history of logic published at the beginning of the 21st century seem to re-evaluate Kant’s role positively, especially with regard to his conceptual work that led to a new development of logic (see, for example, Tiles 2004). Although older histories of logic written from the standpoint of mathematical logic did appreciate Kant’s restitution of the formal side of logic, they ascribed to Kant a relatively unimportant role. They criticized him for what seemed to be his view of logic as principally not exceeding the traditional, Aristotelian boundaries (Kneale and Kneale 1991) and for his principled separation of logic and mathematics (Scholz 1959). Nevertheless, during the 20th century, some Kant scholars confirmed and extensively elaborated on his relevance to mathematical logic (for example, Wuchterl 1958, Schulthess 1981). Moreover, it is significant that several founders of modern logic (including Frege, Hilbert, Brouwer, and Gödel) explicitly referred to and built upon aspects of Kant’s philosophy.
According to Kant, formal logic appears to be an already finished science (accomplished by Aristotle), in which essentially no further development is possible (B VIII). In fact, some of Kant’s statements leave the impression that his views of formal logic may have been largely compiled from contemporary logic textbooks (B 96). Nonetheless, Kant mentions that the logic of his contemporaries was not free of insufficiencies (Prolegomena IV:323). He organized the existing material of formal logic in a specific way; he separated the extraneous (for instance, the psychological, anthropological, and metaphysical) material from formal logic proper. What is particularly important for Kant are his redefinitions of logical forms in terms of formal unity and consciousness. These redefinitions are indispensable for his main contributions: his systematic view of formal logic and the application of this view in transcendental logic.
It also became apparent, primarily due to K. Reich’s 1948 monograph, that Kant’s systematic view of formal logic assumed, as an essential component, a justification of the completeness of formal logic with respect to the forms of our thinking. This conforms with Kant’s critique of Aristotle for his unsystematic, “rhapsodical” approach in devising the list of categories, since Kant intended to repair this deficiency by setting up a system of categories specifically on the basis of formal logic.
Finally, the contemporary development of logic, which has far exceeded the shape of standard (“classical”) mathematical logic, has made it technically possible to explore some features of Kant’s logic that had largely escaped the earlier, “classically” based perception of his logic.
Although formal logic is the starting point of Kant’s philosophy, there is no separate text in which Kant systematically, in a strictly scientific way, presented formal logic as a doctrine. Essential parts of this doctrine, however, are contained in his published works, especially those on the foundations of metaphysics, in his handwritten lecture notes on logic (with the addition of Jäsche’s compilation), and in the transcripts of Kant’s lectures on logic. These lectures are based primarily on the textbook by G. F. Meier and, according to the custom of the time, include a large amount of material that does not strictly pertain to formal logic. Kant’s view was that it was harmful for beginners to receive instruction in a highly abstract form, in contrast to their concrete and intuitive way of thinking (compare II:305‒306). Nevertheless, many places in Kant’s texts and lectures are pertinent to or reflect the systematic aspect of logic. On this ground, it is possible to reconstruct and describe most of the crucial details of Kant’s doctrine of formal logic.
That Kant did not write a systematic presentation of formal logic can be attributed to his focus on metaphysics and the possibility of its foundations. Besides, he might have presumed that the systematic doctrine of formal logic could be recognized from the sections and remarks he had included about it in his written work, at least to the extent to which formal logic was necessary to understand his argument on the foundations of metaphysics. Furthermore, Kant thought that once the principles were determined, a formal analysis (as is required in logic) and a complete derivation of a system could be accomplished relatively easily with the additional help of existing textbooks (see B 27‒28, 108‒109, A XXI: “more entertainment than labor”).
We first present Kant’s doctrine of formal logic, that is, his theory of concepts, judgments, and inference, together with his general methodology. Then, we address the question of the foundations of logic and its systematic character. Finally, we outline Kant’s transcendental logic (that is, the logical foundations of metaphysics), especially in relation to formal logic, and give a brief overview of his historical influence.
2. The Concept of Formal Logic
What we here term “formal logic” Kant usually calls “general logic” (allgemeine Logik), in accordance with some of his contemporaries and predecessors (Jungius, Leibniz, Knutzen, Baumgarten). Kant only rarely uses the terms “formal logic” (B 170, also mentioned by Jungius) or “formal philosophy” (Groundwork of the Metaphysics of Morals IV:387), and he prefers to define “logic” in this general sense as a science of the “formal rules of thinking,” rather than merely a general doctrine of understanding (Verstand) (XVI refl. 1624; see B IX, 78, 79, 172). Let us note the distinction between Kant’s use of the term “formal philosophy” and its contemporary use (philosophy in which modern formalized methods are applied).
The following are the essential features of Kant’s formal logic (see B 76‒80):
(1) Formal logic is general inasmuch as it disregards the content of our thought and the differences between objects. It deals only with the form and general rules of thought, and it can serve only as a canon for judging the correctness of thought. In distinction, a special logic pertains to a special kind of object and is conjoined with some special science as its organon to extend the content of knowledge.
(2) Formal logic is pure, as it is not concerned with the empirical, psychological conditions under which we think and which influence our thought. These psychological conditions are dealt with in applied logic. In general, pure logic does not incorporate any empirical principles, and according to Kant, it is only in this way that it can be established as a science that proves its propositions with strong certainty.
Formal logic should abstract from the distinction of whether the content to which logical forms apply is pure or empirical. Therefore, formal logic is distinguished from transcendental logic, which is a special logic of pure (non-empirical) thinking and which deals with the origin of our cognitions that is independent of given objects. However, transcendental logic is, in a sense, also general, because it deals with the general content of our thought—that is, with the categories that determine all objects.
It is clear that Kant conceives logical forms, as forms of thought, in mentalistic, albeit not in psychological terms. For him, forms of thought are ways of establishing a unity of our consciousness with respect to a given variety of representations. In this context, consciousness comes into play quite abstractly as the most general instance of unity, since ultimately it is we ourselves, in our own consciousness, who are uniting and linking representations given to us. This abstract (non-empirical) unity is to be distinguished from a mere psychological association of representations, which is dispersed and dependent on changing subjective states, and thus cannot establish unity.
By using a mentalistic approach, Kant stresses the operational character of logic. For him, a logical form is a result of the abstract operations of our faculty of understanding (Verstand), and it is through these operations that a unity of our representations can be established. In connection with this, Kant defines function as “the unity of the action [Handlung] of ordering different representations under a common one” (B 93) and he considers logical forms to be based on functions. We see in more detail below how Kant applies his concept of function to logical forms. Further historical development and modifications of Kant’s notion of function can be traced in Frege’s notion of “concept” and Russell’s “propositional functions.”
3. Concept
According to Kant, the unity that a concept establishes from a variety of representations is a unity in a common mark (nota communis) of objects. The form of a concept as a common mark is universality, and its subject matter is objects. Three types of operations of understanding bring about a concept: comparison, reflection, and abstraction.
(1) Through comparison, as a preparatory operation, we become conscious of the identity and difference of objects, and come to an identical mark that is contained in representations of many things. This is a common mark of these things, which is a “partial concept” contained in their representations; other marks may also be contained in these representations, making the things different from one another.
(2) Through reflection, which is essential for concept formation, we become conscious of a common mark as belonging to and holding of many objects. This is a “ground of cognition” (Erkenntnisgrund) of objects, which universally holds of them. Universality (“universal validity”) is the form through which we conceive many objects in one and the same consciousness.
(3) Through abstraction, we leave out (“abstract from”) the differences between objects and retain only their common mark in our consciousness.
Kant characterizes the sort of unity that is established by a concept in the following, foundational way. Each concept, as a common mark that is found in many representations, has an analytic unity (identity) of consciousness “on itself.” At the same time, the concept is presupposed to belong to these, possibly composed, representations, where it is combined (synthesized) with the other component marks. That is, each concept presupposes a synthetic unity of consciousness (B 134 footnote).
On the ground of this functional theory of concepts, Kant explains the distinction between the content (intension) and the extension (sphere) of a concept. This distinction stems from the so-called Port-Royal logic (by A. Arnauld and P. Nicole) of the 17th century and has since become standard in so-called traditional logic (that is, in logic before or independent of its transformation starting with Boole and Frege’s methodology of formalization). According to Kant, concept A has a content in the sense that A is a “partial concept” contained in the representation of an object; concept A has extension (sphere) in the sense that A universally holds of many objects that are contained under A (Jäsche Logic §7 IX:95, XVI refl. 2902, Reich 1948 p. 38). The content of A can be complex, that is, it can contain many marks in itself. The content and extension of a concept A stand in an inversely proportional relationship: the more concept A contains under itself, the less A contains in itself, and vice versa.
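The inverse relationship admits of a simple set-theoretic illustration. In the sketch below (our model, not Kant's), a concept's content is a set of marks, and its extension is the set of objects bearing every mark in that content; the particular objects and marks are invented for the example.

```python
# Each object is listed with the marks it bears.
objects = {
    "rose":  {"plant", "flowering", "thorny"},
    "oak":   {"plant", "tree"},
    "tulip": {"plant", "flowering"},
}

def extension(content):
    """The objects that bear every mark in the concept's content."""
    return {o for o, marks in objects.items() if content <= marks}

print(extension({"plant"}))               # all three objects
print(extension({"plant", "flowering"}))  # only 'rose' and 'tulip'
```

Enriching the content from {plant} to {plant, flowering} shrinks the extension, and conversely: the more a concept contains in itself, the less it contains under itself.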
A traditional doctrine (mainly originating from Aristotle) of the relationship between concepts can also be built on the basis of Kant’s theory of concepts. A concept B can be contained under A if A is contained in B, that is, as Kant says, if A is a mark (a “ground of cognition”) of B. In this case, Kant calls A a higher concept with respect to B, and B a lower concept with respect to A. Kant also says that A is a “mark of a mark” B (a distant mark). Obviously, A is not meant as a second-order mark but rather as a mark of the same order as B. Also, A is a genus of B, while B is a species of A. Through abstraction, we ascend to higher and higher concepts; through determination, we descend to lower and lower concepts. The relationship between higher and lower concepts is subordination, and the relationship between lower concepts among themselves without mutual subordination is coordination. According to Kant, there is no lowest species, because we can always add a new mark to a given concept and thus make it more specific. Finally, with respect to extension, a higher concept is wider, and a lower concept is narrower. Concepts with the same extension are called reciprocal.
4. Judgment
Judgment is for Kant the way to bring given representations to the objective unity of self-consciousness (see B 141, XVI refl. 3045). Because of this unifying of a manifold (of representations) in one consciousness, Kant conceives of judgment as a rule (Prolegomena §23 IV:305, see Jäsche Logic §60 IX:121). For example, the objective unity is the meaning of the copula “are” in the judgment “All bodies are heavy”; what is meant is not our subjective feeling of heaviness, but rather the objective state of affairs that bodies are heavy (see B 142), which is representable by a thinking agent (“I”) irrespective of the agent’s changeable psychological states.
As Kant points out, there is no other logical use of concepts except in judgments (B 93), where a concept, as a predicate, is related to objects by means of another representation, a subject. No concept is related to objects directly (as intuition is). In a judgment, a concept becomes an assertion (predicate) that is related to objects under some condition (subject) by means of which objects are represented. A logical unity of representations is thus established in the following way: many objects that are represented by means of some condition A are subsumed under some general assertion B, under which other conditions A′, A″, and so on may also be subsumed. The unity of a judgment is objective, since it is conditioned by a representation (a subject concept or a judgment) that is objective or related to objects. The objective unity in a judgment is generalized by Kant so as to hold not merely between concepts (subject and predicate), but also between judgments themselves (as parts of a hypothetical or a disjunctive judgment).
According to Kant, the aspects and types of the unity of representations in a judgment can be exhaustively and systematically described and brought under the four main “titles”: quantity, quality, relation, and modality. This is a famous division of judgments that became standard in traditional logic after Kant.
a. Quantity and Quality
The assertion of a judgment can be related to its condition of objectivity without any exception or with a possible exception. In the first case, the judgment is universal (for example, “All A are B”), and in the second case, it is particular (for example, “Some A are B”).
With respect to a given condition of objectivity, an assertion is combined or not combined with it. In the first case, the judgment is affirmative (for example, “Some A are B”), while in the second case, it is negative (for example, “Some A are not B”).
Taken together, quantity and quality yield the four traditionally recognized (Aristotelian) types of judgment: universal affirmative (“All A are B,” AaB), universal negative (“No A is B,” AeB), particular affirmative (“Some A are B,” AiB), and particular negative (“Some A are not B,” AoB).
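On the now-standard quantificational reading (a modern gloss, not Kant's notation, and one that brackets the question of existential import, which resurfaces in the discussion of immediate inference below), the four forms come out as:

```latex
\mathrm{AaB}:\ \forall x (Ax \to Bx) \qquad \mathrm{AeB}:\ \forall x (Ax \to \neg Bx)
\mathrm{AiB}:\ \exists x (Ax \wedge Bx) \qquad \mathrm{AoB}:\ \exists x (Ax \wedge \neg Bx)
```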
b. Relation
In a judgment, an assertion is brought under some condition of objective validity. There are three possible relations of the condition of objective validity to the assertion—subject–predicate, antecedent–consequent, and whole–members—each one represented by an appropriate exponent (“copula” in a wider sense).
(1) In a categorical judgment, a concept (B) as a predicate is brought under the condition of another concept (A) that is a subject that represents objects. Predicate B is an assertion that obtains its objective validity by means of the subject A as the condition:
x, which is contained under A, is also under B (XVI refl. 3096, Jäsche Logic §29 IX:108, symbols modified).
The relation of a categorical judgment is represented by the copula “is.” A categorical judgment stands under the principle of contradiction, which is formulated by Kant in the following way:
No predicate contradictory of a thing can belong to it (B 190).
Hence, there is no violation of the principle of contradiction in stating “A is B and non-B,” so long as neither B nor non-B contradicts A. However, “and” is not a logical operator for Kant, since it can be relativized by time: “A is B” and “A is non-B” can both be true, but at different moments in time (B 192). (Thus, Kant’s logic of categorical judgments can be considered “paraconsistent,” in the sense that p and not-p, while not violating the law of contradiction, do not entail an arbitrary judgment.)
(2) In a hypothetical judgment, some judgment (say, categorical), q, is an assertion that obtains its objective validity under the condition of another judgment, p: q is called a consequent, p its antecedent (ground), while their relation is what Kant calls (in accordance with other logics of the time) consequence. The exponent of the hypothetical judgment is “if . . . then . . .,” but it need not correspond to the main operator of a judgment in the sense of the syntax in modern logic. This means that a hypothetical judgment is not simply a conditional, since, for instance, it should also include universally quantified propositions like “If the soul is not composite, then it is not perishable,” which could be naturally formalized as ∀x ((Sx ∧ ¬Cx) → ¬Px) (compare Dohna-Wundlacken Logic XX-II:763; see examples in LV-I:203, LV-II:472). Let us note that “If something is a human, then it is mortal” is for Kant a hypothetical judgment, in distinction to the categorical judgment “All humans are mortal” (Vienna Logic XX-II:934, Hechsel Logic LV-II:31).
A hypothetical judgment stands under the principle of sufficient reason:
Each assertion has its reason.
Not having a reason contradicts the concept of assertion. By this principle (to be distinguished from Leibniz‘s ontological principle of the same name), q and not-q are excluded as consequents of the same antecedent: they cannot be grounded on one and the same reason. As can be seen, only now do we come to a version of the Aristotelian principle of contradiction, according to which no predicate can “simultaneously” belong and not belong to the same subject. On the other hand, we have no guarantee that there will always be an antecedent sufficient to decide between some p and not-p as its possible consequents. (In this sense, it could be said that Kant’s logic of assertions is “paracomplete.”)
(3) In a disjunctive judgment, the component judgments are parts of some whole (the disjunctive judgment itself) as their condition of objective validity. That is, the objectively valid assertion is one of the mutually exclusive but complementary parts of the whole, for example:
x, which is contained under A, is contained either under B or C, etc. (XVI refl. 3096, Jäsche Logic §29 IX:108).
The exponent of the disjunctive relation is “either . . . or . . .” in the exclusive sense, and, again, it should not be identified with the main operator in the modern sense. To see this, let us take Kant’s example of a disjunctive judgment, “A learned man is learned either historically or rationally,” which would, in a modern formalization, give a universally quantified sentence ∀x (Lx → (Hx ∨ Rx)) (Jäsche Logic §29 IX:107).
In a disjunctive judgment, under the condition of an objective whole, some of its parts hold with the exclusion of the rest of the parts. A disjunctive judgment stands under the principle of excluded middle between p and not-p, since it is a contradiction to assert (or to deny) both p and not-p.
Remark. With respect to relation, a judgment is gradually made more and more determinate: from allowing mutually contradictory predicates, to excluding such contradictions on some ground but allowing undecidedness among them, to positing exactly one of the contradictory predicates by excluding the others. Through the three relations in a judgment, we step by step upgrade the conditions of a judgment, improve its unity, and strengthen logical laws, starting from paraconsistency and paracompleteness to finally come to a sort of classical logic.
In general, we can see that relation is what the objective unity of consciousness in a judgment basically consists in: it is a unifying function that (in three ways) relates a manifold of given representations to some condition of their objectivity. Since judgment is generally defined as a manner of bringing our representations to the objective unity of consciousness, the relation of a judgment makes the essential aspect of a judgment.
c. Modality
This is one of the most distinctive parts of Kant’s logic, revealing its purely intensional character. One and the same judgment structure (the quantity, quality, and relation of a judgment) can be thought with varying and increasing strength: as possible, as true, or as necessary. Correspondingly, Kant distinguishes
(1) problematic,
(2) assertoric, and
(3) apodictic
judgments (assertoric judgment is called “proposition,” Satz). For example, the antecedent p of a hypothetical judgment is thought merely as problematic (“if p”); secondly, p can also occur outside a hypothetical judgment as, for some reason, an already accepted judgment, that is, as assertoric; finally, p can occur as necessarily accepted on the ground of logical laws, thus apodictic.
These modes of judgment pertain just to how a judgment is thought, that is, to the way the judgment is accepted by understanding (Verstand). Kant says that (1) problematic modality is a “free choice,” an “arbitrary assumption,” of a judgment; (2) assertoric modality (in a proposition) is the acceptance of a judgment as true (logical actuality); while (3) apodictic modality consists in the “inseparable” connection with understanding (see B 101).
There is no special exponent (or operator) of modality; modality is just the “value,” the “energy,” with which the existing exponent of a relation in a judgment is thought. Modality is in an essential sense distinguished from quantity, quality, and relation, which, in distinction, constitute the logical content of a judgment (see B 99‒100; XVI refl. 3084).
Despite a very specific nature of modality, it is in a significant way—through logical laws—correlated with the relation of a judgment:
(1) logical possibility of a problematic judgment is judged with respect to the principle of contradiction—no judgment that violates this principle is logically possible;
(2) logical actuality (truth) of an assertoric judgment is judged with respect to the grounding of the judgment on some sufficient reason;
(3) logical necessity of an apodictic judgment is judged with respect to the decidability of the judgment on the ground of the principle of excluded middle
(see Kant’s letter to Reinhold from May 19, 1789 XI:45; Reich 1948 pp. 73‒76).
The interconnection of relation and modality is additionally emphasized by the fact that Kant sometimes united these two aspects under the title of queity (quaeitas) (XVI refl. 3084, Reich 1948 pp. 60‒61).
d. Systematic Overview
Kant gives an overview of his formal logical doctrine of judgments by means of the following table of judgments:

Quantity: universal, particular
Quality: affirmative, negative
Relation: categorical, hypothetical, disjunctive
Modality: problematic, assertoric, apodictic
In his transcendental logic, Kant adds singular and infinite judgments as special judgment types. In formal logic (as was usual in logic textbooks of Kant’s time), they are subsumed under universal and affirmative judgments, respectively (see B 96‒97). A characteristic departure from the custom of 17th- and 18th-century logic textbooks is Kant’s (generalized) aspect of relation, which is not reducible to the subject–predicate relation, and directly comprises categorical, hypothetical, and disjunctive judgments—bypassing, for example, subdivision into simple and compound judgments. Another divergence from the custom of the time is Kant’s understanding of modality as independent of explicit modal expressions (“necessarily,” “contingently,” “possibly,” “impossibly”). Instead, Kant understands modality as an intrinsic moment of each judgment (for example, the antecedent and the consequent of a hypothetical judgment are as such problematic, and the consequence between them is assertoric), in distinction to the customary division into “pure” and “modal” propositions. The result of this was a more austere system of judgments that is reduced to strictly formal criteria in Kant’s sense and avoids the admixture of psychological, metaphysical, or anthropological aspects (B VIII).
Kant’s table of judgments has a systematic value within his formal logic. The fact that Kant uses the tabular method to give an overview of the doctrine of judgments shows, according to his methodological view on the tabular method (Section 6), that he is only summarizing a systematic whole of knowledge. Formal logic, as a system, is a “demonstrated doctrine” (Section 6), where everything “must be certain completely a priori” (B 78, compare many other places like B IX; A 14; Prolegomena IV:306; Groundwork of the Metaphysics of Morals IV:387; XVI refl. 1579 p. 21, 1587, 1620 p. 41, 1627, 1628; Preisschrift XX:271). Kant’s text supports the view that his formal logic should include a systematic, a priori justification of his table of judgments, despite dispute among scholars about how this justification can be reconstructed (see Section 7).
5. Inference
In an inference, a judgment is represented as “unfailingly” (that is, a priori, necessarily) connected with (and “derived” from) another judgment that is its ground (see B 360).
Kant distinguishes two ways we can derive a judgment (conclusion) from its ground:
(a) by the formal analysis of a given judgment (ground, premise), without the aid of any additional judgment—such an inference, which is traditionally known as immediate consequence, Kant calls an inference of understanding (Verstandesschluß, B 360);
(b) by the subsumption under some already accepted judgment (major premise) with the aid of some mediate judgment (additional, minor premise)—this is an inference of reason (Vernunftschluß), that is, a syllogism (B 360, compare, for example, XVI refl. 3195, 3196, 3198, 3201).
Kant distinguishes between “understanding” (Verstand) and “reason” (Vernunft) in the following way: understanding is the faculty of the unity of representations (“appearances”) by means of rules, while “reason” is the faculty of the unity of rules by means of principles (see B 359, 356, 361). Obviously, inference of understanding essentially remains at the unity already established by means of a given judgment (rule), whereas inference of reason starts from a higher unity (principle) under which many judgments can be subsumed.
Additionally, we can infer a conclusion by means of a presumption on the ground of already accepted judgments. This inference Kant names inference of the power of judgment (Schluß der Urteilskraft), but he does not consider it to belong to formal logic in a proper sense, since its conclusion, because of possible exceptions, does not follow with necessity.
a. Inference of Understanding (Immediate Consequence)
This part of Kant’s logical theory includes a variant of the traditional (Aristotelian) doctrine of immediate consequence, but as grounded in Kant’s previously presented theory of judgment. According to Kant, in an inference of understanding, we merely analyze a given judgment with respect to its logical form. Thus, Kant divides inference of understanding in accordance with his division of judgments:
(a) with respect to the quantity of a judgment, an inference is possible by subalternation: from a universal judgment to its corresponding particular judgment of the same quality (AaB / AiB, AeB / AoB);
(b) with respect to the quality of a judgment, an inference is possible according to the square of opposition (which usually includes subalternation): of the contradictories (AaB and AoB, AeB and AiB), one is true and the other false; of the contraries (AaB and AeB), at least one is false; of the subcontraries (AiB and AoB), at least one is true;
(c) with respect to the relation of a judgment, there is an inference by conversion (simple or changed): if B is (not) predicated of A, then A is (not) predicated of B (AaB / BiA, AeB / BeA, AiB / BiA);
(d) with respect to modality, an inference is possible by contraposition (for example AaB / non-BeA); Kant assigns contraposition to modality because the contraposition changes the logical actuality of the premise (proposition) to the necessity of the conclusion; that is, granted the premise, the conclusion expresses the exclusion (opposite) of self-contradiction (XVI refl. 3170, Hechsel Logic LV-II:448): granted AaB, non-B contradicts A (also, granted AeB or AoB, universal exclusion of non-B contradicts A, that is, non-BiA follows).
These inferences are valid on the ground of Kant’s assumption of the non-contradictory subject concept. Otherwise, if the subject concept is self-contradictory (nothing can be thought by it), then both contradictories would be false. For example, “A square circle is round” and “A square circle is not round” are both false due to the principle of contradiction (Prolegomena §52b IV:341, B 821: “both what one asserts affirmatively and what one asserts negatively of the object [of an impossible concept] are incorrect”; see B 819, 820‒821).
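In modern terms, these immediate inferences can be pictured with a simple set semantics for the four categorical forms. The following sketch is illustrative only (Kant works with concepts, not sets, and the helper names are invented); it also shows why the proviso of a non-contradictory, hence instantiable, subject concept matters:

```python
# Sketch only: the four categorical forms read with modern set semantics.
def a(A, B): return A <= B                 # AaB: all A are B
def e(A, B): return not (A & B)            # AeB: no A is B
def i(A, B): return bool(A & B)            # AiB: some A is B
def o(A, B): return bool(A - B)            # AoB: some A is not B

U = {1, 2, 3, 4}                           # a sample universe
A, B = {1, 2}, {1, 2, 3}

assert a(A, B) and i(A, B)                 # subalternation: AaB / AiB
assert a(A, B) and i(B, A)                 # changed conversion: AaB / BiA
assert e(A, {4}) and e({4}, A)             # simple conversion: AeB / BeA
assert a(A, B) and e(U - B, A)             # contraposition: AaB / non-BeA

# The proviso: for an empty ("self-contradictory") subject concept,
# AaB comes out vacuously true while AiB is false, so subalternation
# fails on this reading; Kant excludes such subject concepts.
assert a(set(), B) and not i(set(), B)
```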
b. Inference of Reason (Syllogism)
Kant considers inference of reason within a variant of the traditional theory of syllogisms, which includes categorical syllogism (substantially reduced to the first syllogistic figure), hypothetical syllogism, and disjunctive syllogism, all shaped and modified in accordance with his theory of judgments and his conception of logic in general.
Each syllogism starts from a judgment that has the role of the major premise. In Kant’s view, the major premise is a general rule under the condition of which (for example, of its subject concept) a minor premise is subsumed. Accordingly, the condition of the minor premise itself (for example, its subject concept) is subsumed in the conclusion under the assertion of the major premise (for example, its predicate) (B 359‒361, B 386‒387). The major premise becomes in a syllogism a (comparative) principle from which other judgments can be derived as conclusions (see B 357, 358). Since there are three species of judgments with respect to relation, Kant distinguishes three species of syllogisms according to the relation of the major premise (B 361, XVI refl. 3199):
(a) Categorical syllogism. Kant starts from a standard doctrine of first syllogistic figure, where the major concept (predicate of the major premise) is put in relation to the minor concept (subject of the minor premise) by means of the middle concept (the subject of the major and the predicate of the minor premise): MaP, SaM / SaP; MeP, SaM / SeP; MaP, SiM / SiP; MeP, SiM / SoP. Kant insists that only the first figure of the categorical syllogism is an inference of reason, whereas in other figures there is a hidden immediate inference (sometimes reductio ad absurdum is needed) by means of which a syllogism can be transformed into the first figure (B 142 footnote, XVI refl. 3256; see The False Subtlety of the Four Syllogistic Figures in II).
(b) Hypothetical syllogism. The major premise is a hypothetical judgment, in which the antecedent and the consequent are problematic. Subsumption is accomplished by means of the change of the modality of the antecedent (or of the negation of the consequent) to an assertoric judgment (minor premise), from where in the conclusion the assertoric modality of the consequent (or of the negation of the antecedent) follows. The inference from the affirmation of the antecedent to the affirmation of the consequent is modus ponens, and the inference from the negation of the consequent to the negation of the antecedent is modus tollens of the hypothetical syllogism.
(c) Disjunctive syllogism. The major premise is a disjunctive judgment, where the disjuncts are problematic. Subsumption is carried out by the change of the problematic modality of some disjuncts (or their negations) to assertoric modality, from where in the conclusion the assertoric modality of the negation of other disjuncts (the assertoric modality of other disjuncts) follows. The inference from the affirmation of one part of the disjunction to the negation of the other part is modus ponendo tollens, and the inference from the negation of one part of the disjunction to the affirmation of the other part is modus tollendo ponens of the disjunctive syllogism.
In hypothetical and disjunctive syllogisms, there is no middle term (concept). As explained, the subsumption under the rule of the major premise is carried out just by means of the change of the modality of one part (or of its negation) of the major premise (see XVI refl. 3199).
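In modern notation, the three species can be illustrated together. The sketch below is only a rough gloss (set semantics for the categorical moods, truth-functional readings for the hypothetical and disjunctive moods, with the disjunction taken exclusively, since Kant’s disjuncts mutually exclude one another); it does not capture Kant’s problematic-to-assertoric change of modality:

```python
# Sketch only; Kant works with concepts and modality-changes, not sets
# and truth tables. Helper names are invented for illustration.
from itertools import combinations, product

def a(X, Y): return X <= Y            # all X are Y
def e(X, Y): return not (X & Y)       # no X is Y
def i(X, Y): return bool(X & Y)       # some X is Y
def o(X, Y): return bool(X - Y)       # some X is not Y

# (a) Categorical syllogism: the four first-figure moods, checked
# exhaustively over all non-empty terms of a three-element universe.
U = [0, 1, 2]
terms = [frozenset(c) for r in range(1, len(U) + 1)
         for c in combinations(U, r)]
for M, P, S in product(terms, repeat=3):
    if a(M, P) and a(S, M): assert a(S, P)      # MaP, SaM / SaP
    if e(M, P) and a(S, M): assert e(S, P)      # MeP, SaM / SeP
    if a(M, P) and i(S, M): assert i(S, P)      # MaP, SiM / SiP
    if e(M, P) and i(S, M): assert o(S, P)      # MeP, SiM / SoP

# (b), (c) Hypothetical and disjunctive moods, truth-functionally.
for p, q in product([True, False], repeat=2):
    implies = (not p) or q                      # hypothetical judgment
    excl_or = p != q                            # two-membered disjunction
    if implies and p:     assert q              # modus ponens
    if implies and not q: assert not p          # modus tollens
    if excl_or and p:     assert not q          # modus ponendo tollens
    if excl_or and not p: assert q              # modus tollendo ponens
```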
In Kant’s texts, we can find short indications on how a theory of polysyllogisms should be built (for example, B 364, B 387‒389). Inference can be continued on the side of conditions by means of a prosyllogism, whose conclusion is a premise of a given syllogism (an ascending series of syllogisms), as well as on the side of what is conditioned by means of an episyllogism, whose premise is the conclusion of a given syllogism (a descending series of syllogisms). In order to derive, by syllogisms, a given judgment (conclusion), the ascending totality of its conditions should be assumed (either with some first unconditioned condition or as an unlimited but unconditioned series of all conditions) (B 364). In distinction, a descending series from a given conclusion could be only a potential one, since the acceptance of the conclusion, as given, is already granted by the ascending totality of conditions (B 388‒389). By requiring a given, completed ascending series of syllogisms, we advance towards the highest, unconditioned principles (see B 358). In this way, the logical unity of our representations increases towards a maximum: our reason aims at bringing the greatest manifold of representations under the smallest number of principles and to the highest unity (B 361).
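A minimal illustration of such chaining (invented terms, the same set semantics as in the sketches above): the major premise of a syllogism is itself obtained as the conclusion of a prosyllogism.

```python
# Toy sketch: an ascending/descending series of Barbara syllogisms.
def a(X, Y): return X <= Y                      # all X are Y

S, M1, M2, P = {1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}
assert a(M2, P) and a(M1, M2) and a(M1, P)      # prosyllogism: M2aP, M1aM2 / M1aP
assert a(M1, P) and a(S, M1) and a(S, P)        # episyllogism: M1aP, SaM1 / SaP
```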
c. Inference of the Power of Judgment (Induction and Analogy)
The inference of the power of judgment is only a presumption (“empirical inference”), and its conclusion a preliminary judgment. On the ground of the accordance in many special cases that stand under some common condition, we presume some general rule that holds under this common condition. Kant distinguishes two species of such an inference: induction and analogy. Roughly,
(a) by induction, we conclude from A in many things of some genus B, to A in all things of genus B: from a part of the extension of B to the whole extension of B;
(b) by analogy, we conclude from many properties that a thing x has in common with a thing y, to the possession by x of all properties of y that have their ground in C as a genus of x and y (C is called tertium comparationis): from a part of a concept C to the whole concept C
(see XVI refl. 3282‒3285).
What justifies such reasoning is the principle of our power of judgment, which requires that many cases of accordance should have some common ground (by means of belonging to the extension of the same concept or by having the marks of the same concept). However, since we do not derive this common ground with logical necessity, no objective unity is established, but only presumed, as a result of our subjective way of reflecting.
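Because the conclusion is only presumed, later cases can defeat it. A toy illustration of induction as a preliminary, defeasible judgment (the data are invented):

```python
# Toy sketch: induction presumes a rule from observed cases of a genus;
# the resulting judgment is preliminary and can be defeated.
cases = ["white", "white", "white"]             # A observed in many things of genus B
rule = all(c == "white" for c in cases)         # presumed: A holds of all of genus B
assert rule                                     # accepted only as a presumption
cases.append("black")                           # a further case of the genus
assert not all(c == "white" for c in cases)     # the presumed rule is defeated
```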
d. Fallacious Inference
For Kant, fallacious inferences should be explained by illusion (Schein, B 353): an inference may seem to be correct if judged on the ground of its appearance (species, Pölitz Logic XXIV-II:595, Warsaw Logic LV-II:649), although the real form of this proposed inference may be incorrect (just an “imitation” of a correct form, B 353, 354). Through such illusions, logic illegitimately becomes an organon to extend our knowledge outside the limits of the canon of logical forms. Kant calls dialectic the part of logic that deals with the discovery and resolution of logical illusions in fallacious inferences (for example, B 390, 354), in distinction to the mere analytic of the forms of thought. Formal logic gives only negative criteria of truth (truth has to be in accordance with logical laws and forms), but cannot give any general material criterion of truth, because material truth depends on the specific knowledge about objects (B 83‒84). Formal logic, which is in itself a doctrine, becomes in its dialectical part the critique of fallacies and of logical illusion. In his logic lectures and texts, Kant addresses some traditionally well-known fallacies (for example, sophisma figurae dictionis, a dicto secundum quid ad dictum simpliciter, sophisma heterozeteseos, ignoratio elenchi, the Liar). Below, in connection with Kant’s transcendental logic, we mention some of his own characteristic, systematically important examples of fallacies.
6. General Methodology
Since, according to Kant, formal logic abstracts from the differences of objects and hence cannot focus on the concrete content of a particular science, it can only give a short and very general outline of the form of a science, as the most comprehensive logical form. This outline is a mere general doctrine on the formal features of a method and on the systematic way of thinking. On the other hand, many interesting distinctions can be found in Kant’s reflections on general methodology that cast light on Kant’s approach to logic, philosophy, and mathematics.
Building on his concept of the faculty of reason, Kant defines method in general as the unity of a whole of knowledge according to principles (or as “a procedure in accordance with principles,” B 883). By means of a method, knowledge obtains the form of a system and transforms into a science. Non-methodical thinking (thinking without any order), which Kant calls “tumultuar,” serves, in combination with a method, the variety of knowledge, whereas method itself serves its unity. In a wider sense, Kant speaks of a fragmentary (rhapsodical) method, which consists only in a subjective and psychological connection of thinking: it establishes not a system but only an aggregate of knowledge, not a science but merely ordinary knowledge.
In further detail, Kant’s general methodology includes the doctrine of definition, division, and proof—mainly a redefined, traditionally known material, with Kant’s own systematic form.
Let us first say that for Kant a concept is clear if we are conscious of its difference from other concepts. Also, a concept is distinct if its marks are clearly known. Now, definition is, according to Kant, a clear, distinct, complete, and precise (“reduced to a minimal number of marks”) presentation of a concept. Since all these requirements for a definition can be strictly fulfilled only in mathematics, Kant distinguishes various forms of clarification that only partially fulfill the above-mentioned requirements, such as exposition, which is clear and distinct but need not be precise and complete (see XVI refl. 2921, 2925, 2951; B 755‒758). Division is the representation of a manifold under some concept and as interrelated, by means of mutual opposition, within the whole sphere of the concept (see XVI refl. 3025).
Proof provides certainty to a judgment by making distinct the connection of the judgment with its grounds (see XVI refl. 2719). Proofs can be distinguished with respect to the grade of certainty they provide. (1) A proof can be apodictic (strong), in a twofold way: as a demonstration (proof by means of the construction in an intuition, in concreto, as in mathematics) or as a discursive proof (by means of concepts, in abstracto, as in philosophy). In addition, a strong proof can be direct (ostensive), by means of the derivation of a judgment from its ground, or indirect (apagogical), by means of showing the untenability of a consequent of the judgment’s contradictory. In his philosophy, Kant focuses on the examples where indirect proofs are not applicable due to the possibility of dialectical illusion (contraries and subcontraries that only subjectively and deceptively appear to be contradictories, which is impossible in mathematics, B 819‒821). (2) Sometimes the grounds of proof give only incomplete certainty, for instance, empirical certainty (as in induction and analogy), probability, possibility (hypothesis), or merely apparent certainty (fallacious proof) (see Critique of Judgment §90 V:463).
Furthermore, Kant distinguishes the syllogistic and tabular methods. The syllogistic method derives knowledge by means of syllogisms. An already established systematic whole of knowledge is presented in its whole articulation (branching) by the tabular method (as is the case, for example, with Kant’s tables of judgments and categories; see, for example, Pölitz Logic XXIV-II:599, Dohna-Wundlacken Logic XXIV-II:80, Hechsel Logic LV-II:494). In addition, the division of the syllogistic method into the synthetic (progressive) and analytic (regressive) is important. The former proceeds from the principles to what is derived, from elements (the simple) to the composed, from reasons to what follows from them, whereas the latter proceeds the other way around, from what is given to its reasons, elements, and principles. (For the application of these two syllogistic methods in metaphysics, see, for instance, B 395 footnote.)
Finally, Kant comes to the following three general methodological principles (B 685‒688):
(1) the principle of “homogeneity of the manifold under higher genera”;
(2) the principle of specification, that is, of the “variety of the homogeneous under lower species”;
(3) the principle of continuity of the transition to higher genera and to lower species.
These principles correspond to the three interests of the faculty of reason: the interests of unity, manifold, and affinity. Again, all three principles are just three sides of one and the same, most general, principle of the systematic (“thoroughgoing”) unity of our knowledge (B 694).
The end result of the application of methodology in our knowledge is a “demonstrated doctrine,” which derives knowledge by means of apodictic proofs. It is accompanied by a corresponding discipline, which, by means of critique, prevents and corrects logical illusion and fallacies.
7. The Foundations of Logic
As stated by Kant, formal logic itself should be founded and built according to strict criteria, as a demonstrated doctrine. It should be a “strongly proven,” “exhaustively presented” system (B IX), with the “a priori insight” into the formal rules of thinking “through mere analysis of the actions of reason into their moments” (B 170). Since in formal logic “the understanding [Verstand] has to do with nothing further than itself and its own form” (B IX), formal logic should be grounded in the condition of the possibility of the understanding in the formal sense, and this condition is technically (operationally) defined by Kant as the unity of pure (original) self-consciousness (apperception) (B 131, compare XVI refl. 1579 p. 21: logical rules should be “proven from the reason [Vernunft]”). This unity is the fundamental, qualitative unity of the act of thinking (“I think”) as opposed to a given manifold (variety) of representations. The operational “one-many” opposition, as well as the further analysis of its general features and structure, should be appropriate as a foundational starting point from which a system of logic could be strongly derived. The basic step of the analysis of this fundamental unity is Kant’s distinction between the analytic and synthetic unity of self-consciousness (see, for example, B §§15‒19): at first, the act of thinking (“I think”) appears simply to accompany all our representations. It is the identity of my consciousness in all my representations, termed by Kant the analytic unity of self-consciousness. But this identity of consciousness would not be possible for me (as a thinking subject) if I did not conjoin (synthesize) one representation with another and were not conscious of this synthesis. Thus, the analytic unity of self-consciousness is possible only under the condition of the synthetic unity of self-consciousness (B 133). Kant further shows that the synthetic unity is objective, because it devises a concept of an object with respect to which we synthesize representations into a unity. This unity is necessary and universally valid, that is, independent of any changeable, psychological state.
In Kant’s words: “the synthetic unity of apperception is the highest point to which one must affix all use of the understanding, even the whole logic and, after it, transcendental philosophy; indeed this faculty is the understanding itself” (B 134 footnote; see A 117 footnote and Opus postumum XXII:77). (For a formalization of Kant’s theory of apperception according to the first edition of the Critique of Pure Reason, see Achourioti and van Lambalgen 2011.)
Kant himself did not write a systematic presentation of formal logic, and the form and interpretation of Kant’s intended logical system are disputed among Kant scholars. Nevertheless, it is evident that each logical form is conceived by Kant as a type of unity of given representations, that this unity is an act of thinking and consciousness, and that each logical form is therefore essentially related to the “original” unity of self-consciousness. Some scholars, starting from the concept of the original unity of self-consciousness—that is, from the concept of understanding (as confronted with a given “manifold” of our representations)—proposed various lines of a reconstruction of Kant’s assumed completeness proof of his logical forms (or supplied such a proof on their own), in particular, of his table of judgments (see a classical work by Reich 1948, and, for example, Wolff 1995, Achourioti and van Lambalgen 2011, Kovač 2014). There are authors who offer arguments that the number and the species of the functions of our understanding are for Kant primitive facts, and can be at most indicated (Indizienbeweis) on the ground of the “functional unity” of a judgment (Brandt 1991; see a justification of Kant’s table of judgments in Krüger 1968).
8. Transcendental Logic (Philosophical Logic)
Besides formal logic, Kant considers a branch of philosophical logic that deals with the foundations of ontology and the rest of metaphysics and shows how objects are constituted in our knowledge by means of logical categorization. This branch of logic Kant names “transcendental logic.”
a. A Priori–A Posteriori; Analytic–Synthetic
Kant’s transcendental logic is based on two important distinctions, which exerted great influence in the ensuing history of logic and philosophy: the distinction between a priori and a posteriori knowledge, and the distinction between synthetic and analytic judgments (see B 1‒3).
Knowledge is a priori if it is possible independently of any experience. For instance, “Every change has its cause.” As the example shows, knowledge can be a priori yet concern an empirical concept, like “change,” since, given a change, we know independently of any experience that it must have a cause. A priori knowledge is pure if it has no empirical content, like, for example, mathematical propositions.
Knowledge is a posteriori (empirical) if it is possible only by means of experience. An example is “All bodies are heavy,” since we cannot know without experience (just from the concept of body) whether a body is heavy.
Kant gives two certain, mutually inseparable marks of a priori knowledge: (1) it is necessary and derived (if at all) only from necessary judgments; (2) it is strictly universal, with no exceptions possible. In distinction, a posteriori knowledge (1) permits that the state of affairs that is thought of can also be otherwise, and (2) it can possess at most assumed and comparative universality, with respect to the already perceived cases (as in induction) (B 3‒4).
Analytic and synthetic judgments are distinguished with respect to their content: a judgment is analytic if it adds nothing to the content of the knowledge given by the condition of the judgment; otherwise, it is synthetic.
That is, analytic judgments are merely explicative with respect to the content given by the condition of the judgment, while synthetic judgments are expansive with respect to the given content
(see Prolegomena §2a IV:266, B 10‒11). Kant illustrates this distinction with affirmative categorical judgments: such a judgment is analytic if its predicate does not contain anything that is not contained in the subject of the judgment; otherwise, the judgment is synthetic: its predicate adds to the content of the subject what is not already contained in it. An example of an analytic judgment is “All bodies are extended” (“extended” is contained in the concept “body”); an example of a synthetic judgment is the empirical judgment “All bodies are heavy” (“heavy” is not contained in the concept “body”).
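Kant’s containment criterion for such judgments can be pictured as a check on the marks thought in the subject concept. The following toy sketch is only illustrative (the mark lists and helper names are invented, and concept containment is of course richer than set membership):

```python
# Toy sketch of the containment criterion for affirmative categorical
# judgments: "All S are P" is analytic iff P is among the marks thought
# in the concept S. The mark lists below are invented for illustration.
marks = {"body": {"extended", "substance", "figure"}}

def is_analytic(subject: str, predicate: str) -> bool:
    return predicate in marks.get(subject, set())

assert is_analytic("body", "extended")          # "All bodies are extended"
assert not is_analytic("body", "heavy")         # "All bodies are heavy" (synthetic)
```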
We note that Kant’s formal logic should contain only analytic judgments, although its laws and principles refer to and hold for all judgments (analytic and synthetic) in general (see Reich 1948, 14‒15, 17). Conversely, analytic knowledge is based on formal logic, affirming (negating) only what should be affirmed (negated) on pain of contradiction. Let us remark that for Frege, unlike for Kant, this notion of analytic knowledge holds also for arithmetic.
b. Categories and the Empirical Domain
The subject matter of Kant’s transcendental logic is the pure forms of thinking insofar as they refer a priori to objects (B 80‒82). That is, it should be shown in which necessary and strictly universal ways our understanding determines objects, independently of, and prior to, all experience. In Kant’s technical language, this means that transcendental logic should contain synthetic judgments a priori.
According to Kant’s restriction on transcendental logic, objects can be given to us only in a sensible intuition. These objects can be conceived as making up Kant’s only legitimate, empirical domain of theoretical knowledge. Hence, the task is to discover which pure forms of our thought (categories, “pure concepts of understanding”), and in which way, determine the empirically given objects. Kant obtains categories from his table of logical forms of judgment (“metaphysical deduction of categories,” B §10, see §§20, 26) because these forms, besides giving unity to a judgment, are also what unite a sensibly given manifold into a concept of an object. Technically expressed, a form of a judgment is a “function of unity” that can serve to synthesize a manifold of an intuition. The manifold is synthesized into a unity that is a concept of an object given in the intuition. To “deduce” categories, Kant introduces some small emendations into his table of the logical functions in judgments. These emendations are needed because what is focused on in transcendental logic is not merely the form of thought, but also the a priori content of thought. Thus, Kant extends the division of “moments” under the titles of quantity and quality of judgments by adding singular and infinite judgments, respectively (for instance, “Plato is a philosopher”; “The soul is non-mortal”). He also replaces the term “particular judgment” with “plurative,” since the intended content is not an exception from totality (which is the logical form of a particular judgment), but plurality independently of totality. With respect to the content, Kant reverses the order under the title of quantity (Prolegomena §20 footnote IV:302).
In correspondence with the 12 forms of judgments, Kant obtains 12 categories (Prolegomena §21 IV:303):
(1) of quantity: unity, plurality, totality;
(2) of quality: reality, negation, limitation;
(3) of relation: substance, cause, community;
(4) of modality: possibility, existence, necessity.
Sometimes, the order of the categories of quality is also changed: reality, limitation, full negation (Prolegomena §39 IV:325). In the Critique of Pure Reason, the table is more explicative. Under “Relation,” Kant lists:
(a) inherence and subsistence (substantia et accidens);
(b) causality and dependence (cause and effect);
(c) community (interaction between agent and patient).
Under “Modality,” he adds negative categories of impossibility, non-existence, and contingency (B 106). (For a possible reconstruction of a deduction of categories from the synthetic unity of self-consciousness as the first principle, see Schulting 2019.)
Kant further shows that all objects of a sensible intuition in general (be it in space and time or not) presuppose a synthetic unity (in self-consciousness) of a manifold according to categories. On the ground of this premise, he also shows that all objects of our experience stand under categories. Briefly, in the proof of this result, Kant shows, first, that each of our empirical intuitions presupposes a synthetic unity according to which space and time are determined in this intuition. We then abstract from the space-time form of our empirical intuition, isolate just the synthetic unity, and, by subsumption under the first premise (on intuitions in general), conclude that this synthetic unity is based on the categories, which are applicable to our space-time intuition (“transcendental deduction of categories,” B §§20, 21, 26, B 168‒169).
In addition, transcendental logic comprises a theory of judgments a priori and of their principles. These principles determine how categories, which are pure concepts, are applied to objects given in our intuition and make our knowledge of objects possible. For Kant, there is no way to come to a theoretical knowledge of objects other than by means of experience, which includes, as its formal side, categories as well as space and time. Accordingly, there are a priori judgments about how categories can have objective validity in application to what can be given in our space-time intuition. As Kant puts it: the conditions (including categories) of the possibility of experience are at the same time the conditions of the possibility of the objects of experience, and thus have objective validity (B 197).
Kant systematically elaborates the principles of the pure faculty of understanding in consonance with his table of judgments. According to these principles, different moments that constitute our experience (1. intuition; 2. sensation; 3. perception of permanence, change, and simultaneity; 4. formal and material conditions in general) are subsumed under corresponding categories (1. extensive magnitude, 2. intensive magnitude, 3. categories of relation, 4. modal categories).
Kant emphasizes that concepts themselves cannot be conceived as objects (noumena) in the same (empirical) domain of objects (appearances, phaenomena) to which they as concepts apply. That is, in modern terms, we can speak of noumena only within a second-order regimentation of domains, with the lower (empirical) domain as ontologically preferred.
c. Transcendental Ideas
There are further concepts to which we are led, not by our faculty of understanding and the forms of judgment, but by our faculty of reason and its forms of inference. In distinction to categories, which are applicable to the domain of our experience, the concepts of the faculty of reason do not have their corresponding objects given in our intuition; their correspondents can only be purported objects “in themselves” (Dinge an sich), which transcend all our experience. A concept of the “unconditioned” (“absolute,” referring to the totality of conditions) for a given “conditioned” thing or state is termed by Kant a transcendental idea. Transcendental ideas, although going beyond our experience, have a regulative role to direct and lead our empirical thought towards the paradigm of the unconditioned synthetic unity of knowledge. According to the three species of inference of reason (categorical, hypothetical, and disjunctive), there are three classes of transcendental ideas (B 391, 438‒443):
(1) the unconditioned unity of the subject (the idea of the “thinking self”) that is not a predicate of any further subject;
(2) the unconditioned unity of the series of conditions of appearance (the idea of “world”), which further divides into four ideas in correspondence with the four classes of categories:
(a) the unconditioned completeness of the composition of the whole of appearances,
(b) the unconditioned completeness of the division of a given whole in appearance,
(c) the unconditioned completeness of the causal series of an appearance,
(d) the unconditioned completeness of the dependence of appearances regarding their existence;
(3) the unconditioned unity of the ground of all objects of thinking, in accordance with the principle of complete determination of an object regarding each possible predicate (the idea of “being of all beings”).
These transcendental ideas are in a natural way connected with a dialectic of our faculty of reason, where reason aims towards the knowledge of empirically unverifiable objects (B 397‒398).
(1) Through transcendental paralogisms, we come to think of the formal subject of our thought as of a substance.
(2) Through the antinomies of pure reason, the following opposites (seeming contradictions) remain undecided:
(a) the world has a beginning – the world is infinite;
(b) each composed thing consists of simple parts – there is nothing simple in things (they are infinitely divisible);
(c) there is a causality of freedom – everything happens according to the laws of nature;
(d) there is an absolutely necessary being – everything is contingent.
(3) The ideal of pure reason leads us to found the principle of complete determination on the idea of the most perfect being. In addition, Kant assumes here that “existence” is not a real predicate—that is, it does not contribute to the determination of a thing.
Kant insists on separating and excluding (1) the formal logical subject (“I think”) of all our thought from the empirical objects (substances) about which the subject can think; (2) the domain of experience from the members of this domain; and (3) the totality of concepts applicable to the domain from these concepts themselves. Thus, Kant’s transcendental dialectic includes and deals with logical problems connected with the possible disregarding of what we could today call type-theoretical distinctions and the distinction between a theory and its metatheory.
Let us add a methodological remark about the relationship between mathematical and transcendental logical knowledge. The rigor of mathematical evidence (intuitive certainty, B 762) is based, according to Kant, on the possibility of constructing mathematical concepts in intuition. This construction can be ostensive (geometric) or “symbolic” (“characteristic,” B 745, 762, as in arithmetic and algebra). However, as Kant points out, this is not available for transcendental logic, where knowledge should also be apodictic and a priori but is confined to the abstract, conceptual “exposition” (without a construction in intuition, albeit with an application of concepts to intuition). For this reason, definitions and demonstrations in the strictest sense are possible in mathematics, but not in transcendental logic (B 758‒759, 762‒763).
9. Influences and Heritage
Although Kant’s logic, if taken literally, is in form and content largely traditional as well as significantly dependent on the science of his time, it offered new essential and foundational perspectives that are deeply (and often unknowingly) built into modern logic.
Kant required a formal, though not mathematical, rigor in logic, purifying it of psychological and anthropological admixtures. This rigor was required in two ways: (a) in the sense of functionally defined logical forms, and (b) in the sense of a systematic, scientific form of logic. Kant’s transcendental logic is characterized by the strict distinction of formal logical and metaphysical aspects of concepts, as well as by defined standards of the justification of concepts and of their application in an empirical model of knowledge. Nevertheless, Kant strictly separated mathematical and philosophical rigor. It is in the aspect of the possibilities of the “symbolic construction” of concepts that modern logic has made great advances in comparison to Kant’s logic.
Let us give some examples of Kant’s influence on the posterior development of logic and philosophy.
Kant’s table of judgments influenced a large part of traditional or reformed traditional logic deep into the 20th century. Moreover, although Frege criticized Kant’s table of judgments as contentual and grammatical, in Frege’s distinction between “judging” and the content of judgment, Kant’s distinction between modality and the logical content of the judgment can be traced. Kant’s restriction of the importance of categorical judgments, with an emphasis also on the logical relation between judgments, announced the future development of truth-functional propositional logic. Kant’s criterion of sensible intuition for the givenness of objects inspired Hilbert’s finitistic formalism, with “concrete signs” and their shapes as the immediately and intuitively given basis of his metamathematics. Kant’s foundational theory of the unity of apperception (in application to time) inspired the emergence of intuitionism (Brouwer). Kant’s undecidability of geometry by analytic means, properly corrected and reinterpreted, anticipates Gödel’s incompleteness results.
Kant’s distinctions of the analytic and the synthetic, and of the a priori and the a posteriori, had a deep impact on philosophical and mathematical logic, and have delineated an important part of philosophical discussions after Kant. Frege especially praised Kant’s analytic-synthetic distinction, despite departing from Kant, for whom arithmetic, like geometry, was synthetic. The analytic-synthetic distinction was a crucial subject of discussion and revision, for example, in Carnap’s, Gödel’s, Quine’s, and Kripke’s philosophies of logic, language, and knowledge.
Kant’s duality of the conceptual system and empirical model, with differentiated logical (and ontological) orders of concepts and their (intended) corresponding objects, already leads into the area of solving logical antinomies and of incompleteness (see Tiles 2004). With his conception of successively upgrading logical laws (from the law of contradiction, to the law of sufficient reason, to the law of excluded middle), Kant implicitly offered a general picture of possible logics that exceeds classical logic—as far as it was possible with the tools available to him. His logical foundations of philosophy can still inspire modern logical-philosophical investigations.
10. References and Further Reading
a. Primary Sources
Kant, Immanuel. 1910–. Kant’s gesammelte Schriften. Königlich Preussische Akademie der Wissenschaften (ed.). Berlin: Reimer, Berlin and Leipzig: de Gruyter. Also Kants Werke I–IX, Berlin: de Gruyter, 1968 (Anmerkungen, 2 vols., Berlin: de Gruyter, 1977).
Cited by volume number (I, II, etc.); Kritik der reinen Vernunft, 1st ed. = A, 2nd ed. = B.
Kant, Immanuel. 1998. Critique of Pure Reason. Cambridge, UK: Cambridge University Press. Transl. and ed. by Paul Guyer and Allen W. Wood.
Kant, Immanuel. 1992. Lectures on Logic. Cambridge, UK: Cambridge University Press. Transl. and ed. by J. Michael Young.
Kant, Immanuel. 1998. Logik-Vorlesungen: Unveröffentlichte Nachschriften I‒II. Hamburg: Meiner. Ed. by T. Pinder.
Cited as LV.
Kant, Immanuel. 2004. Prolegomena to Any Future Metaphysics. Cambridge, UK: Cambridge University Press. Transl. and ed. by Gary Hatfield.
b. Secondary Sources
Achourioti, Theodora and van Lambalgen, Michiel. 2011. “A Formalization of Kant’s Transcendental Logic.” The Review of Symbolic Logic. 4: 254–289.
Béziau, Jean-Yves. 2008. “What is ‘Formal Logic’?” in Proceedings of the XXII Congress of Philosophy, Myung-Hyung-Lee (ed.), Seoul: Korean Philosophical Association, 13: 9–22.
Brandt, Reinhard. 1991. Die Urteilstafel. Kritik der reinen Vernunft A 67‒76; B 92‒101. Hamburg: Meiner.
Capozzi, Mirella and Roncaglia, Gino. 2009. “Logic and Philosophy of Logic from Humanism to Kant” in Leila Haaparanta (ed.), The Development of Modern Logic. New York: Oxford University Press, pp. 78–158.
Conrad, Elfriede. 1994. Kants Vorlesungen als neuer Schlüssel zur Architektonik der Kritik der reinen Vernunft. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Friedman, Michael. 1992. Kant and the Exact Sciences. Cambridge (Ma), London: Harvard University Press.
Kneale, William and Kneale, Martha. 1991. The Development of Logic. Oxford: Oxford University Press. First published 1962.
Kovač, Srećko. 2008. “In What Sense is Kantian Principle of Contradiction Non-classical”. Logic and Logical Philosophy. 17: 251–274.
Kovač, Srećko. 2014. “Forms of Judgment as a Link between Mind and the Concepts of Substance and Cause” in Substantiality and Causality, Mirosław Szatkowski and Marek Rosiak (eds.), Boston, Berlin, Munich: de Gruyter, pp. 51–66.
Krüger, Lorenz. 1968. “Wollte Kant die Vollständigkeit seiner Urteilstafel beweisen?” Kant-Studien. 59: 333–356.
Lapointe, Sandra (ed.). 2019. Logic from Kant to Russell: Laying the Foundations for Analytic Philosophy. New York, London: Routledge.
Longuenesse, Béatrice. 1998. Kant and the Capacity to Judge: Sensibility and Discursivity in the Transcendental Analytic of the Critique of Pure Reason. Princeton: Princeton University Press. Transl. by Charles T. Wolfe.
Loparić, Željko. 1990. “The Logical Structure of the First Antinomy.” Kant-Studien. 81: 280–303.
Lu-Adler, Huaping. 2018. Kant and the Science of Logic: A Historical and Philosophical Reconstruction. New York: Oxford University Press.
MacFarlane, John. 2002. “Frege, Kant, and the Logic in Logicism.” The Philosophical Review. 111: 25–65.
Mosser, Kurt. 2008. Necessity and Possibility: The Logical Strategy of Kant’s Critique of Pure Reason. Washington, DC: Catholic University of America Press.
Newton, Alexandra. 2019. “Kant’s Logic of Judgment” in The Act and Object of Judgment, Brian Ball and Christoph Schuringa (eds.), New York, London: Routledge, pp. 66–90.
Reich, Klaus. 1948. Die Vollständigkeit der kantischen Urteilstafel. 2nd ed. Berlin: Schoetz. (1st ed. 1932).
English: The Completeness of Kant’s Table of Judgments, transl. by Jane Kneller and Michael Losonsky, Stanford University Press, 1992.
Scholz, Heinrich. 1959. Abriß der Geschichte der Logik. Freiburg, München: Alber. (1st ed. 1931).
Schulthess, Peter. 1981. Relation und Funktion: Eine systematische und entwicklungsgeschichtliche Untersuchung zur theoretischen Philosophie Kants. Berlin, New York: de Gruyter.
Schulting, Dennis. 2019. Kant’s Deduction from Apperception: An Essay on the Transcendental Deduction of Categories. 2nd revised ed. Berlin, Boston: de Gruyter.
Stuhlmann-Laeisz, Rainer. 1976. Kants Logik: Eine Interpretation auf der Grundlage von Vorlesungen, veröffentlichten Werken und Nachlaß. Berlin, New York: de Gruyter.
Tiles, Mary. 2004. “Kant: From General to Transcendental Logic” in Handbook of the History of Logic, vol. 3, Dov M. Gabbay and John Woods (eds.). Amsterdam etc: Elsevier, pp. 85–130.
Tolley, Christian. 2012. “The Generality of Kant’s Transcendental Logic.” Journal of the History of Philosophy. 50: 417‒446.
Tonelli, Giorgio. 1966. “Die Voraussetzungen zur Kantischen Urteilstafel in der Logik des 18. Jahrhunderts” in Kritik und Metaphysik, Friedrich Kaulbach and Joachim Ritter (eds). Berlin: de Gruyter, pp. 134–158.
Tonelli, Giorgio. 1994. Kant’s Critique of Pure Reason within the Tradition of Modern Logic: A Commentary on its History. Hildesheim, Zürich, New York: Olms.
Wolff, Michael. 1995. Die Vollständigkeit der kantischen Urteilstafel: Mit einem Essay über Frege’s Begriffsschrift. Frankfurt a. M.: Klostermann.
Wuchterl, Kurt. 1958. Die Theorie der formalen Logik bei Kant und in der Logistik. Inaugural-Dissertation, Ruprecht-Karl-Universität zu Heidelberg.
Author Information
Srećko Kovač
Email: skovac@ifzg.hr
Institute of Philosophy, Zagreb
Croatia
Bernardino Telesio (1509—1588)
Dubbed “the first of the new philosophers” by Francis Bacon in 1613, Bernardino Telesio was one of the most eminent thinkers of Renaissance Italy, along with figures such as Pico, Pomponazzi, Cardano, Patrizi, Bruno, Doni, and Campanella.
The young Telesio spent the early decades of his life under the guidance of his uncle Antonio (1482-1534), a fine humanist who was determined to go beyond the strict disciplinary division between literary and philosophical texts. Before the printing of the first edition of his principal work, De natura iuxta propria principia (On the Nature of Things According to their Own Principles) (Rome, 1565), Telesio assimilated the basics of ancient scientific thought (both Greek and Latin), as well as those of Plato’s and Aristotle’s Scholastic commentators. In the second half of the 16th century, he began to be recognized as an adversary of Aristotle’s thought, insofar as he upheld a conception of man and nature that attempted to replace the principles of Aristotle’s natural philosophy. His starting point was the definition of a new role for the notion of sense perception in animal cognition. Using the Stoic notion of spiritus (translating the Greek word pneuma), he criticized Aristotle’s hylomorphism. As a fiery substance and an internal principle of motion, spiritus is the principle of sensitivity: by way of heat, it pervades the entire cosmos, so that all beings are capable of sensation. In addition to grounding Telesio’s epistemology, then, the notion of spiritus lies at the core of his natural philosophy. Between 1565 and 1588, he overturned the traditional conception of the relationship between sensus and intellectus, as championed by the Scholastic followers of Aristotle. Telesio denied that the human brain possesses a faculty able to grasp the forms or essentiae of natural beings from the simple passive sensible data of experience. On the contrary, sense perception has an active role: it is the first form of understanding the natural world. It is by the “way of senses” that mental representations of natural things are selected and shaped. This process happens in strict cooperation with the corporeal principle of self-organization of the material soul. In human beings as well as in animals, the brain is the main source of this principle, which governs the cognitive process without the support of a superior immaterial agent. This active form of “sentience” constitutes the primary causal connection between the brain and the external world. Founded on a reassessment of the categories of sense perception, Telesio’s philosophy of mind led to an empiricist approach to the study of natural phenomena.
Bernardino Telesio was born in Cosenza (Northern Calabria) to Giovanni Battista, a noble man of letters, and Vincenza Garofalo, the daughter of a lawyer. Bernardino was the first-born of eight sons, and as a child was sent to his uncle Antonio (1482-1534) to be educated. In 1517, they went to the Duchy of Milan, where the young Telesio became acquainted with the most illustrious pupils of his uncle. He also met some eminent men of letters, like Matteo Bandello (1485-1561), who in his Novelle (1554) would recall Antonio’s knack for entertaining the members of the intellectual circles led by such gentlewomen as Camilla Scarampa Guidoboni (ca.1454-ca.1518) and Ippolita Sforza Bentivoglio (1481-ca.1520).
In 1523 Bernardino and Antonio moved to Rome, entering the intellectual milieu of the papal court and of the Vatican library, which was animated by philosophers and humanists such as Paolo Giovio (1483-1552), Marco Girolamo Vida (1485-1566), Marcello Cervini (1501-1555), Coriolano Martirano (1503-1558), and Giovanni Antonio Pantusa (1501-1562). Bernardino left the Studium Urbis in 1527, soon after the “sack of Rome”. Then he moved to Padua, where his uncle had been appointed professor of Latin by the municipality of Venice (October 17th, 1527).
During his early education, Bernardino was deeply influenced by his uncle. Antonio was a fine humanist, whose works circulated widely across Europe. For example, Antonio’s De coloribus libellus (Venice, 1528) rose to great fame. Following the first Venetian edition, at least ten editions of the work were released in Paris by scholar-printers such as Chrestien Wechel, Jacob Gazel and Robert Estienne (Stephanus); and a further five appeared in Basel. In particular, the Basel reprints were released by such renowned humanists as Hieronymus Froben and Johannes Herbst Oporinus. Thus, the young Bernardino could benefit from the mastery of some of the finest Italian connoisseurs of ancient Greek and Latin literature, soon becoming himself an expert reader of classical authors such as Virgil, Cicero, Seneca, Pliny, and Lucretius.
It is important to note that the materialist and empiricist approach Telesio displayed in his early works did not emerge from nowhere; its main source was an open-minded reading of the texts written by the early commentators of Aristotle’s works, such as Alexander of Aphrodisias, recently revisited by a new generation of scholars such as Pietro Pomponazzi. At the University of Padua, the young Telesio could learn the new critical approach to Aristotle’s works. During the time spent in Padua and Venice, he did not gain the title of magister medicinae et artium, yet he started to develop a serious interest in mathematics, medicine, and natural philosophy.
At the end of the Venetian period (1527-1529), Telesio came back to Calabria. After some time spent in Naples (probably from 1532 or 1533 up to the spring of 1534, when his uncle passed away), Bernardino moved to Rome (1534-1535), living in the papal environment of Paolo III Farnese. Then, between 1536 and 1538 he spent a fruitful period of study at the Benedictine monastery of Seminara in the South of Calabria. There he began to develop his arsenal of anti-Aristotelian arguments, partly taken from Presocratic, Hippocratic, Epicurean and Stoic ideas. From there he went back to Rome, meeting some illustrious members of the papal court. Benedetto Varchi, Annibal Caro, Niccolò Ardinghelli, Ippolito Capilupi, Alessandro Farnese, Gasparo Contarini, Niccolò Gaddi, Giovanni Della Casa, and the Orsini brothers soon became acquainted with the philosopher of Cosenza. The significant number of letters written by these figures in the 1540s allows us to follow Bernardino’s movements between Rome, Naples, and Padua (Sergio 2014; Simonetta 2015). By the early 1540s, Telesio was already renowned as an anti-Aristotelian philosopher.
It was during that time that Telesio started to study Vesalius’s program of reform of the ancient ars medendi, including both Galen’s legacy and the Corpus Hippocraticum. Between 1541 and 1542, he spent some time in Padua, during which he met the anatomist and physician Matteo Realdo Colombo (1516-1559). Telesio’s interest in the nova ars medendi, and, more specifically, in the physiology of sense perception, is attested in a work “Contra Galenum” entitled Quod animal universum ab unica Animae substantia gubernatur, written in the 1560s, and posthumously edited and published by Antonio Persio (1542-1612) in Varii de naturalibus rebus libelli (Telesio 1590, [139-227]).
In the late 1540s he probably lived in the Neapolitan household of Alfonso Carafa (d. 1581), III Duke of Nocera, and in the early 1550s he came back to Cosenza. There, in 1553, he married Diana Sersale (d. 1561), a noblewoman belonging to the municipality of Cosenza. He soon became a leading figure in the city, laying the foundations for the future creation of the “Accademia Cosentina” (Lupi 2011). In Cosenza, Telesio had such distinguished pupils as Sertorio Quattromani (1541-1603) and Iacopo di Gaeta (fl. 1550-1600); the philosopher and physician Agostino Doni (fl. 1545-1583); the orientalist Giovanni Battista Vecchietti (1552-1619); the future mayor of Cosenza, Giulio Cavalcanti (1591-1600); and Telesio’s first biographer, Giovan Paolo d’Aquino (d. 1612). In 1554 Telesio was elected mayor of Cosenza. Throughout the 1550s, he worked to improve the initial versions of his works, and, soon after the death of his wife (1561), he probably spent a second period of study at the Benedictine abbey of Santa Maria del Corazzo.
In the early 1560s Telesio became more familiar with the academic environment of Naples, where the works of Vesalius, Colombo, Cardano, Eustachius, Cesalpino, Fracastoro, and Jean Fernel featured prominently in the study of natural philosophy and medicine. There Telesio probably read the works of Giovanni Argenterio (1513-1572), professor of medicine in Naples from 1555 to 1560, one of the major contributors to the diffusion of new medical ideas in Southern Italy. Like Girolamo Fracastoro, Argenterio criticized the Galenic theories of contagion and diseases, contributing to the slow downfall of Galen’s authority. He also probably read the work of Giovanni Filippo Ingrassia (1510-1580), a Sicilian physician who received his scientific education at Padua—studying with Vesalius, Colombo, Eustachius, and Fracastoro—and who was also critical of Galen.
In 1563 Telesio went to Brescia, paying a visit to the Aristotelian Vincenzo Maggi (1498-1564). On that occasion, he submitted to Maggi the manuscript of the first edition of De natura iuxta propria principia. In 1565, Telesio’s masterpiece was published in Rome by the papal printer Antonio Blado. In the same period, he completed the draft of the most important of his medical writings—the aforementioned Quod animal universum ab unica animae substantia gubernatur. In the next year, the Neapolitan printer Matteo Cancer released a short treatise, Ad Felicem Moimonam iris, about the phenomenon of the rainbow (Telesio 1566 and 2011). These latter two works were an early testimony to the wide range of Telesio’s philosophical interests, as much as to the originality of his method in the quest for the causes of natural phenomena. In the same year as the publication of De natura, one of Telesio’s brothers, Tommaso (1523-1569), accepted the position of Archbishop of Cosenza, a title initially offered to Bernardino by Pius IV.
Toward the end of the 1560s and the beginning of the 1570s, Telesio’s philosophical reputation was becoming more and more widespread. In 1567 the humanist Giano Pelusio wrote a short poem, Ad Bernardinum Thylesium Philosophum, in which the philosopher of Cosenza was compared to Pythagoras; during the next few years, Telesio was assisted by his disciple Antonio Persio in the publication of the second edition of De rerum natura (1570). In the same year three pamphlets were printed: De colorum generatione, De his quae in aere fiunt et de terraemotibus, and De mari liber unicus (Telesio 1981 and 2013). They were printed in Naples, where Telesio lived under the patronage of Alfonso and Ferrante Carafa (d. 1593).
Telesio’s fame grew in the 1570s: in 1572, Francesco Patrizi wrote an insightful review of Telesio’s De rerum natura (Objectiones), to which Telesio replied in a letter, Solutiones Tylesij; meanwhile, Antonio Persio wrote a reply entitled Apologia pro Telesio adversus Franciscum Patritium. Patrizi’s letter offered Telesio an occasion to point out some arguments of his cosmology and psychology: a) the universality of sensus rather than of the soul (anima, spiritus); b) the fiery and physical nature of the heavens, wherein the Sun is considered the source of motion as well as of the life of celestial bodies; c) the eternity of “celestial spheres” replacing the Platonic idea of creatio ex nihilo; d) the primacy of sense perception over the intellect in the cognitive process of animal understanding. Telesio’s understanding of nature championed the notion of universal sensibility (pansensism) over that of universal animation of things (panpsychism). What governs nature itself are just internal natural principles: there is no need for a divine intelligence in order to explain its inner processes and the variety of natural phenomena.
In the same years in which the correspondence between Telesio and Patrizi was published, the Florentine Francesco Martelli translated Telesio’s De rerum natura (Delle cose naturali libri due) and the treatises De mari and De his, quae in aere fiunt. Around the same period, the orientalist Giovan Battista Vecchietti made a brief journey to Pisa, defending Telesio’s doctrines against the Aristotelians of the Studium. The temper of that young Telesian caught the attention of the Duke of Tuscany, Cosimo de’ Medici. Between 1575 and 1576 Antonio Persio published three works: Liber Novarum positionum (1575), Disputationes Libri novarum positionum, and Trattato dell’ingegno dell’huomo (1576). By 1577, Patrizi completed L’amorosa filosofia, a dialogue wherein the philosopher of Cherso mentions the acquaintance he had with Bernardino Telesio. In the late 1570s Telesio came back to Naples, and, at that time, the humanist Bonifacio Vannozzi, the rector of Pisa university, wrote a letter to Telesio, calling him “our Socrates” (Artese 1998, 191).
Living in Naples in the first half of the 1580s, Telesio immersed himself in the production of the third edition of De rerum natura, printed in 1586. He dedicated the work to Ferrante Carafa, IV Duke of Nocera. In that edition, Telesio unpacked in nine books the earlier and later topics of his thought, from cosmology to psychology and moral philosophy. Meanwhile, his thought came to be renowned in England. During his Grand Tour of Italy (1581-1582), the mathematician Henry Savile bought a copy of the second edition of De rerum natura. Just a few decades later, Telesio’s works would spread through the cultural circles of early Jacobean England. James I Stuart and Francis Bacon owned copies of Telesio’s works, as did churchmen and royal physicians like John Rainolds ([Reynolds], 1549-1607; a translator of King James’s Bible) and Sir William Paddy (1554-1634, a fellow of the Royal College of Physicians of London), and aristocrats like Sir Henry Percy (1564-1632), IX Earl of Northumberland. Though with different views and motivations, they all read Telesio’s writings. Moreover, his thought attracted the attention of the most eminent men of the “Northumberland circle”: Sir Walter Raleigh (1552-1618), Walter Warner (ca.1557-1643), Thomas Harriot (1560-1621), and Nicholas Hill.
In 1587, a Neapolitan lawyer, Giacomo Antonio Marta (1559-1628), wrote a pamphlet against Telesio, titled Pugnaculum Aristotelis contra principia Bernardini Telesii (1587). A few years later, the young Campanella, in his Philosophia sensibus demonstrata (Naples, 1591)—the most remarkable manifesto of Telesian philosophy—launched a fierce attack against Marta’s book. In the first pages of his work, Campanella made a summary of Telesio’s epistemology, pointing out the need to clarify the first grounds of the new method before commencing the inquiry into the main issues of natural philosophy. By means of Telesio, Campanella contributed—in his own way—to the development of the early modern debate about the scientific method (Firpo 1949, 182-183). The empiricist approach adopted by Telesio and Campanella did not yet have the complexity and articulation of Galileo’s method, composed of “manifest experiences” (sensate esperienze) and “necessary demonstrations” (certe dimostrazioni); nonetheless, a number of early 17th century Italian writers did not hesitate to label the Calabrian thinkers as just as dangerous as the novatores of the Galenic circle.
On July 23rd, 1587, Telesio came back to Cosenza and wrote his will, most likely because of ill health. He died a year later, in October 1588. Among the participants in the burial ceremony were Sertorio Quattromani, Giovan Paolo d’Aquino, the members of the “Accademia Cosentina” (Iacopo di Gaeta, Vincenzo Bombini, Giulio Cavalcanti and others), and the young Tommaso Campanella, at that time a friar of the Ordo Praedicatorum, hosted in the convent of San Domenico in Cosenza. For the occasion, Campanella composed some verses dedicated to Bernardino (Al Telesio Cosentino, in Campanella 1622, n° 68).
2. Psychology and Theory of Knowledge
Telesio’s natural philosophy is based on a new methodological approach to the study of nature. This is exactly what he points out in the first pages of his De natura (1565)—a work rightly characterized by some scholars as “Telesio’s masterpiece”. Such an approach does not depend solely on his alleged modern “empiricism”. The main elements of Telesio’s “modernity” lie in his novel approach to psychology, animal physiology, and theory of science. On the one hand, Telesio offers a number of arguments for the similarity of animals and humans: for example, both animals and humans are able to perceive their own passions through the way of senses. On the other hand, the spiritus of humans is “purer” and “more abundant” than that of animals (Telesio 1586, VIII.14-15). Therefore, humans are better equipped than animals for the art of reasoning.
Telesio’s principal aim was to inquire into the causes of natural phenomena without viewing them through the lenses of Platonic and Aristotelian metaphysics. As he states in the incipit of book I of De rerum natura (1570), “the construction of the world and the nature of the bodies contained in it should not be inspected by reason, as the Ancients did, but must be perceived by sense, and grasped from things themselves”. He neither belittled nor underestimated the role of reason; nonetheless, he prioritized the direct evidence that comes from the senses. The beginning of his natural philosophy lies in the experience deriving from sense perception, sensus being a cognitive power closer to natural things than reason itself. As the Peripatetic axiom traditionally associated with Aristotle has it, “there is nothing in the intellect that is not first in sense perception”. Telesio’s first move was to develop this principle of classical empiricism in a new and more coherent way.
The perception of a physical object establishes a causal relation to the external world, and the first task of a scientist is to investigate the nature of that relation. In opposition to Aristotle, Telesio affirms that the ability of a sentient creature to reach knowledge of natural things is not “actualized” by the “form” of the perceived thing. He does not believe that all acts of sense perception simply mirror the natural beings of the external world. Rather, he thinks that knowledge of the world depends on the sensible data perceived by the sentient creature. That kind of affection (perceptio passionis) is the very starting point for reaching knowledge of the world, since sense perception is the basic and most important property of all animals, while the act of understanding is nothing more than reckoning and recalling similarities and differences among previous sensations. From that perspective, Telesio abandoned the traditional doctrine of species, denying that natural things are the result of the combination of matter and form. According to him, the Peripatetic answer to the problem of human knowledge left unsolved the relation between causes and effects in the cognitive process. Once more, it is the concept of spiritus that lies at the core of Telesio’s psychology. An imperceptible, thin, fiery body, it constitutes our sensible soul (Telesio 1586, VII.4, and V.3); as the anima sentiens of human bodies, it is present mainly in the nervous system, guaranteeing the unity of perception; consequently, it is the bearer of sensibility and movement (Telesio 1586, V.5, V.10, and V.12). In other words, Telesio provides a theory of mind in which the spirit produces actual internal representations in response both to external stimuli, which are considered passions, and to internal stimuli, which are the affections or motions of the spirit itself. Mental representations are thus simple reconstructions of the world. Telesio held that the material soul grasps natural beings by means of a corporeal, physical interaction with them. Consequently, scientific knowledge is not the result of a hierarchical process, nor does it consist in the gradual abstraction of similitudes or species (Telesio 1586, VIII.15-28). In some ways, Telesio’s psychology anticipated the empiricist approach of the 17th-century critics of Descartes’s doctrine of the cogito: in order to know natural things, humans do not need an intellectual self-consciousness over and above the sensible data coming from sense perception. Further, the data of sense perception can be corrected and refined only by further sense perception, supported—when necessary—by the corporeal principle of spiritus.
Reasoning is nothing but the outcome of the self-organization of the “material soul” (spiritus), in cooperation with the “ways” or “means” of sense perception and the principal functions of the brain (memory, imagination) activated by that same material soul. In order to pursue their natural aim, that is, self-preservation (conservatio sui), both humans and animals are ruled by the opposed sensations of pleasure and pain, with the spirit performing the key function at the core of bodily processes (Telesio 1586, VII.3; IX.2). In his early writings, Telesio did not directly challenge the theory of intelligible species of the Scholastic tradition; however, his opposition to that theory is evident in the basic principles of his psychology. These may be unfolded in five points:
(a) intellectual cognition is based on a perceived similitude, which does not consist in a mental representation of an external object resulting from the encounter between the active intellect and passive sensation; rather, sense perception is an active operation of the spiritus, the material soul (Telesio 1586, V.34-47);
(b) the sensible data resulting from a perceptive experience have a cognitive role (as Campanella and Hobbes later explained, sensus is already a kind of iudicium, whereas understanding and imagination are nothing but “decaying sense”);
(c) the material agents involved in the cognitive process, from the “ways of sense” to the spiritus (“anima sentiens”), are merely corporeal (Telesio 1586, V.3-5, 10-12);
(d) the spirit is able to perceive because it can be subject to sensible, bodily alterations;
(e) since spiritus is the bearer of motion, a human soul moves in virtue of its own nature; what is at stake is indeed the concept of motion, in some ways close to Lucretius’s atomism, even though Telesio himself shows no intention of claiming such a linkage. At the same time, Telesio’s theory excludes any mechanical approach to the physiology of sense perception: while the motion of bodies has to be explained through the physics of contact, his theory of motion remains far from the kind of explanation that such modern authors as Gassendi, Descartes, and Hobbes later tried to provide.
Telesio’s naturalistic program, then, took sensation to be a material process involving only material agents: (corporeal and) sensible objects, the “ways of sense”, and the spiritus. In stating that an animal is ruled by one substance residing in its brain, Telesio abandoned Aristotle’s psychology and its threefold partition of the soul (intellective, sensitive, vegetative), as well as Galen’s partition of “pneumata” (animal, vital, natural) and his theory of “temperaments” (Telesio 1570, II.15). According to the Galenic system, the pneuma as a “transmitter substance” had a tripartite structure: a) the spiritus naturalis (pneuma physikon) or vegetative spirit, having its seat in the liver and responsible for digestion, metabolism, and the production of blood and semen; b) the spiritus vitalis, localized in the heart and active in all kinds of affections and motions; c) the spiritus animalis (pneuma psychikon), situated in the brain and responsible for the control and organization of the activity of the soul and of the intellect. In Telesio’s new system, by contrast, psychology and physiology, psyche and physis, were unified in one organic theory. Furthermore, the conception of the spiritus as a principle generated from the semen and diffused through the entire nervous system echoed some lines of Lucretius’s On the Nature of Things. Finally, by locating the seat of the spirit in the brain, Telesio rejected Aristotle’s biological cardiocentrism (Telesio 1586, V.27).
In the 1586 edition of De rerum natura, Telesio introduced the notion of a divine soul (a deo immissa) to go along with the “material soul” (e semine educta) of his earlier thought (Telesio 1586, V.2-3). The idea of a divine soul capable of surviving the natural dissolution of the body is a conceptual device Telesio used for a twofold purpose. On the one hand, Telesio could not deny that the concept of the soul was a theological matter: Sacred Scripture teaches that humans have a divine origin, infused by the Creator himself. It would therefore be unjust if God did not give humans the prospect of an afterlife, as a recompense for the virtue and vice experienced during the “mundane” lifetime (in that passage, it is evident that the source of Telesio’s argument is Book XIV of Marsilio Ficino’s Theologia Platonica de immortalitate animarum). On the other hand, Telesio remained faithful to the methodology of his early works: in 1586 he simply pointed out the existence of a strict separation between the specific subjects of the philosopher’s and the theologian’s work. As a forerunner of the modern scientist, Telesio thought that the role of the philosopher was solely to inquire into the secrets of nature “according to its own principles”.
Telesio goes on to reject Aristotle’s definition of the soul as forma corporis, that is, as the form and entelechy of an organic body (Aristotle, De anima II.1). According to Leen Spruit (1995), what matters here is the topic of the formal mediation of sensible reality in intellectual knowledge. As is well known, Aristotle regarded the mind as capable of grasping forms detached from matter (materia sensibilis). Aristotle’s medieval commentators grounded that theory in the mediating role of representational forms called “intelligible species”.
According to Telesio, on the other hand, scientific knowledge of the world must necessarily be mediated through sensible knowledge, which has an active role, whereas according to Aristotle the “materials” of sense perception play a passive role, from which the intellect grasps the form of each substance or natural being. Here lies another echo of Lucretius’s On the Nature of Things, where (in book III, ll. 359-369) he vigorously criticizes those philosophers who consider the senses as passive “gates” used by the soul.
As stated in chapter V.2 of De rerum natura (1586), the spirit is what allows animals to perceive the external world, so it moves sometimes with the whole body, sometimes with single parts thereof. Probably inspired by the Aristotelian tradition of such authors as Alexander of Aphrodisias (on Aristotle’s Meteor. IV.8, for example), Telesio claimed that the “homeomerous” parts of the body (skin, flesh, tissues, blood, bones, and so forth) are the same for animals and humans: they differ in their appetites and needs, not in their calculations (logoi), and, importantly, they all have the same kind of sense perceptions. Thus animal and human souls differ in degree, not in kind or quality.
Analogously, whereas Aristotle (in Meteor. book IV) asserted that all sensitive parts of the body must be homogeneous and a direct composition of the four elements, in Telesio’s view the variety of dispositions and functions of the different parts of the body had to be explained in the same way as that of most other natural bodies. In other words, the “homeomerous” mixtures cannot be considered the “ultimate” parts of the “anhomeomerous” bodies (organs such as the eye, the heart, the liver, the lungs, and so forth). Even though the spiritus is mostly present in the brain and in the nervous system, it is also spread throughout the entire body and, just like the brain, it is responsible for the motions, changes, and combination of the different parts of the body. By way of the sensus, the dynamics of attraction and repulsion provide for the constant balance of the living body.
3. Cosmology
Telesio eschewed metaphysical speculation; in his view, the most important task of the natural philosopher is to attend to the observable phenomena of the natural world, looking for the causes of “sensible beings” (Telesio 1586, II.3). It was thus in the spirit of the natural philosopher that he theorized that all natural things are the result of two active and mutually antagonistic forces, “heat” and “cold”, acting upon matter and thereby making possible the creation of inanimate and animate beings. Heat is responsible for the phenomena of elasticity, warmth, dryness, combustion, and lightness, as well as for the rarefaction of matter and the motion and velocity of bodies; cold is responsible for the slowness of bodies in motion, and for their condensation, freezing, and hardness. All other natural phenomena (such as humidity or fermentation) are the results of combinations of different degrees of heat and cold. The interaction of heat and cold affects the nature of matter itself, a notion that Telesio intentionally left vague: taken per se, matter cannot be directly sensed, and its existence can only be postulated, just like the notion of spiritus.
In this way Telesio rejected Aristotle’s view, according to which the two pairs of opposed qualities (cold/hot and dry/humid), acting on matter, each gave rise to one of the four primary elements of natural beings (earth, air, fire, and water). According to Telesio, matter, as a physical, corporeal subjectum, has a merely passive role; what matters are the modifications of the subjectum, that is, the results of the interactions between heat and cold (Telesio 1586, II.2).
On Telesio’s view, all things act according to their own nature, starting from the primary forces of cold and heat, by means of the ability to perceive one another. In order to sustain themselves, these primary forces, and all beings which arise through their antagonistic interaction, must be able to perceive one another as opposing forces; in other words, they have to sense what is convenient and what is inconvenient or damaging for their own survival. Living bodies do not constitute a specific realm, separated from inanimate beings: they are all determined by solar heat and terrestrial matter. Again, it is important to note that sensation is not only a property of animate beings. Telesio’s philosophy can thus be described as a kind of pansensism: all beings, both animate and inanimate, are said to have the power of sensation. Indeed, in the third edition (1586) of De rerum natura, the motion of celestial bodies is explained by means of the principle of “self-preservation” (conservatio sui), in other words, the need of those bodies to sustain their very life.
At the heart of Telesio’s cosmology, then, is the idea that nature is ruled by its own—internal, not external—principles. Consequently, the natural world does not need to be taken care of by any kind of divine intelligence. Heat and cold share the same “desire” to preserve themselves. The celestial spheres are made of matter, heat, and cold (ibid., I.11-12, 8-9). Telesio rejected the Ptolemaic system as unnatural, probably because of the growing suspicion—in 16th-century cosmology—that it provided a mere mathematical device to “save the appearances,” leaving unexplained the question of the actual natural causes of the planetary motions, as well as of other celestial phenomena. Beginning with the first edition of De rerum natura, Telesio’s objective was to replace Aristotle’s geocentrism with one of his own. At the cosmological level, the interplay between heat and cold involves the position of the Sun and of the Earth, these being the seats and sources of heat and cold, respectively. Because of its heat, the Sun is propelled into perpetual motion, whereas the Earth is immobile because of its coldness and its great weight. Consequently, the cosmic balance and harmony of the heavens depend on the struggle and equilibrium between the Sun and the Earth. Unlike Aristotle, Telesio upheld the fiery nature of the heavens. That moved the philosopher of Cosenza to deny the Aristotelian principle of a first mover of the universe. Planetary motions are not the outcome of the patterns of motion between the several regions of the celestial spheres; rather, they are the consequence of a geocentric system ruled by thermal forces, wherein the ancient notions of densum and rarum still hold.
Thus, Telesio chose heat and cold as the principal agents for knowledge of the world because, together with prime matter (moles), they immediately affect bodies and their functions. As said before, the two primary bodies, the Sun and the Earth, are the subjects of Telesio’s argument: the former is the seat of heat, the latter that of cold (Telesio 1565, I.1-4). That statement effectively ruled out the idea of a creatio ex nihilo. By electing the Sun and the Earth as the celestial seats of heat and cold, Telesio defines the boundaries of the universe as the edges of the corporeal world (extrema corpora universi). Life itself depends on the right balance of heat and cold, which are ultimately called “forces of acting natures”, agentium naturarum vires (Telesio 1586, VII.9). The later Telesio, in fact, was firmly convinced that the world depended on the inner uniformity of nature and on its intrinsic virtue or “wisdom”.
Furthermore, against Aristotle, Telesio denied the theory of the locus as the limit of a body, taking up the atomistic theory of space as an empty place filled by bodies. By means of the two forces of heat and cold, and by affirming the idea of a space filled by matter, he abolished the Aristotelian theory of a cosmos divided into a sublunary world, in which generation and corruption take place, and a superlunary world of timeless, regular movements. Moreover, he developed a critique of the Peripatetic theory of natural locus, pointing out that the Aristotelians did not adequately explain why the motion of heavy bodies becomes uniformly accelerated.
4. Influence and Legacy
With the publication of his early works (1565, 1566, 1570), Telesio established himself as a key figure in the intellectual milieu of late 16th- and early 17th-century Italy. His theses were read, commented on, and debated by a number of Italian philosophers, physicians, and amateurs of science, such as Francesco Patrizi, Antonio Persio, Agostino Doni, Giordano Bruno, Giambattista Vecchietti, Latino Tancredi, Tommaso Campanella, Andrea Chiocco, Giulio Cortese, Francesco de’ Vieri, Alessandro Tassoni, and Marco Aurelio Severino. In the early 17th century his writings circulated throughout Europe, where they were read by Francis Bacon, Marin Mersenne, René Descartes, Pierre Gassendi, Jean-Cécile Frey, Charles Sorel, Walter Warner, Thomas Hobbes, and others.
One of the first authors to openly criticize the philosophy of the “Telesians” was Francesco de’ Vieri (1524-1591), lecturer in Aristotelian philosophy at the University of Pisa. In 1573 he published in Florence a work in the vernacular, the Trattato delle metheore, in three books. The same work, augmented with a huge fourth book of some 200 pages, was reissued in 1582; it contains a rehearsal of the principal topics of the fourth book of Aristotle’s Meteorologica, and, with the purpose of showing his Platonic reading of Aristotle’s philosophy, de’ Vieri took the occasion to attack the “Telesians”, with the aim of persuading them with “their own arguments” (p. 227). His critique of Telesio and of the Telesians is particularly significant because he offers a reassessment of the Aristotelian notion of sensus through the lens of the Platonic concept of pneuma (a word belonging to the Stoic and pre-Aristotelian lexicon). As said above, Telesio translates the Greek word pneuma into the Latin spiritus or anima sentiens. Some pages after the aforementioned quotation, Verino states that God created souls as eternal beings (ab aeterno), because a soul is not grasped from the alteration of matter (p. 247). This is a clear reference to the Telesian idea of a spiritus grasped from a material seed (spiritus e semine educta). In a manuscript kept at the National Library of Florence (Magl. XII.11, f. 23), the same author attacked Telesio and his followers, who erroneously attribute to the sensus “all judgments about the natural things”. It is important to recall that when Francesco de’ Vieri published the 1582 edition of his book, Telesio’s philosophy had already reached, in Tuscany and across Italy, the apex of its fame.
In 1587, a year before Telesio’s death, the Spanish philosopher Oliva Sabuco de Nantes y Barrera published a book, Nueva filosofía de la naturaleza del hombre, in which she elaborated a psychophysiology of the human body deeply influenced by Telesio’s doctrines (Bidwell-Steiner 2012). Then, in 1588, Francesco Muti published a work entitled Disceptationum libri V contra calumnias Theodori Angelutii in maximum philosophum Franciscum Patritium, in quibus pene universa Aristotelis philosophia in examen adducitur, in which he defended Telesio, taking into account the quarrel that took place at Ferrara during 1584 and 1585 between Patrizi and Angelucci (Sergio 2013, 71-72, 74). In 1589, Sertorio Quattromani, the re-founder of the “Accademia Cosentina,” composed a summary of Telesio’s thought, La filosofia di Bernardino Telesio ristretta in brevità e scritta in lingua Toscana (Naples, 1591).
In the last decade of the 16th century, an important role was played by Antonio Persio (1542-1612). Among Telesio’s disciples, Persio was the one who prepared the Venetian edition of the Varii de naturalibus rebus libelli (Apud Felicem Valgrisium, 1590). That edition included both the booklets already published in 1570 (De his, quae in aere fiunt et de terrae motibus; De colorum generatione; De Mari) and a number of writings Telesio had left unpublished (De cometis et lacteo circulo; De iride; Quod animal universum ab unica animae substantia gubernatur; De usu respirationis; De coloribus; De saporibus; De somno). Some years later, another of Telesio’s former disciples, Giovan Paolo d’Aquino, published the Oratione in morte di Berardino [sic] Telesio Philosopho Eccellentissimo agli Academici Cosentini (Cosenza, per Leonardo Angrisano, 1596), the first biography of the philosopher of Cosenza.
As noted above, in 1591 Campanella wrote a vigorous defense of Telesian philosophy against Giacomo Antonio Marta’s Pugnaculum Aristotelis. In that work, Campanella took the occasion to unfold and reassess the principles of Telesio’s naturalism, in some ways anticipating (in his Praefatio) the basic essentials of Galileo’s methodology (above all, the alliance between the “sensate esperienze” and the “certe dimostrazioni”). Another early modern thinker to note, Alessandro Tassoni, devoted a number of pages of his works to Telesio’s meteorology (Trabucco 2019).
In the first decades of the 17th century, Telesio’s ideas entered a wider scientific context in Italy, a constellation populated by a number of scientists interested in the so-called “mathematization of the world”, such as Galileo and the network of his disciples and correspondents. The new mathematical trend of natural philosophy, however, did not eclipse Telesio’s merits or the scientific value of his work. Authors such as Latino Tancredi, Colantonio Stigliola, Marco Aurelio Severino, and Tommaso Cornelio continued to spread his thought. Especially in Southern Italy, Telesio’s name became the distinctive mark of a philosophical tradition dating back to the greatest authors of the ancient, pre-Aristotelian period, such as Pythagoras, Empedocles, Philolaus, Alcmaeon, Timaeus of Locri, and so forth.
Meanwhile, in England, Francis Bacon devoted some pages of his writings to Telesio: first in his Advancement of Learning (1605), then in De principiis atque originibus, secundum fabulas Cupidinis et Coeli (1613), and finally in his Sylva sylvarum. Bacon’s reading of Telesio’s philosophy focused mainly on the portrayal of Telesio as the restorer of Parmenides’s philosophy, freezing the Calabrian thinker in the role of an innovator who took inspiration from Eleatic monism in setting up his materialistic world-view (Rees 1977, De Mas 1989, Bondì 2001, Garber 2016). At the same time, Bacon expressed some concerns about the limits of Telesio’s theory of matter: according to Lord Verulam, Telesio’s concept of matter remains unexplained with regard to its specific function in the processes of generation and transformation of natural beings. Nevertheless, Bacon admired such authors as Telesio, Cardano, and Della Porta with respect to the notion of spiritus, the power of imagination, and the sympathy between animate and inanimate objects (Gouk 1984). It is thus fair to say that Bacon contributed to the construction of the mythical conception of Telesio as a freethinker deeply indebted to the pre-Socratic tradition, which is not to say that the myth is altogether misleading (see Giglioni 2010: 70).
Back on the European continent, some 17th-century traces of Telesio’s legacy can be found in such authors as Marin Mersenne (Quaestiones celeberrimae in Genesim, Paris, 1623); Gabriel Naudé (Apologie pour tous les grands personnages qui ont esté faussement soupçonnez de magie, Paris, 1625; Advis pour dresser une bibliotheque, Paris, 1627); Jean-Cécile Frey (Cribrum philosophorum qui Aristotelem superiore et hac aetate oppugnarunt, in Opuscula varia nusquam edita, Paris, 1646); Charles Sorel (Le sommaire des opinions les plus estranges des novateurs modernes en la philosophie comme de Telesius, de Patritius, de Cardan, de Ramus, de Campanelle, de Descartes, et autres, in De la perfection de l’homme, où les vrays biens sont considérez et spécialement ceux de l’âme, Paris, 1655; reprinted in La science universelle, vol. 4, 1668); Guy Holland (The grand prerogative of human nature, namely, the soul’s natural or native immortality, and freedom from corruption, London, 1653); and Pierre Gassendi (Syntagma philosophicum, in Opera Omnia, vol. I, Paris, 1658).
A further testimony to Telesio’s legacy in 17th-century Naples is contained in Tommaso Cornelio’s Progymnasmata physica (Venetiis, 1663): compare the Progymn. II, De initiis rerum naturalium; the Epistola de Platonica circumpulsione; and the Epistola M. Aurelij Severini nomine conscripta (repr. Venetiis, 1683, pp. 41-42, 140, 144, 146, and 190-191).
In the French context, Pierre Gassendi was one of the most important authors to give attention to the Cosentine thinker. In his writings, such novatores as Telesio and Campanella are mentioned with regard to the theories of space and time and the theory of sensory qualities, including heat and cold (Syntagma, in Opera, I, 245b), as well as in connection with Gassendi’s tripartite conception of the void: the inane separatum, the idea of an infinite void extending beyond the atmosphere; the inane disseminatum, the interparticle void between the basic corpuscles of bodies; and the inane coacervatum, the void accumulated, or “cobbled” together, by experimental means (Opera I, 185a-187a, 192a-196a, 196b-203a). Since there is no way to explain how bodies divide and separate at the level of their basic particles without supposing such a void, Gassendi found Telesio’s explanation, according to which heat and cold are the active principles of matter, evidently insufficient (for further details, see Fisher 2005, and Henry 1979).
Finally, a specific debt towards Telesio is also identifiable in Thomas Hobbes’s works. In the first chapter of Leviathan (1651), Hobbes openly rejected the doctrine of species, and in subsequent chapters he asserted a cohesive relationship between sense, imagination, and reasoning, consistent with the Telesian approach (a first trace of that influence dates back to the Elements of Law, Natural and Politic, written in 1640). What is more, the notion of “self-preservation” (conservatio sui) was reassessed in Hobbes’s anthropology. Telesio’s influence became more explicit in 1655 in Hobbes’s De corpore, sect. IV, chap. XXV. In the fifth article of that chapter, Hobbes unfolds the basic properties of sensation and cognition in the simplest structures of organized matter in motion. In the same place he offers a suggestion which allows us to place his materialism close to the Renaissance pansensism advanced by Telesio and Campanella. After explaining in general terms his physiology of sensation and animal locomotion, he stated:
I know there have been philosophers, and those learned men, who have maintained that all bodies are endued with sense. Nor do I see how they can be refuted, if the nature of sense be placed in reaction only. And, though by reaction of bodies inanimate a phantasm might be made, it would nevertheless cease, as soon as ever the object were removed. For unless those bodies had organs, as living creatures have, fit for the retaining of such motion as is made in them, their sense would be such, as that they should never remember the same (Hobbes 1656, XXV.5, p. 226. On the subject, see Schuhmann 1988: 109-133; Sergio 2008: 298-315).
5. References and Further Reading
a. Primary Sources
Telesio, Bernardino, 1565, De natura iuxta propria principia liber primus, et secundus (Romae, Antonium Bladum, 1565) – Ad Felicem Moimonam iris (Rome, Mattheus Cancer, 1566), ed. by R. Bondì, Rome, Carocci, 2011.
Telesio, Bernardino, 1570, De rerum natura iuxta propria principia, liber primus, et secundus, denuo editi – Opuscula (Neapoli, Josephum Cacchium, 1570), ed. by R. Bondì, Rome, Carocci, 2013.
Telesio, Bernardino, 1572, Delle cose naturali libri due – Opuscoli – Polemiche telesiane (Biblioteca Nazionale Centrale, Florence, Ms. Pal. 844, cc. 12r-204r; Cod. Magl. XII B 39), ed. by A. L. Puliafito, Rome, Carocci, 2013.
Telesio, Bernardino, 1586, De rerum natura iuxta propria principia, libri IX (Naples, Horatius Salvianus, 1586), ed. by G. Giglioni, Rome, Carocci, 2013.
Telesio, Bernardino, 1590, Varii de naturalibus rebus libelli ab Antonio Persio editi (Venice, F. Valgrisius, 1590), ed. by Miguel A. Granada, Rome, Carocci, 2012.
Telesio, Bernardino, 1981, Varii de naturalibus rebus libelli, ed. by L. De Franco, Florence, La Nuova Italia.
b. Secondary Sources
d’Aquino, Giovan Paolo, 1596, Oratione in morte di Berardino Telesio, philosopho eccellentissimo, Cosenza, Leonardo Angrisano.
Artese, Luciano, 1991, “Il rapporto Parmenide-Telesio dal Persio al Maranta,” Giornale Critico della Filosofia Italiana, 70: 15-34.
Artese, Luciano, 1994, “Bernardino Telesio e la cultura napoletana,” Studi Filosofici, 17: 91-110.
Artese, Luciano, 1998, “Documenti inediti e testimonianze su Francesco Patrizi e la Toscana,” Bruniana & Campanelliana, 4: 167-191.
Bacon, Francis, 1613, De principiis atque originibus, secundum fabulas Cupidinis et Coeli, in The Works of Francis Bacon, ed. by R. L. Ellis, J. Spedding, D. D. Heath, London, Longmans, 1858, vol. 5: 289-346.
Barbero, Giliola, Paolini, Adriana, 2017, Le edizioni antiche di Bernardino Telesio: censimento e storia, Paris, Les Belles Lettres.
Bianchi, Lorenzo, 1992, “Des novateurs modernes en philosophie: Telesio tra eruditi e libertini nella Francia del Seicento,” in Bernardino Telesio e la cultura napoletana, ed. by R. Sirri and M. Torrini, Naples, Guida: 373-416.
Bidwell-Steiner, Marlen, 2012, “Metabolism of the Soul. The Psychology of Bernardino Telesio in Oliva Sabuco’s Nueva filosofía de la naturaleza del hombre (1587),” in Blood, Sweat and Tears. The Changing Concepts of Physiology from Antiquity into Early Modern Europe, ed. by M. Horstmanshoff, H. King, C. Zittel, Leiden, Brill: 662-684.
Boenke, Michaela, 2005, “Psychologie im System des naturphilosophischen Monismus: Bernardino Telesio,” in Körper, Spiritus, Geist: Psychologie vor Descartes, München, Paderborn: 120-142.
Bondì, Roberto, 2018a, Il primo dei moderni. Filosofia e scienza in Bernardino Telesio, Rome, Edizioni di Storia e Letteratura.
Bondì, Roberto, 2018b, “Dangerous Ideas: Telesio, Campanella and Galileo,” in Copernicus Banned. The Entangled Matter of the anti-Copernican Decree of 1616, ed. by N. Fabbri and F. Favino, Florence, Olschki, 1-27.
De Franco, Luigi, 1995, Introduzione a Bernardino Telesio, Soveria Mannelli, Rubbettino.
De Frede, Carlo, 2001, Docenti di filosofia e medicina nella università di Napoli dal secolo XV al XVI, Naples, Lit. Editrice A. De Frede.
De Lucca, Jean-Paul, 2012, “Giano Pelusio: ammiratore di Telesio e poeta dell’«età aurea»,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore: 115-132.
De Miranda, Girolamo, 1993, “Una lettera inedita di Telesio al cardinale Flavio Orsini,” Giornale Critico della Filosofia Italiana 72: 361-375.
Ebbersmeyer, Sabrina, 2013, “Do Humans Feel Differently? Telesio on the Affective Nature of Men and Animals,” in The Animal Soul and the Human Mind. Renaissance Debates, ed. by C. Muratori, Pisa-Rome, Fabrizio Serra Editore, 97-111.
Ebbersmeyer, Sabrina, 2016, “Telesio’s Vitalistic Conception of the Passions,” in Sense, Affect and Self-Preservation in Bernardino Telesio (1509-1588), ed. by G. Giglioni and J. Kraye, Dordrecht, Springer.
Ebbersmeyer, Sabrina, 2018, “Renaissance Theories of the Passion. Embodied Minds,” in Philosophy of Mind in the late Middle Ages and Renaissance. The History of Philosophy of Mind, vol. 3, ed. by S. Schmid, London, Routledge, 185-206.
Firpo, Luigi, 1951, “Filosofia italiana e Controriforma. IV. La proibizione di Telesio,” Rivista di Filosofia, 42/1: 30-47 (see also 41, 1950: 150-173 and 390-401).
Fisher, Saul, 2005, Pierre Gassendi’s Philosophy and Science. Atomism for Empiricists, Leiden, Brill.
Fragale, Luca Irwin, 2016, “Bernardino Telesio in due inediti programmi giovanili,” in Microstoria e Araldica di Calabria Citeriore e di Cosenza. Da fonti documentarie inedite, Milan, The Writer, 11-32.
Garber, Daniel, 2016, “Telesio Among the Novatores: Telesio’s Reception in the Seventeenth Century,” in Early Modern Philosophers and the Renaissance Legacy, ed. by C. Muratori and G. Paganini, Dordrecht, Kluwer, 119-133.
Gaukroger, Stephen, 2001, Francis Bacon and the Transformation of Early-Modern Philosophy, Cambridge, Cambridge University Press.
Giglioni, Guido, 2010, “The First of the Moderns or the Last of the Ancients? Bernardino Telesio on Nature and Sentience,” Bruniana & Campanelliana 16: 69-87.
Gómez López, Susana, 2013, “Telesio y el debate sobre la naturaleza de la luz en el Renacimiento italiano,” in Bernardino Telesio y la nueva imagen de la naturaleza en el Renacimiento italiano, ed. by Miguel Á. Granada, Siruela, Biblioteca de Ensayo, 194-235.
Granada, Miguel Ángel, 2013, “Telesio y las novedades celestes: la teoría telesiana de los cometas” and “Telesio y la Vía Láctea,” in Bernardino Telesio y la nueva imagen de la naturaleza en el Renacimiento italiano, ed. by Miguel Á. Granada, Siruela, Biblioteca de Ensayo, 116-149 and 150-193.
Hatfield, Gary, 1992, “Descartes’ physiology and its relation to his psychology,” in The Cambridge Companion to Descartes, ed. by J. Cottingham, Cambridge, Cambridge University Press, 335-370.
Henry, John, 1979, “Francesco Patrizi da Cherso’s Concept of Space and Its Later Influence,” Annals of Science, 36: 549-575.
Hirai, Hiro, 2012, “Il calore cosmico di Telesio fra il De generatione animalium di Aristotele e il De carnibus di Ippocrate,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore, 71-83.
Iovine, Maria Fiammetta, 1998, “Henry Savile lettore di Bernardino Telesio. L’esemplare 537.C.6 del De rerum natura 1570,” Nouvelles de la République des Lettres 17: 51-84.
Lattis, James M., 1994, Between Copernicus and Galileo: Christoph Clavius and the Collapse of Ptolemaic Cosmology, Chicago, University of Chicago Press.
Leijenhorst, Cees, 2010, “Bernardino Telesio (1509-1588): New fundamental principles of nature,” in Philosophers of the Renaissance, ed. by P. R. Blum, Washington, The Catholic University of America Press, 168-180.
Lerner, Michel-Pierre, 1986, “Aristote ‘oublieux de lui-même’ selon Telesio,” Les Études philosophiques, 3: 371-389.
Lerner, Michel-Pierre, 1992, “Le ‘parménidisme’ de Telesio: origine et limites d’une hypothèse,” in Bernardino Telesio e la cultura napoletana, ed. by R. Sirri and M. Torrini, Naples, Guida, 79-105.
Lupi, Walter F., 2011, Alle origini della Accademia Telesiana, Cosenza, Brenner.
Mandressi, Rafael, 2009, “Preuve, expérience et témoignage dans les «sciences du corps»,” Communications 84: 103-118.
Margolin, Jean-Claude, 1990, “Bacon, lecteur critique d’Aristote et de Telesio,” in Convegno internazionale di studi su Bernardino Telesio, Cosenza, Accademia Cosentina, 135-166.
Mulsow, Martin, 1998, Frühneuzeitliche Selbsterhaltung. Telesio und die Naturphilosophie der Renaissance, Tübingen, Max Niemeyer Verlag.
Mulsow, Martin, 2002, “Reaktionärer Hermetismus vor 1600? Zum Kontext der venezianischen Debatte über die Datierung von Hermes Trismegistos,” in Das Ende des Hermetismus. Historische Kritik und neue Naturphilosophie in der Spätrenaissance. Dokumentation und Analyse der Debatte um die Datierung der hermetischen Schriften von Genebrard bis Casaubon (1567-1614), ed. by M. Mulsow, Tübingen, Max Niemeyer Verlag, 161-185.
Ottaviani, Alessandro, 2010, “Da Antonio Telesio a Marco Aurelio Severino: fra storia naturale e antiquaria,” Bruniana & Campanelliana, 16/1: 139-148.
Plastina, Sandra, 2012, “Bernardino Telesio nell’Inghilterra del Seicento,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore, 133-143.
Puliafito, Anna Laura, 2013, Introduzione a Telesio 1572, xxxiii-xlv.
Pousseur, Jean-Marie, 1990, “Bacon, a Critic of Telesio,” in Francis Bacon’s Legacy of Texts: ‘The Art of Discovery Grows with Discovery’, ed. by W. Sessions, New York, AMS Press, 105-117.
Purnell, Fredrick, Jr., 2002, “A Contribution to Renaissance Anti-Hermeticism: The Angelucci-Persio Exchange,” in Das Ende des Hermetismus. Historische Kritik und neue Naturphilosophie in der Spätrenaissance, ed. by M. Mulsow, Tübingen, Max Niemeyer Verlag, 127-160.
Rees, Graham, 1977, “Matter Theory: A Unifying Factor in Bacon’s Natural Philosophy?,” Ambix 24: 110-125.
Schuhmann, Karl, 1988, “Hobbes and Telesio,” Hobbes Studies 1: 109-133.
Schuhmann, Karl, 2004, “Telesio’s Concept of Matter,” in Selected Papers on Renaissance Philosophy and on Thomas Hobbes, ed. by P. Steenbakkers and C. Leijenhorst, Dordrecht, Kluwer, 99-116.
Sciaccaluga, Nicoletta, 1997, “Movimento e materia in Bacone: uno sviluppo telesiano,” Annali della Scuola normale superiore di Pisa, classe di lettere e filosofia, Ser. 4, 2, 329-355.
Sergio, Emilio, 2007, “Campanella e Galileo in un «English Play» del circolo di Newcastle: «Wit’s Triumvirate, or the Philosopher» (1633-1635),” Giornale Critico della Filosofia Italiana, 86, 2, 298-315.
Sergio, Emilio, 2010, “Telesio e il suo tempo. Alcune considerazioni preliminari,” Bruniana & Campanelliana, 16, 1, 111-124.
Sergio, Emilio, 2013, Bernardino Telesio: una biografia, Naples, Guida.
Sergio, Emilio, 2014, “Bernardino Telesio (1509-1588),” in Galleria dell’Accademia Cosentina – Archivio dei filosofi del Rinascimento, vol. I, ed. by E. Sergio, Rome, CNR-ILIESI, 155-218.
Simonetta, Marcello, 2015, “Due lettere inedite del giovane Bernardino Telesio,” Bruniana & Campanelliana, 21, 2, 429-435.
Siraisi, Nancy G., 2011, “Giovanni Argenterio and Medical Innovation,” in Medicine and the Italian Universities, 1250-1600, Leiden, Brill, 329-355.
Spruit, Leen, 1995, “Bernardino Telesio,” in Species intelligibilis. From Perception to Knowledge, Leiden, Brill, vol. 2, 198-203.
Spruit, Leen, 1997, “Telesio’s reform of the philosophy of mind,” Bruniana & Campanelliana, 3: 123-143.
Spruit, Leen, 1998, Telesio’s Psychology and the Northumberland Circle, Durham Thomas Harriot Seminar, Occasional paper, Durham University, History of Education Project, 1-36.
Spruit, Leen, 2018, “Bernardino Telesio on Spirit, Sense, and Imagination,” in Image, Imagination, and Cognition. Medieval and Early Modern Theory and Practice, ed. by C. Lüthy, C. Swan, P. Bakker, C. Zittel, Brill, Leiden, 94-116.
Trabucco, Oreste, 2019, “Telesian Controversies on the Winds and Meteorology,” in Bernardino Telesio and the Natural Sciences in the Renaissance, ed. by P. D. Omodeo, Leiden, Brill.
Tutrone, Fabio, 2014, “The body of the soul. Lucretian echoes in the Renaissance theories on the psychic substance and its organic repartition,” Gesnerus, 71, 2, 204-236.
Peace is notoriously difficult to define, and this poses a special challenge for articulating any comprehensive philosophy of peace. Any discussion of what might constitute a comprehensive philosophy of peace invariably overlaps with wider questions of the meaning and purpose of human existence. The definitional problem is, paradoxically, a key to understanding what is involved in articulating a philosophy of peace. In general terms, one may differentiate negative peace, that is, the relative absence of violence and war, from positive peace, that is, the presence of justice and harmonious relations. One may also refer to integrative peace, which sees peace as encompassing both social and personal dimensions.
Section 1 examines potential foundations for a philosophy of peace through what some of the world’s major religious traditions, broadly defined, have to say about peace. The logic for this is that throughout most of human history, people have viewed themselves and reality through the lens of religion. Sections 2 through 5 take an historical-philosophical approach, examining what key philosophers and thinkers have said about peace, or what possible foundations for a philosophy of peace might be ascertained from their work. Section 6 examines some contemporary sources for a philosophy of peace.
Sections 7 through 15 are more exploratory in nature. Section 7 examines a philosophy of peace education, and the overlap between this and a philosophy of peace. Sections 8 through 15 examine a range of critical issues in thinking about and articulating a philosophy of peace, including the paradoxes and contradictions which emerge in doing so. Section 16 concludes by considering how engaging in the practice of philosophy may itself be a key to understanding a philosophy of peace, and indeed a key to establishing peace itself.
1. Religious Sources for a Philosophy of Peace
It is logical that we should examine the theory of peace as set down in the teachings of some of the world’s major religious traditions, given that, for most of human history, people have viewed themselves and the world through the lens of religion. Indeed, the notion of religion as such may be viewed as a modern invention, in that throughout most of human history individuals have seen the spiritual dimension as integrated with the physical world. In discussing religion and peace, there is an obvious problem of the divergence between precept and practice, in that many of those professing religion have often been warlike and violent. Some writers, such as James Aho and René Girard, go further, and see religion at the heart of violence, through the devaluation of the present and through the notion of sacrifice. For the moment, however, we are interested in the teachings of the major world religions concerning peace.
In examining world religious traditions and peace, it is appropriate to begin with Indigenous spirituality. There are a number of ways in which such spirituality may provide grounds for a philosophy of peace, such as the notion of connectedness with the environment, the emphasis on a caring and sharing society, gratitude for creation, and the importance of peace within the individual. This is not to deny that Indigenous societies, as with all societies, may be extremely violent at times. Nor is it to deny that elements of Indigenous spirituality may be identifiable within other major world religious traditions. Yet many peace theorists look to Indigenous societies and Indigenous spirituality as a reference point for understanding peace.
Judaism enjoys prominence not merely as a world religion in its own right, and arguably the most ancient monotheistic religion in the world, but also as a predecessor faith for Christianity and Islam. Much of the contribution of Judaism towards theorizing on peace comes from the idea of an absolute deity, and the consequential need for radical ethical commitment. Within the Tanakh (Hebrew Scriptures), the Torah (Law) describes peace as an ultimate goal and a divine gift, although at times brutal warfare is authorized; the Nevi’im (Prophetic Literature) develops the notion of the messianic future era of peace, when there will be no more war, war-making or suffering; and the Ketuvim (Wisdom Literature) incorporates notions of inner peace into Judaism, such as the idea that a person can experience peace in the midst of adversity, and the notion that peace comes through experience and reflection.
Hinduism is a group of religious traditions geographically centered on the Indian sub-continent, which rely upon the sacred texts known as the Vedas, the Upanishads, and the Bhagavad Gita. There are a number of aspects of Hinduism which intersect with peace theory. Karma is a view of moral causality incorporated into Hinduism, wherein good deeds are rewarded and bad deeds punished, either in this lifetime or the next. Karma thus presents a strong motivation for moral conduct: one should act in accordance with the dharma, or moral code of the universe. A further element within Hinduism relevant to a theory of peace is the notion of the family of humankind; accordingly, there is a strong element of tolerance within Hinduism, in that the religion tolerates and indeed envelops a range of seemingly conflicting beliefs. Hinduism also regards ahimsa, strictly speaking the ethic of doing no harm towards others, and by extension compassion to all living things, as a virtue, and this virtue became central to the Gandhian philosophy of nonviolence.
Buddhism is a set of religious traditions geographically centered in Eastern and Central Asia, based upon the teachings of Siddhartha Gautama Buddha, although the absence of any specific deity has led some to question whether Buddhism ought to be considered a religion at all. The significance of Buddhism for peace lies in its elevation of ahimsa, that is, doing no harm to others, as a central ethical virtue for human conduct. It can be argued that the Buddhist ideal of the avoidance of desire is also an important peaceful attribute, given that desire of all descriptions is often cited as a cause of war and conflict, as well as a cause of the accumulation of wealth, which itself arguably runs counter to the creation of a genuinely peaceful and harmonious society.
Christianity is a set of monotheistic religious traditions, arising out of Judaism, and centered on the life and teachings of Jesus of Nazareth. The relationship of Christianity to a philosophy of peace is complex. Christianity has often emerged as a proselytizing and militaristic religion, and thus one often linked with violence. Yet there is also a countervailing undercurrent of peace within Christianity, linked to the teachings of its founder and also linked to the fact that its founder exemplified nonviolence in his own life and death. Forgiveness and reconciliation are also dominant themes in Christian teaching. Some Christian theologians have begun to reclaim the nonviolent element of Christianity, emphasizing the nonviolence in the teaching and life of Jesus.
Islam constitutes a further set of monotheistic religious traditions arising out of Judaism, stressing submission to the will of the creator, Allah, in accordance with the teachings of the Prophet Muhammed, as recorded in the sacred text of the Holy Qur’an. As with Christianity, the relationship of Islam to a philosophy of peace is complex, given that Islam also has a history of sometimes violent proselytization. Yet the word “Islam” itself is cognate with the Arabic word for peace, and Islamic teaching in the Qur’an extols forgiveness, reconciliation, and non-compulsion in matters of faith. Moreover, one of the Five Pillars of Islam, Zakat, is an important marker of social justice, emphasizing giving to the poor.
There is an established scholarly tradition that interprets communism, the theory and system of social organization based upon the writings of Karl Marx and Friedrich Engels, as a form of nontheistic religion. Communist theory promises a peaceful future through the elimination of inequality and the emergence of an ideal classless society, with a just distribution of resources, no class warfare, and no international wars, given that war in communist theory is viewed as the result of capitalist imperialism. Communism also envisages an end to what Engels described as social murder: premature deaths within a social class due to exposure to preventable yet lethal conditions.
Yet scholars such as Rudolph Rummel have suggested that communist societies have been the most violent and genocidal in human history. Idealism can be lethal. Others point to examples of peaceful communist societies. Importantly, scholars such as Noam Chomsky argue that, far from reflecting the ideals of Marx and Engels, communist societies of the twentieth century, in practice, betrayed those original ideals. Irrespective of this, the example of mass violence in communist societies suggests that a proper theory of peace must encompass not merely a goal or aspiration, but a way of life.
It is useful to enquire what commonalities we might discern in religious traditions regarding peace, and it seems fair to say that peace is usually viewed as the ultimate goal of human existence. For some religions, this is phrased in eschatological notions such as heaven or paradise, and in other religions this is phrased in terms of an ecstatic state of being. Even in communism, there is an eschatological element, through the creation of a future classless society. There is also an ethical commonality in traditions, in that peaceful existence and actions are set forth as an ethical norm, notwithstanding that there are exceptions to this.
It is in defining and understanding the exceptions that there is a degree of complexity. There is also a common conflict between universalism and particularism within religious traditions, with particularistic emphases, such as in the notion of the Chosen People, arguably embodying the potential for exclusion and violence.
2. Classical Sources for a Philosophy of Peace
The writings of Plato (428/7-348/7 B.C.E.) would not normally be thought of as a source for a philosophy of peace. Yet there are aspects of Plato’s work, based upon the teaching of Socrates, which may constitute such a source. Within his major work, the Politeia (Republic), Plato focuses on what makes for justice, an important element in any broad concept of peace. Plato, in effect, presents a peace plan based upon his ideal city-state. This ideal society is essentially static, involving three distinct classes, yet it provides for an at least internally peaceful polis or state. Plato also develops a theory of forms or ideals, and it is not too difficult to see peace as one of those forms or ideals: in contributing to the polis or state, we contribute to the development of that form or ideal. In his work Nomoi (Laws), Plato enunciates the view that the establishment of peace and friendship constitutes the highest duty of both the citizen and the legislator, and in the Symposium, Plato articulates the idea that it is love which brings peace among individuals.
The writings of Aristotle (384-322 B.C.E.) similarly do not present an obvious reference point for a philosophy of peace. Yet there may be such a reference point in his development of virtue ethics, notably in the Ethica Nicomachea (Nicomachean Ethics). Virtue ethics may legitimately be linked to a philosophy or ethics of peace. The mean of each of the virtues described by Aristotle may be viewed as a quality conducive to peace. In particular, the mean of the virtue of andreia, usually translated as courage or fortitude, may be seen as similar to the notion of assertiveness, a quality which many writers see as important within nonviolence. Aristotle also identifies justice as a virtue, and many peace theorists emphasize the inter-relationship between peace and justice. Further, some writers have specifically identified peace or peacefulness as a virtue in itself. Interestingly, Aristotle sees the telos or goal of life as eudaimonia, or human flourishing, a concept similar to the ideals set forth in writing on a culture of peace.
3. Medieval Sources for a Philosophy of Peace
Saint Augustine of Hippo (354-430 C.E.) was both a bishop and a theologian, and he is widely recognized for capably integrating classical philosophy into Christian thought. His thought is often categorized as late Roman or early medieval. One element of Augustinian thought relevant to a philosophy of peace is his adaptation of the neo-Platonic notion of privation, according to which evil can be seen as the absence of good. It is an idea which resonates with notions of positive and negative peace: negative peace can be seen as the absence of positive peace. The notion of privation also suggests that peace ought to be seen as a specific good, and that war is the absence or privation of that good.
The best-known contribution of Augustine to a philosophy of peace, however, is his major work De civitate Dei (The City of God). Within this, Augustine contrasts the temporal human city, which is marked by violent conflict, with the eternal divine city, which is marked by peace. As with many religious writers, the ideal is peace. Augustine is also noteworthy for articulating the notion of just war, wherein Christians may be morally obliged to take up arms to protect the innocent from slaughter. For Augustine, however, this concession is by way of a lament, a mark that Christians are living in a temporal and fallen world. That contrasts with the way that others have used just war theory, and in particular the work of Augustine, to justify and glorify war.
Saint Thomas Aquinas (ca. 1225-1274) is perhaps best known for his attempt to synthesize faith with reason, for his popularization of Aristotelian thought, and for his focus on the virtues. The significant contribution of Aquinas to a philosophy of peace lies in his major work, the Summa Theologica (Summary of Theology), and in particular the discussion of ethics and virtues in Part 2 of the work. At Question 29 of Part 2, Aquinas examines the nature of peace, and whether peace itself may be considered a virtue. Aquinas concludes that peace is not itself a virtue, but rather a work of charity (love). An important qualification, however, is that peace is also described as being, indirectly, a work of justice. We see here the inter-relationship of peace and justice, something taken up by contemporary peace theorists. Aquinas also refined just war theory, articulating the requirements of proper authority, just purpose, and just intent when resorting to war.
4. Renaissance Sources for a Philosophy of Peace
The Renaissance was a period of a revival of learning in Europe, and it is often identified as a period of transition from the medieval to the modern. The Renaissance is also known for the growth of humanism, a movement involving the rediscovery of classical literature, an outlook focusing on human needs and on rational means to solve social problems, and a belief that humankind can shape its own destiny. One central human problem for humanists, and indeed for many thinkers, was and is the phenomenon of war, and Renaissance humanists refused to see war as inevitable and unchangeable. This in itself is an important contribution to a philosophy of peace. Renaissance humanism was not necessarily anti-religious, and indeed most of the humanist writers of this time worked from specifically religious assumptions. It can be argued that in the 21st century we are still part of this humanist project, an important part of which is to solve the problem of war and social injustice.
Erasmus of Rotterdam (ca. 1466-1536), otherwise known as Desiderius Erasmus, is perhaps the foremost humanist writer of the Renaissance, and arguably also one of the foremost philosophers of peace. In numerous works, Erasmus advocated compromise and arbitration as alternatives to war. The connection between humanism and peace is perhaps best discernible in Erasmus’ 1524 work De libero arbitrio diatribe sive collatio (The Freedom of the Will), where Erasmus points out that if all that we do is predetermined, there is no motivation for improvement. The principle applies to the social dimension as well: if everything is predetermined, then there is little point in attempting to work for peace, and if we say that war and social injustice are inevitable, then there is little motivation to change. Further, saying that war and social injustice are inevitable serves as a self-fulfilling statement, since individuals will then tend not to do anything to challenge war and social injustice.
De libero arbitrio is also useful for pondering a philosophy of peace in that the work presents an example of the idea that peace is a means or method, and not merely a goal. Although Erasmus wrote the work in debate with Martin Luther, Erasmus avoids polemics, is reticent to make assertions, strives for moderation, and is anxious to recognize the limitations of his argument. He points out in the Epilogue that parties to disputes will often exaggerate their own arguments, and it is from the conflict of exaggerated views that violent conflict arises. This statement was prophetic, given the religious wars which engulfed Europe following the Protestant Reformation.
However, the best-known peace tract from Erasmus is perhaps the adagium Dulce bellum inexpertis (War is Sweet to Those Who Have Not Experienced It). Erasmus is quoting the Greek poet Pindar, and in this adagium he is, in effect, presenting a cultural view of war, namely that war is at least superficially attractive. The implication, although Erasmus does not develop this, is that there is an element to peace which lacks the emotive appeal of war. This is an insight which explains much of the complex relationship between war and peace. Later writers would explore this idea, advocating a vision of peace which would embrace some of the moral challenges associated with war.
Sir Thomas More (1478-1535) was another leading humanist writer of the Renaissance, and a friend and correspondent of Erasmus. In his 1516 book De optimae rei publicae statu deque nova insula utopia (On the Best Government and on the New Island Utopia), More outlines an ideal society based upon reason and equality. In Book One of Utopia, More articulates his concerns about both internal and external violence. Within Europe, and England in particular, there is senseless capital punishment, for instance in circumstances where individuals steal only to find something to eat and thus keep themselves alive. Further, there is a world-wide epidemic of war between monarchs, which debases the countries those monarchs seek to lead. Book Two of Utopia provides the solution, with a description of an agrarian, egalitarian society: where there is no private property; where the young are educated into pacifism; where war is resorted to only for defensive reasons or to liberate the oppressed from tyranny; where psychological warfare is preferred to battle; and where there are no massacres and no destruction of cities. The utopian society suggested by More reflects a broad theory of peace. One interesting ramification of More's vision is whether such a peaceful society, and indeed peace itself, is ever attainable. The common meaning of the word “utopian” connotes a state which is not attainable. Yet it seems unlikely More would have written his work if he, in common with other humanists of his era and since, did not have at least some belief that the principles he was putting forth were in some way attainable.
5. Modern Sources for a Philosophy of Peace
Thomas Hobbes (1588-1679) was both a writer and a politician, whose writing was motivated by an overarching concern with how to avoid civil war and the carnage and suffering that result from it. He had observed this first-hand in England, and he famously articulated a statist view of peace as a contrast to the anarchy and violence of nature. In his two most noted works, De Cive (The Citizen) and Leviathan, Hobbes articulates a view that human nature is essentially self-interested, and thus the natural state of humankind is one of chaos. Hobbes also sees the essence of war as not merely the action of fighting, but a disposition to fight, and this exists only because there is a dearth of an overarching law-enforcing authority. The only way to introduce a measure of peace is therefore through submission of citizens to a sovereign, or, in more contemporary terminology, the state. A Hobbesian worldview is thus often taken to be pessimistic: it holds that the natural condition of humankind is one of violence, and that this violence inevitably predominates where there is no humanizing and civilizing impact of the state. Hobbes raises the important question of whether an overarching external authority is necessary for lasting peace. If we accept that such an external authority is necessary for peace, then arguably we have the capacity to invent mechanisms to set such an external authority in place.
Baruch or Benedictus de Spinoza (1632-1677) was a Dutch philosopher, of Jewish background, who wrote extensively on a range of philosophical topics. His relevance for a philosophy of peace may be found in his advocacy of tolerance in matters of religious doctrine. It is notable also that in his Tractatus Politicus (Political Treatise), written 1675-6 and published after his death, Spinoza asserts: “For peace is not mere absence of war but is a virtue that springs from force of character”. This is a definition of peace that anticipates later expositions, especially those that see peace as a virtue, but also contemporary peace theory that differentiates positive from negative peace.
John Locke (1632-1704) is arguably one of the most influential contributors to modern philosophy. Like other philosophers of the time, Locke is important for advancing the notion of tolerance, most clearly in his 1689 Letter Concerning Toleration. The background to this was the destructive religious wars of the time, and Locke argues that such violence can be avoided through religious tolerance. Within the work of Locke one can also discern elements of the idea of the right to peace. Around 1680, Locke composed his Two Treatises of Government, and, in the second of these, at Chapter 2, Locke argues that each individual has a right not to be harmed by another person, that is, a right to life, and that it is the role of political authority to protect this right. The right to life and the right not to be harmed arguably anticipate the later notion of the right to peace.
Jean-Jacques Rousseau (1712-1778) was a Genevan philosopher, and was both a leader and a critic of the European Enlightenment. The idea of the noble savage, who lives at peace with his or her fellows and with nature, can be found in many ancient philosophers, although the noble savage is most often associated with the work of Rousseau. In his 1750 Discours sur les sciences et les arts (Discourse on the Sciences and the Arts), Rousseau posited that human morality had been corrupted by culture; in his 1755 Discours sur l’origine et les fondements de l’inégalité parmi les hommes (Discourse on the Origin of Inequality Among Men), he posits that social and economic developments, especially private property, had corrupted humanity; in his 1762 work Du contrat social (The Social Contract), he posits that authority ultimately rests with the people and not the monarch; and in his 1770 Les Confessions (Confessions), Rousseau extols the peace which comes from being at one with nature. Rousseau anticipates common themes in much peace theory, and especially the counter-cultural and alternative peace movements of the 1960s and 1970s, namely that peace involves a conscious rejection of a corrupting and violent society, a return to a more naturalistic and peaceful existence, and a respect for and affinity with nature. In short, Rousseau suggests that the way to peace is through a more peaceful society, rather than through systems of peace.
Immanuel Kant (1724-1804) is often seen as the modern philosopher who, in his universal ethics and cosmopolitan outlook, provided what many argue is the most extensive basis for a philosophy of peace. The starting point for the ethics of Kant is duty, and, in particular, the duty to act only in ways which could reasonably be willed as universal, what Kant called the categorical imperative. Kant introduced this notion in his 1785 work Grundlegung zur Metaphysik der Sitten (Foundation of the Metaphysics of Morals), and developed it in his 1788 Kritik der praktischen Vernunft (Critique of Practical Reason). It has been argued by many, including Kant himself, that we have a duty to peace and a duty to act in a peaceful manner, in that we can only universalize ethics if we consider others, and this at the very least implies a commitment to peace.
A second important Kantian notion is that of das Reich der Zwecke, often translated as the realm or kingdom of ends. In Grundlegung zur Metaphysik der Sitten (Foundation of the Metaphysics of Morals), Kant suggests an ethical system wherein persons are ends-in-themselves, and each person is a moral legislator. It is a notion which has important implications for peace, in that the notion implies that each person has an obligation to regard others as ends-in-themselves and thus not engage in violence towards others. In other words, the notion implies that each person has a responsibility to act in a peaceful manner. If all persons acted in this way, it would also mean that the phenomenon of war, wherein moral responsibility is surrendered to the state, would become impossible.
Finally, Kant’s 1795 essay Zum ewigen Frieden (On Perpetual Peace) is the work most often cited in discussing Kant and peace, and this work puts forward what some call the Kantian peace theory. Significantly, in this work Kant suggests more explicitly than elsewhere that there is a moral obligation to peace. For instance, Kant argues in the Second Definitive Article of the work that we have an “immediate duty” to peace. Accordingly, there is also a duty for nation-states to co-operate for peace, and indeed Kant suggests a range of ways that this can be achieved, including republicanism and a league of nations. Importantly, Kant also suggests that the public dimension of actions, which can be understood as transparency, is important for international peace.
The work of Georg Wilhelm Friedrich Hegel (1770-1831) is contentious from the perspective of a philosophy of peace, as he holds what might be called a statist view of morality. Hegel sees human history as a struggle of opposites, from which new entities arise. Hegel sees the state, and by this he means the nation-state, as the highest evolution of human society. Critics, such as John Dewey and Karl Popper, have seen in Hegel a philosophical rationalization of the authoritarian and even totalitarian state. Yet the reliance on the state as an object of stability and peace does not necessarily mean acceptance of bellicose national policies. Further, just as human organization is evolving, one could equally argue that evolution towards a supra-national state with the object of world peace may also be consistent with the organic philosophy of Hegel. It is possible to view Hegel as a source for a philosophy of peace.
6. Contemporary Sources for a Philosophy of Peace
William James (1842-1910) was a noted American pragmatist philosopher, and his 1906 essay ‘The Moral Equivalent of War’, originally an oration, was produced at a time when many who had experienced the destruction and loss of life of the American Civil War were still alive. James provides an interesting potential source for a pragmatist philosophy of peace. James argues that it is natural that humans should pursue war, as the exigencies of war provide a unique moral challenge and a unique motivating force for human endeavor. By implication, there is little value in moralizing about war, and moralizing about the need for peace. Rather, what is needed is a challenge which will be seen as an equivalent or counterpoint to war – in other words a moral equivalent of war. The approach of James is consistent with the notion of positive peace, in that peace is seen to be something which embodies, or should embody, cultural challenges.
Mohandas Karamchand Gandhi (1869-1948) is widely regarded as the leading philosopher of nonviolence and intrapersonal peace. Through his life and teaching, Gandhi continually emphasized the importance of nonviolence, based upon the inner commitment of the individual to truth. Thus, Gandhi describes the struggle for nonviolence as truth-force, or satyagraha. Peace is not so much an entity or commodity to be obtained, nor even a set of actions or state of affairs, but a way of life. In Gandhism, peaceful means become united with and indistinguishable from peaceful ends, and thus the call for peace by peaceful means. The thought of Gandhi has been influential in the development of the intrapersonal notion of peace, the idea that peace consists not so much in a set of conditions between those in power as in the inner state of a person. Gandhi is also noteworthy in that he linked nonviolence with economic self-reliance.
The philosopher Martin Buber (1878-1965) is well known for emphasizing the importance of authentic dialogue, which comes about when individuals recognize others as persons rather than entities. In his influential 1923 book Ich und Du (I and Thou), Buber suggests that we only exist in relationship, and those relationships are necessarily of two types: personal relationships involving trust and reciprocity, which Buber characterized as Ich-Du, or I-Thou relationships; and instrumental relationships, involving things, which Buber characterized as Ich-Es, or I-It relationships. The book was commenced during the carnage of World War One, and it is not too difficult to see the book as a philosophical reflection on the true nature of peace, in that peace involves dialogue with the other, with war constituting the absence of such dialogue.
There are commonalities between the philosophy of Buber and the ethics of care. Both indicate that we need to see the other as an individual and as a person, that is, we need to see the face of the other. If we recognize the other as human, and engage with them in dialogue, then we are less likely to engage in violence against others, and are more likely to seek for social justice for others. It is also noteworthy that Buber emphasized that all authentic life involves encounter. Thus, if we are not engaging in dialogue with others, then we ourselves do not have peace, at least not in the positive and full construction of the concept.
Martin Luther King Jr. (1929-1968) is perhaps best known as a civil rights campaigner, although he also wrote and spoke extensively on peace and nonviolence. These ideals were also exemplified in his life. One could argue that King did not develop any new philosophy as such, but rather expressed ideas of peace and nonviolence in a uniquely powerful way. Some of the key themes articulated by King were the importance of loving one’s enemies, the duty of nonconformity, universal altruism, inner transformation, the power of assertiveness, the interrelatedness of all reality, the counterproductive nature of hate, the insanity of war, the moral urgency of the now, the necessity of nonviolence in seeking for peace, the importance of a holistic approach to social change, and the notion of evil, especially as evidenced in racism, extreme materialism and militarism.
Gene Sharp (1928-2018) was also an important theorist of nonviolence and nonviolent action, and his work has been widely used by nonviolent activists. Central to his thought are his insights into the power of the state, notably that this power is contingent upon compliance by the subjects of a state. This compliance works through state institutions and through culture. From this, Sharp developed a program of nonviolent action, which works through subverting state power. Critics of Sharp argue that he was in effect a supporter of an American-led world order, especially as his program of nonviolent struggle was generally applied to countries not complying with US geostrategic priorities or with corporate interests.
Johan Galtung (1930-) is widely recognized as the leading contemporary theorist on peace, and he is often described as the founder of contemporary peace theory. Galtung has approached the challenge of categorizing peace through describing violence, and specifically through differentiating direct violence from indirect or structural violence. From this distinction, Galtung has developed an integrated typology of peace, comprising: direct peace, where persons or groups are engaged in no or minimal direct violence against another person or group; structural peace, involving just and equitable relationships in and between societies; and cultural peace, where there is a shared commitment to mutual support and encouragement. More recently, a further dimension has been developed, namely, environmental peace, that is, the state of being in harmony with the environment.
The notions of positive and negative peace derive largely from the work of Galtung. Direct peace may be seen as similar to negative peace, in that this involves the absence of direct violence. Structural and cultural peace are similar notions to positive peace, in that these notions invite reflection on wider ideas of what we look for in a peaceful society and in peaceful interactions between individuals and groups. Similarly, an integrated notion of peace, involving personal and social dimensions of peace, derives substantially from Galtung, in that Galtung sees the notions of peace and war as involving more than an absence of violence between nation-states, which is what people often think of when we speak of a time of peace or a time of war.
The value of the various Galtungian paradigms is that these encourage thinking about the complex nature of peace and violence. Yet a problem with the Galtungian approach is that it can be argued as being too all-encompassing, and thus too diffuse. Peace researcher Kenneth Boulding summed up this problem by suggesting, famously, that the notion of structural violence, as developed by Galtung, is, in effect, anything that Galtung did not like. By implication, Galtung’s notion of peace too can be argued to be too general and too diffuse. Interestingly, Galtung has suggested that defining peace is a never-ending task, and indeed articulating a philosophy of peace might similarly be regarded as a never-ending exercise.
7. The Philosophy of Peace Education
In investigating a philosophy of peace, it is useful to examine writing on what might reasonably constitute a philosophy of peace education. The reason is that when defining peace education, we are in effect defining peace, as the encouragement and attainment of peace is the ultimate goal of peace education. Just as peace is increasingly seen as a human right, so too peace education may be thought of as a human right. Thus any philosophy of peace education is very closely linked with what might be seen as a philosophy of peace. For convenience, we can divide approaches to a philosophy of peace education into the deontological and non-deontological.
James Calleja has argued that the philosophical basis for peace education may be found in deontological ethics, that is, that we have a duty to peace and a duty to teach peace. Calleja relies strongly on the work of Immanuel Kant in developing this argument, and, in particular, on the Kantian notion of the categorical imperative, and on the subsequent categorical imperative of peace. The first formulation of the categorical imperative from Kant is that one should act in accordance with a maxim that can be universalized, that is, one should wish for others what one wishes for oneself. In effect, this can be seen as a philosophical basis for nonviolence and for universal justice, in that as we would wish for security and justice for ourselves, so too we ought to desire these for others.
James Page has developed an alternative philosophical approach to peace education, identifying virtue ethics, consequentialist ethics, conservative political ethics, aesthetic ethics and care ethics as potential bases for peace education. Equally, however, each of the above may also be argued as providing an ethical and philosophical basis for a general theory of peace. For instance, peace may be seen as a settled disposition on the part of the individual, that is, a virtue; peace may be seen as the avoidance of the destruction of war and social inequality; peace may be seen as the presence of just and stable social structures, that is, a social phenomenon; peace may be seen as love for the world and the future, that is, an aesthetic disposition; and peace may be seen as caring for individuals, that is, moral action.
8. The Notion of a Culture of Peace
The realization that peace is more than the absence of conflict lies at the heart of the emergence of the notion of a culture of peace, a notion which has been gaining greater attention within peace research in the late twentieth and early twenty-first centuries. The notion was implicit within the UNESCO mandate, with the acknowledgment that since wars begin in human minds, it follows that the defense against war needs to be established in the minds of individuals. An extensive expression of this notion was set forth in the United Nations General Assembly resolution 53/243, the Declaration and Programme of Action on a Culture of Peace, adopted unanimously on 13 September 1999, which describes a culture of peace as a set of values, attitudes, traditions and modes of behavior and ways of life. Article 1 of the document indicates that these are based upon a respect for life, ending of violence and promotion and practice of nonviolence through education, dialogue and cooperation.
Any attempt at a philosophy of a culture of peace is complex. One of the challenges is that conflict is a necessary part of human experience and an important element in the emergence of culture. Even if we differentiate violent conflict from mere social conflict, this does not solve the problem entirely, as human culture has still been very much dependent upon the phenomenon of war. A more thorough solution is to admit that war and violence are indeed important factors in human experience and in the formation of human culture, and, rather than denying this, to attempt to seek and foster alternatives to war as a crucial motivating cultural factor for human endeavor, such as William James suggested in his famous essay on a moral equivalent of war.
9. The Right to Peace
Another emerging theme in peace theory has been the notion of peace as a human right. There is some logic to this notion. The modern human rights movement arose very much out of the chaos of global war and the emerging consensus that the recognition of human rights was the best way to establish and maintain peace. The right to peace may arguably be found in Article 3 of the Universal Declaration of Human Rights, which posits the right to life, sometimes called the supreme right. The right to peace arguably flows from the right to life. This right to peace has been further codified in United Nations General Assembly resolution 33/73, the Declaration on the Preparation of Societies for Life in Peace, adopted on 15 December 1978; in United Nations General Assembly resolution 39/11, the Declaration of the Right of the Peoples of the World to Peace, adopted on 12 November 1984; and most recently in United Nations General Assembly resolution 71/189, the Declaration on the Right to Peace, adopted on 19 December 2016.
In a lecture to the International Institute of Human Rights in 1970, Karel Vasak famously suggested categorizing human rights in terms of the motto of the French Revolution, namely, “liberté, égalité, fraternité.” Following this analysis, first generation rights are concerned with freedoms, second generation rights with equality, and third generation rights with solidarity. The right to peace is often characterized as a solidarity or third generation right. Yet one can take a wider interpretation of peace, for instance, that peace implies the right to development and the enjoyment of individual human rights. In this light, peace can be seen as an overarching human right. There does indeed seem to have been such an evolution in thinking about the human right to peace, in that it is gradually being interpreted to include other rights, such as the right to development.
In examining the philosophical foundations for a human right to peace it is useful to examine some of the philosophical bases for human rights generally, namely, interest theory, will theory, and pragmatic theory. Interest theory suggests that the function of human rights is to promote and protect fundamental human interests, and that securing these interests is what justifies human rights. What are fundamental human interests? Security is generally identified as being a basic human interest. For instance, John Finnis refers to “life and its capacity for development” as a fundamental human interest, noting that “A first basic value, corresponding to the drive for self-preservation, is the value of life” (1980, p. 86). The best chance for self-preservation is that there be a norm of non-harm, which is an important element within a culture of peace. The right to peace therefore serves this basic need for life, both in the sense of protection from violence and in serving the interests of a good life.
Will theory focuses on the capacity of individuals for freedom of action and the related notion of personal autonomy. For instance, those such as Herbert Hart have argued that all rights stem from the equal right of all individuals to be free. Any right to personal freedom, however, contains an inherent limitation, in that one cannot logically exercise one’s own freedom to impinge upon another person’s freedom. This is captured in the adage that my right to swing my fist ends at another person’s nose. Why is that adage correct? One answer is that within will theory there is an implicit endorsement of a right to peace, that is, a right not to be harmed by others.
The pragmatic theory of human rights posits that such rights simply constitute a practical way in which we can arrive at a peaceful society. For instance, John Rawls suggests that the law of peoples, as opposed to the law of states, is a set of ideals and principles by which peoples from different backgrounds can agree on how their actions towards each other should be governed and judged, and through which they can establish the conditions of peace. This is not to deny the point made by critics that human rights can function as a rationale for the powerful to engage in collective violence, and that there can be a tension between human rights and national sovereignty. Thus, paradoxically, national sovereignty can sometimes serve to promote and provide peace, and human rights can sometimes be used to underwrite violence.
The importance of the human right to peace is perhaps best summed up by William Peterfi, who has described peace as a corollary to all human rights, such that “without the human right to peace no other human right can be securely guaranteed to any individual in any country no matter the ideological system under which the individual may live” (1979, p.23). The notion of the human right to peace also changes the nature of discourse about peace, from something to which individuals and groups might aspire, to something which individuals and groups can reasonably demand. The notion of the human right to peace also changes the nature of the responsibility of those in positions of power, from a vague aspiration that those in power need to provide for peace, to the expectation and duty that those in power will provide peace.
10. The Problem of Absolute Peace
Given the challenges of defining peace, the philosophical problem of peace may be phrased in terms of a question: is there any such thing as absolute peace? Or ought we be satisfied with an imperfect peace? For instance, can there ever be a complete elimination of all forms of armed conflict, or at least the elimination of reliance on armed force as the ultimate means of enforcement of will? Similarly, one may ask: is there any such thing as absolute co-operation and harmony between individuals and groups, an absolute sense of well-being within individuals, and an absolute oneness with the external environment?
The philosophical solution to this problem may be to point out that there is always an open-ended dimension to peace, that is, if we take a broad interpretation of peace, we will always be moving towards such a goal. Some might articulate this as the eschatological dimension of peace, suggesting that the contradictions which are raised in any discussion on peace can only be resolved, ultimately, at the end of time. It is relevant to note, however, that peace theorists have pointed out that if we assert that a certain outcome, such as peace, is not attainable, our actions will serve to make this a self-fulfilling prophecy. In other words, if we assert that peace, relative or absolute, is not attainable, then there will be a reduced expectation of this, and a reduced commitment to making this happen.
11. Peace and the Nature of Truth
It is worthwhile looking at the relationship of the theory of peace to the theory of truth. The relationship can be seen to operate at a number of levels. For instance, Mohandas Gandhi described his theory of nonviolence as satyagraha, often translated as truth force. Similarly, Gandhi entitled his autobiography ‘The Story of My Experiments with Truth’. Gandhi saw nonviolence, or ahimsa, as the noblest expression of truth, or sat, and argued there is no way to find truth except through nonviolence. For Gandhi, peace was not merely an ideal, rather it was based on what he saw as the truth of the innate nonviolence of individuals, which the institutions of war and imperialism distorted. Further, peace involves authenticity, a notion related to truth, in that the person involved in advocating peace ought to themselves be peaceful. We thus arrive at the Gandhian dictum that there is no way to peace as such, rather peace is the way, that is, peace is an internal life-style commitment on the part of the individual.
Conversely, war arguably operates as a form of untruth. This was summed up succinctly by Erasmus, in his dictum that war is sweet to those who have not experienced it. In 1985, Elaine Scarry wrote that the mythology of war obscures what war is actually about, namely, the body in pain. Similarly, Morgan Scott Peck has written about a lack of truthfulness, especially in war, as being the essence of evil. Typically, those advocating war will concede that the recourse to war is not a good option, but suggest that there is no other option, or that war is the least bad option. The empirical history of nonviolence suggests that this is not the case, and that there are almost always alternatives to violence.
If peace is about establishing societies with harmonious and cooperative relationships, then a key component in establishing such societies is arguably knowledge about ourselves, or accepting the truth about ourselves. Without this, it is unlikely that we will be able to establish peaceful societies, as we will not have resolved the inclinations to violence within ourselves. The notion of what constitutes the true self, or the truth about one’s self, is a complex one. Carl Gustav Jung usefully wrote about the shadow or the normally unrecognized side of one’s character. The extent to which the shadow side of our personality can result in participation in and support for violence can be shocking to us. This is not to say that human nature is irretrievably attracted to violence or cruelty. For instance, the Seville Statement on Violence, sponsored by UNESCO, argues that war is a human invention. Yet there is a strong argument that peace involves recognition of the potential within one’s self for violence. Put another way, peace involves peace with one’s self.
12. Peace as Eros
In the work of Sigmund Freud, and especially in his 1930 work Das Unbehagen in der Kultur (Civilization and its Discontents), Eros is the life instinct, which includes sexual instincts and the will to live and survive. The nominal opposite of the life instinct is the death instinct, the will to death, which later theorists described as Thanatos. Freud developed his theory of competing drives in his therapeutic dealings with soldiers from World War One, many of whom were suffering from psychological trauma as a result of their war experiences. It is not too difficult to see Eros as a synonym for peace, in that peace involves all that Eros represents. The psychoanalyst and peace activist Erich Fromm developed this theme further, writing of biophilia, the love of life, from which all peace comes, and necrophilia, the love of death and destruction, which is the basis of war.
Even if we acknowledge a link between the death instinct and war, the relationship between the life instinct and the death instinct is not simple. Freud wrote of the basic desire for death seemingly competing with the desire for life. Yet the two instincts may also be viewed as complementary. It is because we are all aware, at least subconsciously, of our impending mortality that we are driven to risk death, especially in the enterprise we call war. Many writers have explored this complexity. For instance, the psychiatrist Elisabeth Kübler-Ross writes: “Is war perhaps nothing else but a need to face death, to conquer and master it, to come out of it alive—a peculiar form of denial of our own mortality?” (2014, p.13).
If we think of Eros as peace, then a logical extension is to think of human sexuality and the expression of human sexuality as one embodiment of peace. The post-Freudians Herbert Marcuse and Wilhelm Reich both developed this theme, arguing that the origins of war and unjust social organization rested in repressed sexual desire, and that conversely peace implies sexual freedom. This idea was neatly summed up in the 1960s radical slogan, “Make love not war”. An important qualification to the peace-as-sexuality theory is that this always involves consensual sexual relationships. Many writers have identified rape and other exploitative sexual relationships as important components of war and social injustice.
13. Peace, Empire and the State
In considering a philosophy of peace, the phenomenon of empire presents a paradox for peace theory. The establishment of an empire may be seen as establishing a form of peace. It is thus common to refer to the Pax Romana as the form of peace established by virtue of the Roman Empire, and to the Pax Britannica, Pax Sovietica, and Pax Americana in referring to later periods of empire. Within empires, it can be argued, there is no war, at least not in the conventional sense. Critics of imperialism, however, point out that violence is moved to the periphery of the empire; that there is the problem of inter-imperial rivalry; and that empires frequently engage in the violent suppression of minorities within their borders.
Similarly, the phenomenon of the state presents a paradox for peace theory. The establishment of a stable state generally means that citizens can live and work free from violence, and ideally, at least in democratic states, within a framework of social justice. Yet, as sociologist Max Weber famously pointed out, it is in the very nature of the state that it claims a monopoly over the legitimate use of violence. The legitimate use of violence finds its ultimate expression in the phenomenon of war. Thus, anarcho-pacifists argue if one wants to eliminate war, then one needs to eliminate the state, at least in its current nation-state form.
14. An Existentialist Philosophy of Peace
Existentialism may be defined in philosophical terms as the view that truth cannot be objectified, but rather it can only be experienced. This is not to deny the objective reality of an entity, but rather to say that the limitations of language are such that this cannot be objectified. We can apply this to a philosophical analysis of peace, and suggest that ultimately peace cannot be objectified, but rather it can be experienced. Thus, attempts to specify what peace is are likely to be problematic. Rather we can represent peace by way of illustration, to say that peace involves a set of behaviors and attitudes, and we can represent peace by way of negation, to say that peace is not deliberate violence to other persons. Or we can say, in true existentialist fashion, that we can only know peace through encounter or relationship.
Another way of articulating the idea of existentialist peace is by referring to the metaphysics of peace. The existentialist theologian John Macquarrie writes: “By a metaphysical concept, I mean one the boundaries of which cannot be precisely determined, not because we lack information but because the concept itself turns out to have such depth and inexhaustibility that the more we explore it, the more we see that something further remains to be explored” (1973, p.63), and further: “If peace … is fundamentally wholeness, and if metaphysics seeks to maximize our perception of wholeness and inter-relatedness, then peace and metaphysics may be more closely linked than is sometimes supposed; while, conversely, the fragmented understanding of life may well be connected with the actual fracturing of life itself, a fracturing which is the opposite of peace. But the true metaphysical dimensions of peace emerge because even to seek a wholeness for human life drives us to ask questions which take us to the very boundaries of understanding. What is finally of value? What is real and what is illusory? What conditions would one need to postulate as making possible the realization of true peace?” (1973, p.64).
15. Decolonizing Peace
Postcolonial theory posits, in general terms, that not only has global colonial history determined the shape of the world as we know it today, but the power relationships implicit in colonialism have determined contemporary thinking. Thus, the powerless tend to be marginalized in contemporary thinking. Some writers, such as Victoria Fontan, have suggested there is a need to decolonize peace theory, including taking into account the everyday experience of ordinary people, transcending liberal peace theory which tends to assume the legitimacy of power, and transcending the view that the Global North needs to come to the rescue of the Global South. Thus the discourse on peace, so it is argued, needs to be less Eurocentric. The argument is that the narrative of peace needs to change.
Postcolonial peace theory intersects with much feminist peace theory, represented by writers such as Elise Boulding, Cynthia Enloe, Nel Noddings, and Betty Reardon. The suggestion is often made by such theorists that a feminine or maternal perspective is uniquely personal, caring, and peace-oriented. The corollary to this is that a male perspective tends to be less personal, less caring, and more war-centric. Feminist peace theorists have also pointed out that war and militarism work on patriarchal assumptions, such as that women need protecting and it is the duty of men to protect women, and that there is no alternative to the current system of security through power and domination. The argument is also made that war and patriarchy are part of the same system.
Postcolonial and feminist peace theory are highly contested. For instance, it can be argued that, as current philosophical discourse has evolved from European origins, articulating peace in terms of concepts articulated by European authors is merely a matter of utilizing this global language. Similarly, one can argue that, since it is a historical reality that most influential philosophers in history have hitherto been male, the existing narrative will naturally tend to have more male sources and male voices. One can arguably apply a quota system to some areas, such as contemporary politics, but it is more difficult to argue that a quota system ought to be applied to narrative and to discourse. Critics also allege that postcolonial peace theory tends to avoid universalist statements on human rights, an omission which is itself significant, given the key role of human rights in peace, and given the emerging human right to peace itself.
16. Concluding Comments: Philosophy and Peace
One interesting way to address the issue of a philosophy of peace is to think of war as representing the absence of philosophy, in that war is prosecuted on the assumption that one person or group itself possesses truth, and that the views of that individual or group ought to be imposed, if necessary, by violent force. War may also be seen as the absence of philosophy in that war represents an absence of the love of wisdom. This is not to deny there are philosophies and philosophers who justify war and injustice. Ultimately, however, these philosophies are not sustainable, as war is an institution which involves destruction of both the self and societies. Similarly, social injustice is not sustainable, as within social injustice we find the seeds of war and destruction.
Conversely, it can be argued that philosophy itself represents the presence of peace, in that philosophy generally does not or should not involve assumptions that one person or group by itself uniquely possesses truth, but rather the way to truth is through a process of questioning, sometimes called dialectic. Therefore, philosophy by its essence is or should be a tolerant enterprise, and it is also an enterprise which involves or should involve debate and discussion. Philosophy thus presents a template for a peaceful society, wherein differing viewpoints are considered and explored, and which, through the love of wisdom, encourages thinking and exploring about positive and life-enhancing futures. This means that engaging in philosophy may well be a useful start to a peaceful future.
17. References and Further Reading
Aho, J. (1981) Religious Mythology and the Art of War. Westport: Greenwood.
Aquinas (1964-1981) Summa Theologiae: Latin Text and English Translation. (T. Gilby and others, Eds.) Cambridge: Blackfriars, and New York: McGraw-Hill.
Aristotle (1984) The Complete Works of Aristotle. The Revised Oxford Translation. (J.Barnes, Ed.) Princeton: Princeton University Press.
Aron, R. (1966) Peace and War: A Theory of International Relations (R. Howard and A.B. Fox, Transl.) London: Weidenfeld and Nicolson.
Augustine (1972) Concerning the City of God against the Pagans. (H. Bettenson, Transl.) Harmondsworth: Penguin.
Boulding, E. (1988) Building a Global Civic Culture: Education for an Interdependent World. San Francisco: Jossey-Bass.
Boulding, E. (2000) Cultures of Peace: The Hidden Side of History. Syracuse: Syracuse University Press.
Boulding, K. (1977) Twelve friendly quarrels with Johan Galtung. Journal of Peace Research. 14(1): 75-86.
Buber, M. (1984) I and Thou. (R. Gregor-Smith, Transl.) New York: Scribner.
Calleja, J.J. (1991) A Kantian Epistemology of Education and Peace: An Evaluation of Concepts and Values. PhD Thesis. Bradford: Department of Peace Studies, University of Bradford.
Chomsky, N. (2002) Understanding Power: The Indispensable Chomsky. (P.R. Mitchell and J.Schoeffel, Eds.). New York: The New Press.
Ehrenreich, B. (1999) Men Hate War, Too. Foreign Affairs 78 (1): 118–22.
Enloe, C. (2007) Globalization and Militarism: Feminists Make the Link. Lanham: Rowman and Littlefield.
Erasmus, D. (1974) Collected Works of Erasmus. Toronto: University of Toronto Press.
Finnis, J. (1980) Natural Law and Natural Rights. Oxford: Clarendon Press; New York: Oxford University Press.
Fontan, V.C. (2012) Decolonizing Peace. Lake Oswego: Dignity Press.
Galtung, J. (1996) Peace by Peaceful Means. London: SAGE Publications.
Galtung, J. (2010) Peace, Negative and Positive. In: N.J. Young (Ed.) The Oxford Encyclopedia of Peace. (pp. 352-356). Oxford and New York: Oxford University Press.
Gandhi, M.K. (1966) An Autobiography: The Story of my Experiments with Truth. London: Jonathan Cape.
Girard, R. (1977) Violence and the Sacred. (P. Gregory, Transl.) Baltimore: Johns Hopkins University Press.
Hobbes, T. (1998) On the Citizen (R.Tuck and M.Silverthorne, Eds.) Cambridge: Cambridge University Press.
Hobbes, T. (1994) Leviathan (E. Curley, Ed.) Indianapolis: Hackett.
Kant, I. (1992-) The Cambridge Edition of the Works of Immanuel Kant. (P. Guyer and A. Wood, Eds.) Cambridge: Cambridge University Press.
King, M.L. (1963) Strength to Love. Glasgow: Collins.
Kübler-Ross, E. (2014) On Death and Dying. New York: Scribner.
Locke, J. (1988) Two Treatises of Government. (P. Laslett, Ed.) Cambridge: Cambridge University Press.
Locke, J. (2010) A Letter Concerning Toleration and Other Writings. (M. Goldie, Ed.) Indianapolis: Liberty Fund.
Macquarrie, J. (1973) The Concept of Peace. London: SCM.
More, T. (1999) Utopia. (D. Wootton, Ed.) Indianapolis: Hackett Publishing.
Noddings, N. (1984) Caring: A Feminine Approach to Ethics and Moral Education. Berkeley: University of California Press.
Page, J.S. (2008) Peace Education: Exploring Ethical and Philosophical Foundations. Charlotte: Information Age Publishing.
Page, J.S. (2010) Peace Education. In: E. Baker, B. McGaw, and P. Peterson (Eds.) International Encyclopedia of Education. (Volume 1, pp. 850–854). Oxford: Elsevier.
Page, J.S. (2014) Peace Education. In: D. Phillips (Ed.) Encyclopedia of Educational Theory and Philosophy. (Volume 2, pp. 596-598). Thousand Oaks: Sage Publications.
Peck, M.S. (1983) People of the Lie. New York: Simon and Schuster.
Peterfi, W. (1979) The Missing Human Right: The Right to Peace. Peace Research, 11(1): 19-25.
Rawls, J. (1999) The Law of Peoples. Cambridge: Harvard University Press.
Reardon, B. (1993) Women and Peace: Feminist Visions of Global Security. Albany: State University of New York Press.
Roche, D. (2003) The Human Right to Peace. Toronto: Novalis.
Rousseau, J. (1990-2010) Collected Writings. (R. Masters and C. Kelly, Eds.) 13 volumes. Hanover: University Press of New England.
Rummel, R. (1994) Death by Government. New Brunswick: Transaction Press.
Scarry, E. (1985) The Body in Pain. New York and London: Oxford University Press.
Spinoza, B. (2002) Baruch Spinoza: The Complete Works. (M.L. Morgan, Ed., S. Shirley, Transl.) Indianapolis: Hackett.
Watson, P.S. and Rupp, E.G. (Eds.) (1969) Luther and Erasmus: Free Will and Salvation. London: SCM Press.
Author Information
James Page
Email: jpage8@une.edu.au
University of New England
Australia
David Lewis (1941–2001)
David Lewis was an American philosopher and one of the last generalists, in the sense that he contributed to the great majority of sub-fields of the discipline. He made central contributions in metaphysics, the philosophy of language, and the philosophy of mind. He also made important contributions in probabilistic and practical reasoning, epistemology, the philosophy of mathematics, logic, the philosophy of religion, and ethics, including metaethics and applied ethics. He published four monographs and over one hundred articles.
Lewis’s contributions in metaphysics include foundational work in the metaphysics of modality, in particular his peculiar view of concrete modal realism. He also developed influential views about properties, dispositions, time, persistence, and causation. In the philosophy of language, he made important contributions to our understanding of conditionals—counterfactuals in particular. He also developed an influential account of what it is for a group of individuals to use a language, based on his similarly influential account of what it is for a group of individuals to adopt a convention. In the philosophy of mind, Lewis gave an important defense of mind-brain identity theory, and also developed an account of mental content that was based on his metaphysics of properties and modality.
This article discusses in detail only Lewis’s best-known and most influential views and arguments in metaphysics, the philosophy of language, and the philosophy of mind. His views on metaphysics are discussed first, but his views on language and mind are no less influential. The focus is on representative examples of his most important views and arguments concerning particular issues. The article begins with a few short remarks about his biography, and it ends with a discussion of some of his other philosophical contributions.
David Kellogg Lewis was born in 1941 in Oberlin, Ohio. He did his undergraduate studies at Swarthmore College in Pennsylvania. He studied abroad for a year in Oxford, where he was tutored by Iris Murdoch, and where he had the opportunity to attend lectures by J. L. Austin. These experiences inspired him to major in philosophy when he returned to Swarthmore. He did his Ph.D. at Harvard, studying under W. V. O. Quine, who supervised his dissertation, which was the basis of his first book, Convention (1969). There he met his wife Stephanie, with whom he ultimately co-authored three papers. He worked at UCLA from 1966 to 1970, moving from there to Princeton, where he remained until his death in 2001. He spent a lot of time visiting and working in Australia from 1971 onward. As a result, his work was deeply influenced by a number of Australian philosophers, and, in turn, his work has made an indelible mark on analytic philosophy in Australia.
2. Modality
If you are looking for what Lewis had to say about modality, you most likely want to learn about his well-known but rather idiosyncratic view, concrete modal realism. The study of modality is the study of the meanings of expressions like ‘necessarily’ and ‘possibly’. One can assert that Socrates was a blacksmith, which is, of course, false. But one can also assert something weaker, that, possibly, Socrates was a blacksmith (that is, Socrates could have been a blacksmith). Or one can assert something stronger, that, necessarily, he was a blacksmith (that is, he could not have failed to be a blacksmith). There are different senses of the words ‘necessarily’ and ‘possibly’. One is related to what someone knows. Perhaps you are unsure whether Socrates was a philosopher. You might say that Socrates could have been a philosopher, meaning that, for all you know, Socrates was a philosopher (that is, nothing you know contradicts it). Or perhaps you are certain that he was a philosopher, in which case you might simply say that Socrates was a philosopher. Or you might say something stronger—that Socrates must have been a philosopher (that is, what you know contradicts his not having been a philosopher). This sort of modality is epistemic modality. The sort of modality Lewis was most concerned with in his development of concrete modal realism is alethic modality, and concerns how things might have been, or how things must be, regardless of what anyone thinks or knows about it.
One of the central questions in the study of (alethic) modality is what ‘necessarily’ and ‘possibly’ mean. Most discussions of modality are framed in terms of modal logic, which is a formal language that is an extension of propositional or first-order logic, generated by adding the modal operators ‘necessarily’ and ‘possibly’, abbreviated by ‘□’ (the box) and ‘◇’ (the diamond). One approach to the question of what the modal operators mean is simply not to answer it, and to take them as primitive, that is, to take their meanings to be unanalyzable. But, the reader might think, this is not all that satisfying an approach to take. And Lewis would agree. One of the first things he does in his seminal work on concrete modal realism, On the Plurality of Worlds (1986b)—hereafter ‘Plurality’—is to argue that the modal operators should not be taken to be primitive, but instead should be given some sort of analysis in non-modal terms. In the mid-20th century, logicians developed semantics for a variety of systems of modal logic. These semantics provide truth conditions for the box and diamond in terms of mathematical objects which came to be called ‘possible worlds’, since they were naturally interpretable as ways that the world could have been. Trump won the 2016 U.S. presidential election. But it could have been otherwise. He could have lost. Imagine that Trump lost the 2016 election, and that as few other facts as possible are different in order for that to have happened. What you are imagining is a possible world. The basic idea behind any possible-worlds-based analysis of the modal operators is rather simple. One can state the conditions in which sentences involving the modal operators are true in terms of possible worlds, by quantifying over them with quantifiers that behave exactly like the universal and existential quantifiers of standard first-order logic, as follows:
□p is true if and only if, for every possible world w, p is true at w.
◇p is true if and only if, for some possible world w, p is true at w.
So a statement is necessarily true if it is true at every possible world and false otherwise. And it is possibly true if it is true at at least one possible world and false otherwise. It is actually true if it is true at the actual world (that is, the possible world which we inhabit).
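As a toy illustration (the miniature model here is invented for exposition and is not drawn from Lewis’s text), suppose there are just three possible worlds and consider where two sentences hold:

\[ W = \{w_1, w_2, w_3\}; \quad p \text{ true at } w_1 \text{ and } w_2, \text{ false at } w_3; \quad q \text{ true at all three.} \]

On the clauses above, ◇p comes out true (p holds at w_1), □p comes out false (p fails at w_3), and □q comes out true (q holds at every world). If w_1 is the actual world, then p and q are both actually true.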
Lewis was not the first to interpret the objects quantified over in these analyses, and at which propositions are true (and false), as possible worlds. Thus he was not the first to admit possible worlds into his ontology. What sets him apart from many of those who came before was how he conceived of possible worlds. Typically, worlds were thought of as abstract objects, for example, as maximal consistent sets of sentences of some interpreted language (1986b: 142 ff.). A maximal set of sentences is one that contains, for every sentence p, either p or its negation. A consistent set of sentences is one which does not imply a contradiction. So, {grass is green, grass is not green} is not consistent. Nor is {grass is green, if grass is green then snow is white, snow is not white}. However, {grass is green, snow is white} is consistent, though not maximal. For Lewis, a possible world is not some abstract object like a set of sentences. Instead, it is something akin to our own world—a continuum of spacetime filled with objects of various sorts, like the ones we ourselves are surrounded by—galaxies, stars, mountains, people, chairs, atoms, and so forth. Possible worlds, for Lewis, are concrete, just like this world in which we find ourselves. Strictly speaking, modal realism is just the view that possible worlds exist (whether one thinks they are abstract or concrete). Concrete modal realism is the view that they exist and are concrete objects. It is this latter, more controversial thesis that Lewis is famous for defending.
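These two notions can be stated compactly in standard logical notation (a conventional gloss, not a quotation from Lewis): for a set \(\Gamma\) of sentences,

\[ \Gamma \text{ is maximal} \iff \text{for every sentence } p,\ p \in \Gamma \text{ or } \neg p \in \Gamma; \qquad \Gamma \text{ is consistent} \iff \Gamma \nvdash \bot. \]

A maximal consistent set thus settles every sentence of the language in exactly one way, which is why such sets were natural candidates to play the role of complete ways the world could have been.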
Lewis’s argument for concrete modal realism has two main parts. The first part consists in arguing for the ‘realist’ part of concrete modal realism, thereby providing reasons against the alternative of taking the modal operators as primitive. His argument for this consists in showing what possible worlds are good for. He highlights some things that can be done, or can more easily be done, if possible worlds are available. He highlights four such things. The first concerns certain modal locutions of natural language (English) that do not appear to be translatable into sentences with just the box and diamond. One sort of such locution involves modal comparisons. The example Lewis gives is: “a red thing could resemble an orange thing more closely than a red thing could resemble a blue thing” (1986b: 13). Lewis’s analysis involves quantification over possible individuals:
For some x and y (x is red and y is orange and for all u and v (if u is red and v is blue, then x resembles y more than u resembles v)). (1986b: 13)
But, he points out, one would not be able to translate the original sentence with just boxes and diamonds, since “formulas [of modal logic] get evaluated relative to a world, which leaves no room for cross-world comparisons” (1986b: 13). A realist about modality like Lewis, according to whom possible worlds, including the things in them, are as real as our own world and the things in it, is able to make these cross-world comparisons, and thus do justice to modal locutions of natural language that the modal primitivist cannot. He points out that this problem extends past natural language and into philosophical quasi-technical language. The basic idea behind supervenience, the philosophical workhorse of Lewis’s day, used to formulate various theses about dependence, is that the Fs supervene on the Gs if and only if there could be no difference in the Fs without a difference in the Gs. But, he notes (1986b: 14 ff.), attempts to capture this basic notion strictly in terms of the modal operators have failed, resulting in something either too weak or too strong.
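To see why worlds help here, one standard possible-worlds rendering of the idea (a conventional formulation of global supervenience, offered as a gloss rather than as Lewis’s own wording; Plurality distinguishes several variants of differing strength) is:

\[ \text{The } F\text{s supervene on the } G\text{s} \iff \forall w\, \forall w'\ (\text{if } w \text{ and } w' \text{ are exactly alike with respect to the } G\text{s, then they are exactly alike with respect to the } F\text{s}). \]

Because this condition quantifies over pairs of worlds at once, it resists translation into a formula built only from boxes and diamonds, which are evaluated at one world at a time.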
The other jobs that Lewis thinks possible worlds can do are briefly outlined as follows. The second job is that talk of possible worlds allows us to make sense of the idea that some possibilities are closer to actuality than others (for example, Hillary Clinton’s having won the 2016 election is a closer possibility to actuality than her being in command of a colonial expedition to the Andromeda galaxy). Such comparisons are useful in making sense of counterfactual claims, that is, claims of the form ‘if it were the case that p then it would be the case that q’. Discussion of Lewis’s account of counterfactuals, and the role possible worlds play in it, occurs in section 7. The third job Lewis thinks that possible worlds can do is that they provide us with the resources to formulate what he takes to be the best theory of mental content, that is, the best theory about what our thoughts are about. He thinks such a theory will construe such contents as sets of possibilities, that is, as sets of possible worlds or possible individuals. The fourth job is that Lewis thinks that sets of possible individuals can play the role of properties, a discussion of which occurs in detail in the next section (section 3). One who takes the modal operators as primitive will not be able to accomplish these things—at least not as easily. This is already clear in the case of jobs three and four; a primitivist about modality will simply not have the worlds and individuals hanging around which they can collect up into sets to act as properties or the contents of our thoughts. While some may balk at some of the consequences of modal realism (such as that there exist infinitely many talking donkeys in other possible worlds, in virtue of it being possible that infinitely many talking donkeys exist), Lewis thinks that these theoretical benefits nonetheless provide reason to prefer modal realism to the primitivist alternative.
The second part of Lewis’s argument for concrete modal realism consists in arguing for the ‘concrete’ component of the view, and comprises a number of arguments against various forms of modal realism which regard possible worlds as abstract entities of one sort or another—what he calls ‘ersatz realism’. Oftentimes, Lewis’s strategy is to argue that concrete modal realism does a better job solving certain problems as compared to these ersatzist alternatives. These arguments can be found in chapter three of Plurality. Just one example, conveniently connected to issues already discussed, is Lewis’s first argument against what he calls ‘linguistic ersatzism’, the view, already introduced, that possible worlds are maximal consistent sets of sentences. Lewis’s complaint is that linguistic ersatzism is committed to a primitive conception of modality—something which Lewis has already argued against, and something to which his own view is not similarly committed. Lewis provides two reasons to think linguistic ersatzism is committed to primitive modality, of which only the first is discussed here. The notion of consistency, partly in terms of which the linguistic ersatzist characterizes possible worlds, appears to be a modal notion: “a set of sentences is consistent iff those sentences, as interpreted, could all be true together” (1986b: 151 ital. orig.). Since Lewis’s own view is not committed to primitive modality, he is able to give a complete analysis of modality in terms of his particular brand of possible worlds, while the linguistic ersatzist is not.
Lewis’s view about modality is distinctive not only in that he takes possible worlds to be concrete. It is also distinctive in the way it analyzes possibility and necessity claims about individuals. Consider possibility claims. One might think that for something to possibly be some way is for there to be a possible world at which that very thing is that way. So, for example, one might think that, for it to be true that Hubert Humphrey could have won the 1968 United States presidential election, there must be a possible world at which Humphrey—the very same person who lost the 1968 election in the actual world—won the 1968 election. This is a very natural way to think about the analysis of possibility claims. The thesis that objects exist in more than one possible world is known as ‘transworld identity’. When worlds are taken to be concrete, transworld identity amounts to the claim that worlds share constituents, and, for this reason, Lewis calls it ‘(concrete) modal realism with overlap’. It is typically understood as the idea that a thing in this world which could have been qualitatively different than it actually is itself inhabits another possible world as well, in which it is qualitatively different. Instead of taking this approach, Lewis elects to reject any overlap among possible worlds, and to analyze possibility and necessity claims about individuals in terms of counterparts. In particular:
‘□φ(a)’ is true iff for every possible world w at which a counterpart a′ of a exists, ‘φ(a′)’ is true at w.
‘◇φ(a)’ is true iff for some possible world w at which a counterpart a′ of a exists, ‘φ(a′)’ is true at w.
Lewis’s analysis of modality in terms of counterparts is known as ‘counterpart theory’. His complete view about modality, then, is what could be called ‘concrete modal realism with counterpart theory’.
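To make the shape of these truth conditions vivid, here is a minimal sketch in Python. The worlds, individuals, predicates, and the counterpart relation itself are all invented for the illustration; in particular, the fixed dictionary of counterparts stands in for what Lewis takes to be a context-sensitive similarity relation.

# Toy model of counterpart-theoretic truth conditions; everything here
# (worlds, individuals, counterpart assignments) is stipulated for the example.
worlds = {
    'w1': {'humphrey_c1': {'won_1968'}},
    'w2': {'humphrey_c2': {'lost_1968'}},
}

# Which individual, at which world, counts as a counterpart of Humphrey.
counterparts = {'humphrey': {'w1': 'humphrey_c1', 'w2': 'humphrey_c2'}}

def possibly(a, predicate):
    # A possibility claim about a is true iff some counterpart of a
    # satisfies the predicate at its own world.
    return any(predicate in worlds[w][c] for w, c in counterparts[a].items())

def necessarily(a, predicate):
    # A necessity claim about a is true iff every counterpart of a
    # satisfies the predicate at its own world.
    return all(predicate in worlds[w][c] for w, c in counterparts[a].items())

print(possibly('humphrey', 'won_1968'))      # True: his counterpart at w1 won
print(necessarily('humphrey', 'lost_1968'))  # False: not every counterpart lost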
Lewis discusses counterpart theory in Plurality, Ch. 4, ‘Counterpart Theory and Quantified Modal Logic’ (1968), and ‘Counterparts of Persons and Their Bodies’ (1971). When do x and y stand in the counterpart relation? Lewis thinks an object’s counterparts will track intrinsic similarity to some extent. But the two notions come apart. This is mainly because the counterpart relation is context-sensitive. This is connected to a factor that Lewis thinks constitutes an advantage of counterpart theory over concrete modal realism with overlap, namely, it can help us make sense of variability in our judgments about what properties are essential to an object (1986b: 252–53). Consider a statue of a human being made of clay standing in a grotto. Many are inclined to say that it is essential to the statue that it has the shape it has. Were it another shape (for example, the shape of a horse), it would be a different statue. The lump of clay, however, would have been the same object even if it were shaped differently than it is. One solution to this problem is to say that there are actually two objects in the grotto: the statue, with a certain set of essential properties, and the lump of clay, with a different set. But Lewis took it to be a cost to be saddled with the possibility of multiple objects that occupy exactly the same spatial region. Lewis’s solution was to note that there can be a single object in the grotto but, when we are describing it as a statue (context 1), we are particularly interested in a certain set of the object’s properties, while, when we are thinking of it as a lump of clay (context 2), we are interested in a different set. In context 1, a lump of clay that was sourced from exactly the same place as the lump of clay in our world will not count as a counterpart of the object in the grotto if it has a different shape. But it will count as a counterpart of the object in the grotto in context 2. This allows Lewis to explain why, in context 1 but not context 2, we are inclined to say that the object has its shape essentially. In every possible world in which the object has a counterpart (described as a statue), that counterpart will have the same shape that it does.
Lewis’s key arguments against concrete modal realism with overlap appear in chapter four of Plurality. One important argument is based on what Lewis calls ‘the problem of accidental intrinsics’. If possible worlds share parts (like Humphrey), it is not clear, given modal realism with overlap, how Humphrey could have different intrinsic properties at each world. He presumably does so, since, for at least some of the intrinsic properties he actually has, he could have lacked them, and for at least some of those he actually lacks, he could have had them. Lewis’s example concerns Humphrey’s shape. He actually has five fingers on his left hand. But he could have had six. It will not do, Lewis thinks, for the proponent of overlap to relativize Humphrey’s property instantiation to worlds, saying, for example, that he has five fingers on his left hand relative to the actual world, but that the world relative to which he has six fingers on his left hand is a distinct world. This might work for a tower having different cross-sectional shapes on different levels, Lewis says, for example, being square on the third floor but circular on the fourth. But, he points out, it is only a part of the tower that has the shape at each level. According to modal realism with overlap, the whole of Humphrey exists at each world at which Humphrey exists. Similarly, the relativization strategy might work when Humphrey is honest according to one media source and dishonest according to another. The sources represent Humphrey in different ways. This might work for the ersatzist, whose ersatz individuals merely represent actual objects (as would, for example, a collection of predicates which are sufficient to represent Humphrey and no one else). According to the concrete modal realist, however, possible individuals are individuals, not representations of individuals. Finally, the relativization strategy might work with extrinsic relations like being a father of. A man might be father of Ed and son of Fred, that is, he might be a father relative to Ed but not relative to Fred. But Humphrey’s five-fingeredness concerns his shape, and, as Lewis points out, “If we know what shape is, we know that it is a property, not a relation” (1986b: 204).
Counterpart theory is not without its detractors. Saul Kripke (1980: 45, fn. 13), for example, complains that, on Lewis’s view, possibility claims about an individual are not actually about that individual him-, her-, or itself, but, rather, about one of his, her, or its counterparts. When one says, for example, ‘Humphrey could have won the 1968 election’, the complaint goes, one is not saying something about the Humphrey we are acquainted with—that is, one is not strictly saying something about that very individual who, in our actual world, lost the 1968 election. Instead, one is saying something about an individual that exists in some other possible world, who is similar to our actual Humphrey in certain relevant respects and to sufficient degrees, who won the 1968 election in that world. Lewis is unimpressed with this objection (see, for example, Plurality: 196). He thinks that ‘Humphrey could have won the 1968 election’ is about our Humphrey—the Humphrey in the actual world. Granted, the analysis of this claim involves invoking a distinct entity—one of Humphrey’s counterparts. But it is the actual Humphrey who has the modal property of possibly winning. His counterpart, in contrast, has the property of winning (simpliciter).
3. Properties
Lewis was a realist about properties. That is, he thought that properties exist. Properties can be intuitively understood as ways that things can be. Beyond that very general conception, disagreement arises. One major point of disagreement is over whether properties are repeatable—whether distinct things that can truly be described as similar in some respect literally share something in common. This sort of property is usually termed a ‘universal’. Those who endorse this view are realists about universals. According to realists, greenness, for example, is a sui generis entity, distinct from any particular green thing, that is had, or instantiated, by each green thing. Realists typically seek to explain the similarity among similar things (such as green things), by appealing to the fact that each instantiates the same universal (so each green thing instantiates greenness). Those who deny the claim that properties are repeatable are nominalists about universals. (This form of nominalism is stricter than that most commonly at issue in the philosophy of mathematics, which denies the existence of all abstract entities, including sets.) Nominalists about universals come in many flavors. David Armstrong (1978a) provides a relatively comprehensive taxonomy of them. Of particular relevance to Lewis’s views on the matter are class nominalists, who identify properties with the sets of the individuals that can be truly described as having them. On such a view, the property of greenness, for example, is identified with the set of green things SG, that is, as that set which contains frogs, grass, the Statue of Liberty, and so forth. To instantiate the property of greenness is just, according to the class nominalist, to belong to the set SG.
Lewis is officially a nominalist. He elected to identify properties with sets, and thus his view was a form of class nominalism. (Lewis had perfectly analogous views about relations.) As such, Lewis’s view faces challenges similar to those class nominalists face. Chief among them is the problem of coextensive properties, which is the concern that class nominalism must identify any properties which have the same extension (that is, apply to the same individuals), whether those properties are intuitively the same or not. The set of those organisms which have hearts, for example, is, as it happens, the same as the set of those which have kidneys. As such, the class nominalist is forced to identify the property of being a cordate with that of being a renate. This seems wrong, however. The former property seems to concern one sort of organ, the latter a completely different sort of organ. These properties seem to be distinct.
Lewis’s solution to this problem is made possible by his views on modality. Lewis identifies each property not with the set of individuals in the actual world to which it can be truly ascribed. Rather, he identifies it with the set of individuals in all possible worlds to which it can be truly ascribed. Due to his views about modality, such individuals exist, and are thus available to be members of sets. The result is a class nominalism that is immune to the aforementioned problem. While it is actually true that every cordate is a renate and vice versa, this is an accident—the result of a long and complex series of events in the evolutionary history of life on Earth. But this history could have unfolded differently. Thus there are possible worlds, according to Lewis, which contain organisms which have hearts but which filter toxins in a different way. And there are worlds which contain organisms which have kidneys but deliver oxygen to cells in a different way. The existence of organisms of either sort ensures that the set of cordates is distinct from the set of renates, and so ensures that these properties are distinct. Of course, one might raise the concern that Lewis’s view has a perfectly analogous problem with properties whose extensions are identical in every possible world, as that of being a triangular polygon and being a trilateral (three-sided) polygon presumably are. For more on this issue, see section 2 of Sophie Allen’s article ‘Properties.’
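The structure of this solution can be put in miniature. In the following sketch the organisms are invented, and sets of strings merely stand in for Lewis’s sets of actual and merely possible individuals.

# Class nominalism restricted to the actual world: coextensive properties collapse.
actual_cordates = {'dog', 'human', 'sparrow'}
actual_renates = {'dog', 'human', 'sparrow'}
print(actual_cordates == actual_renates)  # True: the two properties are identified

# Lewis's fix: draw instances from all possible worlds. Some merely possible
# organism has a heart but no kidneys, and some other has kidneys but no heart.
cordates_all_worlds = actual_cordates | {'possible_heart_only_beast'}
renates_all_worlds = actual_renates | {'possible_kidney_only_beast'}
print(cordates_all_worlds == renates_all_worlds)  # False: the properties come apart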
So far, Lewis looks to be nothing more than a class nominalist, if a relatively sophisticated one, owing to the tricks he can draw from his concrete modal realist bag. But he recognizes that universals do important philosophical work. He enumerates the jobs that universals can do in ‘New Work for a Theory of Universals’ (1983a). To take just one example, Lewis admits that universals can serve to distinguish laws of nature from mere accidental regularities. Armstrong (1978b and 1983) employs universals in this way in his theory of lawhood. According to Armstrong, what ensures, for example, that:
(G1) All uranium spheres are less than one mile in diameter
is a law of nature, while:
(G2) All gold spheres are less than one mile in diameter
is not, is that (G1) is made true not just by the contingent fact that there are no uranium spheres that are one mile in diameter or larger. It is made true by a certain fact about the universals being a uranium sphere and being less than one mile in diameter, a fact that holds at certain worlds. These universals jointly instantiate a second-order universal (second-order because it relates universals rather than particulars), which relates these two universals in such a way that it guarantees, at any world at which these universals stand in this relationship, that there will never be a uranium sphere with a diameter of one mile or more (since the relationship between the universals will ensure that any such sphere will explode). There is no such fact concerning the universals being a gold sphere and being less than one mile in diameter. What makes (G2) true is a fact that has nothing to do with these universals. Instead, it has to do only with certain historical contingencies about our world that suffice to explain why, in fact, no gold spheres one mile in diameter or larger ever naturally developed or were artificially constructed. With just his properties, Lewis does not have the resources to explain this difference. Lewis’s properties are abundant. Any old collection of things counts as a property. Thus Lewis would have no basis on which to say that the property of being a uranium sphere is related to the property of being less than one mile in diameter in any way that is more (or less) significant than the relation between being a gold sphere and being less than one mile in diameter. He can say that the set-theoretic intersection of each pair is empty, that is, the properties do not share any members (remember, for Lewis, properties are sets). But the similarity of being a uranium sphere and being a gold sphere in this respect would provide him with no basis on which to say that the first figures into a law of nature while the second does not.
Lewis rejects Armstrong’s approach to lawhood (along with his commitment to the existence of universals), and instead characterizes a law as a statement of a regularity that belongs to a suitable deductive system, which (i) is true, (ii) is closed under strict implication (that is, whatever is necessarily implied by any set of statements in the system is also in the system), and (iii) is balanced with respect to simplicity and empirical informativeness. In particular, the system must be as simple as it can be without being informationally too impoverished to do justice to the empirical facts about the world, but, to the extent that it does not sacrifice a sufficient degree of simplicity, it must be as informative as it can be. Nonetheless, Lewis recognizes a problem with his view, and, while he does not need to endorse universals to solve it, he requires something more than his ontology of properties. The problem is that there is a way for a deductive system to meet Lewis’s criteria (i)–(iii) that is clearly undesirable. Suppose we have discovered the best system S for describing the actual world. The way scientists have currently formulated it is rather complicated. But some wiseacre comes up with the idea to introduce a new predicate F into our language and stipulate that F is satisfied by all and only those things at the worlds at which S is true. But suppose further that this wiseacre refuses to provide an analysis of F. S can then be axiomatized with the single axiom ‘∀x Fx’. This theory is very simple, and it is, in a sense, as informationally enriched as it can be, since it perfectly selects the worlds at which S is true. Nonetheless, the theory is useless to the curious inhabitants of those worlds. It tells them nothing about what their world is like.
The first step of Lewis’s solution to this problem is to adopt some primitive distinctions among properties. There are those that are perfectly natural, those which are natural to some degree (though not perfectly natural), and those which are unnatural. Lewis (1983a: 346 ff.) imagines that the perfectly natural properties will be those that would correspond to universals in Armstrong’s sparse metaphysics (for example, being made of uranium). Less natural (but still comparatively natural) properties would correspond to families of suitably related universals (for example, being metallic). The spectrum would continue until wholly unnatural, gerrymandered properties are reached (for example, being either the Eiffel Tower or a part of the moon). Lewis notes that admitting universals into one’s ontology can provide the basis for a distinction between more and less natural properties, in the way just gestured at in the comparison with Armstrong’s metaphysics. But he notes that the distinction can be taken to be a primitive one between properties (classes) instead. This is Lewis’s preference; it allows him to avoid realism about universals and thus remain a nominalist. Lewis then solves the problem of the true but useless theory ‘∀x Fx’ by imposing a further criterion: the most suitable deductive system, the one which sets the laws apart from the non-laws, is one whose axioms are stated in a way that refers only to perfectly natural properties.
4. Time and Persistence
Lewis’s most well-known writings about time have to do with the persistence of objects. Lewis was a four-dimensionalist. That is, he believed that there exist four-dimensional objects, extended not just in space, but in time as well. Four-dimensionalism is to be contrasted with three-dimensionalism, according to which the only objects which exist are extended in space only (if they are extended at all, that is, so as not to rule out the existence of non-extended points of space). Lewis’s commitment to four-dimensionalism was a result of his endorsement of two theses: (1) unrestricted composition, and (2) eternalism. Unrestricted composition is the thesis that any objects compose some object. So not only do my head, torso, arms, and legs compose an object (me), my head and the near side of the moon compose an object as well. Eternalism is a view about the ontology of time, according to which past, present, and future times, objects, and events are equally real. Eternalism is to be contrasted with presentism, the view that only the present time and present objects and events are real, and with the growing block theory, the view that past and present times, objects, and events are real, but future ones are not. Committing oneself to unrestricted composition and eternalism requires one to countenance four-dimensional objects. Not only do any presently existing objects compose an object, past ones do too. And, crucially, objects which exist at different times compose objects as well, such as the object that is composed of George Washington’s first wig and the sandwich someone just made for lunch.
As strange a view as four-dimensionalism might seem, Lewis has good reasons for adopting it. These reasons concern issues connected to the persistence of objects through time. Lewis is a perdurantist, and as such believes that for an object to persist through an interval of time is for it to perdure, that is, to have proper parts, one of which is wholly present at each moment of that interval. Perdurantism is to be contrasted with endurantism, according to which an object’s persistence through an interval of time amounts to the whole object being wholly present at each moment of that interval. Perdurantism, obviously, requires the truth of four-dimensionalism, at least assuming that some objects do in fact persist through time. This is because any such object must have parts which exist at different times. According to perdurantism (at least Lewis’s version—Theodore Sider develops another version of it in 1996 and 2001), the objects that we refer to with our names and definite descriptions are actually four-dimensional worm-like objects. We are acquainted with them by being acquainted with some of their parts at various times. So, for example, the Taj Mahal is a spacetime worm that extends back to about 1653. I am acquainted with it only insofar as I am acquainted with one of its parts, a part extending through about two hours of time, which I toured on November 28, 2015. Even human beings, according to Lewis, are actually spacetime worms. They are not themselves shaped like those objects depicted in anatomy textbooks. Instead, those diagrams depict certain parts of human beings that exist at instants of time.
Lewis’s perdurantism might seem like an odd view, but, he thinks, it solves an important problem that faces its competitor, endurantism: what Lewis calls the ‘problem of temporary intrinsics’ (1986b: 202–04 and 2002), which is analogous to the problem of accidental intrinsics which faces concrete modal realism with overlap (see the discussion in section 2). Everyone agrees that objects change over time. A person may previously have been standing and currently be sitting. The endurantist must say that the very same object has both the property of standing and the property of sitting. This looks, at least at first glance, to be a contradiction. Endurantists typically say that the contradiction is only apparent, and they explain it away in various ways. But Lewis does not think any of those strategies succeed. One strategy endurantists use is to say that what we thought were properties, instantiated by a single object, are actually relations, instantiated by an object and a time. There is no contradiction involved in one’s both standing and sitting, since one is standing in relation to one (past) time and sitting in relation to another (the present time). But Lewis thinks that if an intrinsic property like shape (that is, a property having only to do with an object, and nothing to do with how it is related to other objects) is anything, it is not a relation (see the Lewis quotation at the end of section 2). Another strategy endurantists use to explain away the apparent contradiction resulting from temporary intrinsics is to adopt presentism. Since only the present is real, the person has the property of sitting. They do not have the property of standing. (They did have the property of standing when that moment was present. But it is present no longer, and thus is not real.) But, Lewis thinks, presentism comes at a high cost. The presentist must reject the idea that a person has a past and (typically) a future as well, since, according to presentism, neither the past nor future exists. Lewis points out that perdurantism solves the problem nicely. There is something that has the property of standing—a part of the person that is wholly present at a certain moment in the past. And there is something that has the property of sitting—a part of the person that is wholly present at the present moment. But there is no contradiction since these are distinct parts of this person. Lewis’s perdurantist solution appeals to the same consideration which allows us to say that there is no contradiction in my left hand currently being fist-shaped and my right hand currently being open-palmed. They are different parts of me, and so are distinct objects. There is no contradiction in distinct objects having incompatible properties.
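The logical shape of the perdurantist solution is simple enough to display in a few lines; the temporal parts and shape properties below are invented placeholders.

# A persisting person modeled as a worm of temporal parts. The incompatible
# properties attach to distinct parts, so no one object bears both.
person = {
    'part_at_yesterday': 'standing',
    'part_at_now': 'sitting',
}
# Distinct bearers, incompatible properties: no contradiction.
print(person['part_at_yesterday'], person['part_at_now'])  # standing sitting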
5. Humean Supervenience
Lewis believes that everything in the actual world is material. He also defends a thesis he calls ‘Humean supervenience’. Humean supervenience is the thesis that, in Lewis’s words, “all there is to the world is a vast mosaic of local matters of particular fact, just one little thing and another” (1986c: ix). Hume was known for rejecting the idea that there were hidden connections behind conjoined phenomena which necessitate their conjunction. He was not against there being regularities in the world. His objection was to these regularities being explained by necessary connections (such as Armstrong’s second-order states of affairs relating universals—see section 3). Lewis is sympathetic to this view, and also likes the idea that macroscopic phenomena are reducible to certain basic microscopic phenomena. These microscopic phenomena Lewis takes to be just the geometrical arrangement of the world’s spacetime points, and the instantiation of certain perfectly natural properties at each of those points. Lewis takes this to mean that fundamental entities are point-sized, or, perhaps, that the fundamental entities are the spacetime points themselves.
Lewis is willing to admit that other possible worlds sufficiently different from our own might be different in this last respect. In particular, he thinks that it might take more than just the point-wise distribution of instantiations of perfectly natural properties to determine all of the phenomena in the world. Now the scientifically informed reader might object that our current physical theories show that this is not true even at our world. Some of our most promising physical theories, for example, posit spatially extended fields as being among the fundamental constituents of reality, rather than point-like entities. As Daniel Nolan (2005: 29 ff.) and Brian Weatherson (2016: sec. 5) point out, Lewis is concerned more with illustrating the defensibility of this latter thesis than with its truth. It could be regarded as an idealization or simplification, suitable for philosophical purposes, in terms of which Lewis formulates his thesis of Humean supervenience. If it turns out that the fundamental furniture of the world actually consists of spatially extended entities, rather than point-like entities, Lewis will be content to backpedal a bit, and formulate Humean supervenience in a way that is consistent with that, such as, for example, claiming that what is true at a given world is determined by the geometrical arrangement of its spacetime points and where perfectly natural properties are instantiated at the spacetime regions occupied by the fundamental entities. But, as Lewis suggests in ‘Humean Supervenience Debugged’ (1994a: 474), he expects that, even once we have settled on the nature of the physical world, we will find that the profusion of phenomena at our world can be explained by a comparatively sparse base of simple entities instantiating comparatively basic properties and perhaps also standing in comparatively basic relations.
6. Causation
Lewis is known for his counterfactual analysis of causation. Lewis made significant contributions to the semantics of counterfactuals, which will be discussed in the next section. The following is perhaps the most straightforward way to provide an analysis of causation in terms of counterfactuals, though, as we will see, it is importantly different from Lewis’s account:
x causes y iff x and y occur, and if x had not occurred, then y would not have occurred.
Counterfactual analyses of causation are to be contrasted with productive accounts, according to which x causes y iff x produces some change in properties in y, where the notion of production is typically taken to be primitive. Both sorts of analysis face their own characteristic set of problems. This article discusses only the most well-known problem for the above counterfactual account, the problem of causal preemption (or causal redundancy), since it will help the reader understand why Lewis develops his own counterfactual analysis of causation in the way that he does. Suppose that Alice and Bob are throwing rocks at bottles and Alice throws her rock at one of the bottles and hits it, shattering it. Intuitively, Alice’s throw caused the bottle to shatter. But suppose also that Bob was ready to throw his rock at the same bottle just in case Alice did not throw, and, moreover, he has perfect aim. Thus Bob’s rock would have struck the bottle, causing it to shatter, had Alice not thrown. Due to this fact, the right side of the above counterfactual analysis of causation is not satisfied in this case. It is not the case that, had Alice not thrown, the bottle would not have shattered. This is because, given the way the case was set up, Bob’s throw would have ensured that the bottle would shatter. Yet, intuitively, Alice’s throw caused the bottle to shatter. Something seems to be wrong with the above counterfactual analysis of causation.
In order to avoid this problem, in ‘Causation’ (1973a), Lewis distinguishes between causation and causal dependence. The above analysis is actually the analysis Lewis provides of causal dependence. He defines causation in terms of chains of causal dependence (where a chain might, but typically will not, have only two nodes). So, for example, if y causally depends on x, and z causally depends on y, then x causes z, even if z might have occurred even if x had not. Lewis thinks there is independent motivation for this move, as he thinks there are often cases in which it is natural to say that x causes z even when z does not counterfactually depend on x. Lewis explains such cases by positing a chain of causal dependence. In general, counterfactual dependence is not transitive. The light would not have come on if I had not flicked the switch. I would not have flicked the switch if I had been out running errands. But the light may well have come on just then even if I had been out running errands. Another member of my family might have walked into the room and flicked the switch. Lewis deals with cases of causal preemption, like the one involving Alice and Bob, by pointing out that, in such cases, there will nonetheless be a chain of counterfactual (and thus causal) dependence which we can invoke to secure the truth of the causal claims we think are true. Lewis grants that it is not the case that, if Alice had not thrown her rock, then the bottle would not have shattered (since Bob would have thrown). But, he thinks this establishes only that the bottle’s shattering does not causally depend on Alice’s throw. Since causes need only be linked by chains of causal dependence to their effects, Lewis can still say that Alice’s throw caused the bottle to shatter. He would note first that:
(CF1) the bottle would not have shattered if Alice’s rock had not been speeding toward it.
This is true because, by the time the rock was speeding toward the bottle, Bob had seen that Alice had thrown her rock, and so had refrained from throwing his own rock. Lewis would note second that:
(CF2) Alice’s rock would not have been speeding toward the bottle if Alice had not thrown it.
This sets up a chain of causal dependence between Alice’s throw and the bottle’s shattering, which is enough, on Lewis’s account, to secure the desired conclusion that Alice’s throw caused the bottle to shatter.
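Since causation, on this account, is a matter of chains of causal dependence, checking a causal claim amounts to searching for such a chain. Here is a minimal sketch with the dependencies (CF1) and (CF2) encoded by hand; the event names are invented labels, not part of Lewis’s apparatus.

# Causation as the ancestral of causal dependence. Each event maps to the
# events on which it directly causally depends.
depends_on = {
    'bottle_shatters': {'rock_speeding_toward_bottle'},  # (CF1)
    'rock_speeding_toward_bottle': {'alice_throws'},     # (CF2)
}

def causes(x, y):
    # True iff a chain of causal dependence links y back to x.
    frontier = set(depends_on.get(y, set()))
    seen = set()
    while frontier:
        e = frontier.pop()
        if e == x:
            return True
        seen.add(e)
        frontier |= depends_on.get(e, set()) - seen
    return False

# The shattering does not directly depend on Alice's throw, but the
# two-link chain through the speeding rock suffices:
print(causes('alice_throws', 'bottle_shatters'))  # True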
Lewis’s counterfactual account of causation, as just explicated, still has a problem with preemption. This is the problem of late preemption, in which one causal process is preempted by the effect rather than by an event earlier in the process. So, for example, rather than Bob’s throw being preempted by Alice’s throwing her rock, suppose Bob threw his rock a split second after Alice threw hers, and that his rock did not hit the bottle only because the bottle had shattered a split second before Bob’s rock reached the bottle’s former position. In this case (adapted from Hall 2004), (CF1) would be false, and so Lewis would be unable to set up a chain of counterfactual dependence on which he could base a determination that Alice’s throw caused the bottle to shatter. This problem led Lewis to revise his view significantly in ‘Causation as Influence’ (2000a and 2004), wherein he analyzes causation in terms of the notion of influence. Lewis characterizes influence as follows:
C influences E iff there is a substantial range C1, C2,… of different not-too-distant alterations of C (including the actual alteration of C) and there is a range E1, E2,… of alterations of E, at least some of which differ, such that if C1 had occurred, E1 would have occurred, and if C2 had occurred, E2 would have occurred, and so on. Thus we have a pattern of counterfactual dependence of whether, when, and how on whether, when, and how. (2000a: 190 and 2004: 91)
The precise circumstances in which an event occurs, including the exact time at which it occurs, and the manner in which it occurs, are relevant to whether one event influences another. On this characterization, Alice’s throw influenced the bottle’s shattering, since it made a difference, for example, to the exact manner in which it occurred. Suppose, for example, that her rock hit the right side of the bottle, and that it shattered to the left. If she had thrown a bit to the left, the bottle would have shattered towards the right. The same is not true of Bob’s throw. If he had thrown a bit to the left, the bottle still would have shattered in the way that it did, since Alice’s rock would still have hit it in the way that it did. This allows Lewis to say that Alice’s throw caused the bottle to shatter, despite the fact that Bob’s rock was on its way to ensure that it would shatter in case Alice’s aim happened to be off.
Another sort of problem that gives Lewis trouble involves absences. It is not clear how Lewis’s view can deal with cases like when an absence of light causes a plant to die. There is no event in terms of which we can formulate any counterfactuals of the form ‘if x had not occurred, then y would not have occurred’ in such cases. Lewis (for example, 2000a, sec. X) deals with absences by admitting that there are some instances of causation in which no event serves as the cause. Instead, he thinks that it is true to say that the absence of light caused the plant to die as long as the right sorts of counterfactuals are true, for example, ‘if there had been more light over the past few weeks, the plant would have survived’.
7. Counterfactuals
Lewis makes use of some of the tools of his theory of modality in his contributions to the literature on the semantics of counterfactuals. A counterfactual is a certain type of conditional. A conditional is a sentence synonymous with one of the form ‘if…, then…’. An indicative conditional is a conditional whose verbs are in the indicative mood, for example:
(1) If Tom is skiing, then he is not in his office.
Other conditionals are in the subjunctive mood, for example:
(2) If Tom were a skiing instructor, then he would be in great shape.
Many of the subjunctive conditionals that we use on a day-to-day basis, such as (2), are counterfactual conditionals, that is, conditionals whose antecedents express statements that are contrary to what is actually the case. (Suppose Tom is in fact an accountant.) The material conditional ‘→’ from propositional logic can be used to adequately translate many natural language conditionals. Recall that ‘→’ is truth-functional: all there is to the meaning of ‘p → q’ is its truth conditions as given by its truth table, according to which it is true if either p is false or q is true, and it is false otherwise (that is, when p is true and q is false).
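The truth table just described can be generated mechanically; the following few lines simply enumerate the four assignments.

from itertools import product

# The material conditional is false only when the antecedent is true
# and the consequent is false.
for p, q in product([True, False], repeat=2):
    print(p, q, (not p) or q)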
But there are many other natural language conditionals which cannot be adequately translated with the material conditional. Counterfactuals form an important class of such conditionals.
Before Lewis, the most well-worked-out accounts of counterfactuals construed them as strict conditionals meeting certain conditions (in particular, see Goodman 1947 and 1955). A strict conditional is just a material conditional that holds of necessity, that is, a statement of the form ‘□(p → q)’. The simplest strict-conditional account of counterfactuals (which is admittedly simpler than Goodman’s, but will be sufficient to motivate Lewis’s account) analyzes each counterfactual in terms of the corresponding strict conditional, that is,
‘p □→ q’ is true iff □(p → q).
(Following Lewis in Counterfactuals (1973b: 1–2), ‘if it had been the case that p then it would have been the case that q’ is abbreviated with ‘p □→ q’.) This account is inadequate because a strict conditional is like a material conditional insofar as strengthening its antecedent cannot take the entire conditional from being true to being false, whereas this is not so for counterfactuals (see Lewis 1973b: ch. 1, Nolan 2005: 74 ff., and Weatherson 2016: sec. 3.1). Recall from propositional logic that the following inference pattern is valid:
p → q. Therefore, (p ∧ r) → q.
The analogous inference pattern involving the strict conditional is also valid:
□(p → q). Therefore, □((p ∧ r) → q).
But the analogous inference for the counterfactual conditional is not valid:
p □→ q. Therefore, (p ∧ r) □→ q.
Suppose that the counterfactual (2) above is true, and consider the following strengthening of it:
(3) If Tom were a skiing instructor and he always wore a robotic exoskeleton so that he did not ever expend any energy, then he would be in great shape.
(3) appears to be false. If he never expended any energy, he would not be in great shape. But (3) follows from (2) on the strict conditional account because of the validity of the above inference pattern involving the strict conditional. It does not, however, follow on Lewis’s account.
Lewis analyzes counterfactuals in terms of possible worlds, and the basic idea behind his analysis is similar to that of Robert Stalnaker (1968). Stalnaker proposed the following analysis of counterfactuals in terms of the similarity of worlds:
‘p □→ q’ is true iff the most similar p-world to the actual world is also a q-world, where a p-world is just a world at which p is true.
(Technically this only specifies the truth conditions for counterfactuals that are non-vacuously true, that is, when there is at least one p-world most similar to the actual world. But we can ignore vacuously true counterfactuals.) Lewis has a helpful metaphor which he employs when thinking about the similarity between worlds. He thinks about possible worlds as if they were arranged in a space, with the actual world at the center, and with greater and lesser degrees of similarity to the actual world represented by smaller and larger distances from (greater and lesser closeness to) the actual world. Counterfactual (2) above, for example, is true, on Stalnaker’s account, because the most similar (closest) world to the actual world at which Tom is a skiing instructor is one at which he is in great shape. A world in which Tom wears a robotic exoskeleton while teaching people to ski (thus keeping him in poor shape) is plausibly less similar to (farther away from) the actual world than one in which he teaches people to ski using his own muscles. (3), however, requires one to look at the closest world at which both Tom is a skiing instructor and Tom wears a robotic exoskeleton. And in that world, plausibly, Tom is not in great shape. It would require even more changes in the actual facts to ensure that Tom would be in great shape in such a world (for example, Tom has taken a pill—the result of a medical breakthrough that has not occurred at the actual world—that keeps his body in great shape even if he does not exercise).
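The closest-worlds idea, and the failure of antecedent strengthening it delivers, can be modeled with a toy similarity ordering. Everything below (the worlds, the facts true at them, and their similarity ranks) is invented for illustration, and, like the text above, the sketch ignores vacuously true counterfactuals.

# Each world is a pair: a similarity rank (lower = closer to actuality)
# and the set of facts true there.
worlds = [
    (0, {'accountant'}),                     # the actual world
    (1, {'ski_instructor', 'great_shape'}),  # the closest ski-instructor world
    (2, {'ski_instructor', 'exoskeleton'}),  # a more distant world
]

def would(antecedent, consequent):
    # 'p would-> q' is true iff the closest antecedent-worlds are all
    # consequent-worlds (assuming at least one antecedent-world exists).
    candidates = [(rank, facts) for rank, facts in worlds if antecedent <= facts]
    closest = min(rank for rank, _ in candidates)
    return all(consequent <= facts for rank, facts in candidates if rank == closest)

print(would({'ski_instructor'}, {'great_shape'}))                 # True, like (2)
print(would({'ski_instructor', 'exoskeleton'}, {'great_shape'}))  # False, like (3)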
There are important differences between the analysis Lewis ultimately settles on and Stalnaker’s. For one, Lewis rejects Stalnaker’s assumption that there will always be a unique p-world that is most similar to the actual world. As a result, the analysis that Lewis adopts is closer to the following:
‘p □→ q’ is true iff all p-worlds that are most similar to the actual world are also q-worlds.
Lewis also challenges the tempting assumption that there is a closest “sphere” of p-worlds to the actual world (this is the Limit Assumption—see 1973b: 19 ff.). Without it, counterfactuals are best analyzed as follows:
‘p □→ q’ is true iff there is a (p ∧ q)-world that is more similar to the actual world than any (p ∧ ¬q)-world.
Finally, Lewis questions the tempting assumption that each world is more similar to itself than any other world (1973b: 28 ff.). Making this assumption results in ‘p ∧ q’ entailing ‘p □→ q’. So, for instance, ‘Tom is a skiing instructor and Tom is in great shape’ would entail (2). But it would seem odd for this counterfactual to be true if its antecedent were not in fact false. In the end, Lewis sticks with this assumption for technical reasons (cf. Weatherson 2016: sec. 3.2).
Lewis’s analysis of counterfactuals is not without problems. Kit Fine (1975), for instance, argues that Lewis’s account, as it stands, makes the following counterfactual false, though it is presumably true:
(4) If Nixon had pressed the button, there would have been nuclear war.
It seems that the worlds in which Nixon pressed the button that are most similar to the actual world are ones in which there was no nuclear war, but in which instead some relatively minor miracle occurred—some violation of the natural laws of our world, perhaps specific to the exact location of the button and the specific time at which Nixon pressed it—which renders the button momentarily useless. To surmount this problem, Lewis says more about similarity in ‘Counterfactual Dependence and Time’s Arrow’ (1979b). He had already noted, in his book Counterfactuals, that similarity would be context-sensitive. That is, he had already noted that the “distance” that possible worlds are from the actual world might be different for the same counterfactual when it is uttered in different contexts. If, for example, (2) were uttered in a context in which it had already been established that Tom owned a robotic exoskeleton and was considering using it, the closest worlds to the actual world would include those in which he wore it and thus maintained a poor physique, thus rendering the counterfactual false instead of true. But Lewis says little else about similarity there.
To deal with Fine’s challenge, Lewis outlines a number of rules which one should abide by while measuring similarity given a context:
(1) It is of the first importance to avoid big, widespread diverse violations of law.
(2) It is of the second importance to maximize the spatiotemporal region throughout which a perfect match of particular fact prevails.
(3) It is of the third importance to avoid even small, localized, simple violations of law.
(4) It is of little or no importance to secure approximate similarity of particular fact, even in matters that concern us greatly. (1979b: 472)
Lewis assumes determinism throughout his discussion. That is, he assumes that everything that occurs is necessitated by the events which occurred earlier together with the laws of nature. Lewis thinks that determinism better explains, in comparison to indeterminism, the fact that counterfactuals which concern events which occur at different times exhibit an asymmetry which encodes the fixedness of the past and the openness of the future (1979b: 460). Given the assumption of determinism, and the assumption that Nixon did not press the button in the actual world, any world in which Nixon did press the button must either (i) be a world in which a small miracle occurred to enable Nixon to press the button despite having the same history as the actual world or (ii) be a world that has a completely different history than our own world, to enable Nixon’s pressing of the button to be necessitated by that history. By Lewis’s rules above, type (i) worlds are more similar to the actual world than type (ii) worlds, since the latter violate the more important rule (2). Type (i) worlds are identical to the actual world up to the point at which Nixon is considering pressing the button. Type (ii) worlds have completely different histories. Type (i) worlds violate only the less important rule (3), since they feature a small miracle. Lewis grants that there will be worlds with the same history as the actual world in which Nixon presses the button but no nuclear war ensues because another miracle causes a malfunction in the button, preventing the warheads from launching. But these worlds will have to involve miracles in addition to the one which enables Nixon to press the button. This is a further violation of rule (3). In contrast, a world in which Nixon presses the button and nuclear war ensues will violate the less important rule (4). As a result, Lewis concludes, the most similar worlds to the actual world are worlds in which Nixon presses the button and nuclear war ensues. Lewis’s account, therefore, makes the above counterfactual (4) true, as it should be.
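One way to picture how the four rules interact is as a lexicographic comparison: candidate worlds are compared first by their big miracles, then by the extent of their mismatch in particular fact, then by their small miracles, with approximate similarity dropping out, per rule (4). The profiles below are invented stand-ins for the three kinds of world just discussed, not anything Lewis himself computes.

# Profile = (big miracles, extent of mismatch of particular fact, small miracles).
# Python compares tuples lexicographically, mirroring the ranking of the rules.
profiles = {
    'button_pressed_then_war': (0, 1, 1),     # one small miracle, then divergence
    'button_pressed_war_averted': (0, 1, 2),  # a second, covering-up miracle
    'wholly_different_history': (0, 9, 0),    # massive mismatch of particular fact
}
print(min(profiles, key=profiles.get))  # button_pressed_then_war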
8. Convention
Lewis’s earliest work is devoted to developing an account of what it is for a group of individuals to use a language. The lion’s share of his work on this issue can be found in his first book, Convention (1969) (see also ‘Languages and Language’ (1975)). Lewis makes use of the notion of a convention in his analysis of language use, and a significant part of the importance of this book is due to the account of conventions that he offers. Conventions about language use are by no means the only ones around. It is, for example, a convention in the United States to drive on the right-hand side of the road. An initial picture of convention that one might have is one of convention as the result of agreement. That is, one might think that a convention among some individuals is the result of an agreement they make with one another. However, individuals appear able to make an agreement only in a language. Thus one cannot give an analysis of what it is for a group of individuals to speak a language in terms of convention, understood in terms of agreement, since it would be circular; it would presuppose that these individuals speak a language (cf. Weatherson 2016: sec. 2). Lewis’s analysis of conventions avoids this problem.
What motivates the implementation of conventions are coordination problems. Roughly, a coordination problem is a problem facing two or more people where the best outcome for each person can result only from the coordination of their actions. Suppose, for example, that each member of a group of people is trying to decide which side of the road to drive on. Consider one such individual, Carol. Carol might have her own basic unconditioned preference about which side to drive on. She might, for instance, prefer to drive on the right-hand side of the road because the steering wheel of her car is situated on the right-hand side, and she would like to place herself as far from oncoming traffic as possible. Still, she has a conditional preference concerning driving on the left-hand side of the road. She would prefer to drive on the left-hand side of the road on the condition that everyone else drives on the left-hand side of the road. This is rooted in Carol’s desire to minimize the chances she is hit by oncoming traffic. We can suppose that everyone (or at least almost everyone) in the group has the conditional preference to drive on the left (right) side of the road on the condition that everyone else drives on the left (right) side of the road. Notice that there are two ways to solve these individuals’ coordination problem: (1) they might adopt the convention that everyone drive on the left side of the road, and (2) they might adopt the convention that everyone drive on the right side of the road. When everyone in the group settles on one of these options, what results is a coordination equilibrium.
It is important to note that there is more than one equilibrium which the members of the group can adopt to create the best outcome for all of them. It is in such circumstances that a convention must be adopted. By contrast, some coordination problems will have only a single solution, in which case there is no need for a convention. People will act in such a way just because it creates the best outcome for them (and for everyone else). Suppose, for example, that there is a group of farmers that sell a certain product, say, coffee, to a population. We can suppose that there is a certain price p below which each farmer will fail to make an adequate profit on each item, which would ultimately drive them out of business. And we can suppose that there is a certain price p′ above which consumers will forgo the product, substituting it with another less expensive product, like chicory or tea, available from others, or changing their habits altogether to eliminate a bitter morning drink from their diet. Assuming that p′ > p, we can expect these farmers (each of whom, we are supposing, is acting in her own self-interest) to offer their product somewhere within the price range bounded by p and p′. This outcome is not the result of the adoption of a convention among these farmers. It is instead a result of each farmer acting in her own self-interest, of there being only one way for each farmer to achieve the best outcome for herself, and of her accurately observing the character of her market. Solving other coordination problems, however, such as the question of which side of the road everyone should drive on, requires a convention, since there are two possible ways to achieve the best outcome for everyone involved.
Of course, everyone in Carol’s group could get together and have a vote to decide which side of the road everyone in their group should drive on, in effect making an explicit agreement with one another. Perhaps the majority of car owners have an unconditioned preference like Carol’s, and prefer, for whatever reason, to drive on the right-hand side of the road. In this case, the result will be that everyone agrees to drive on the right-hand side of the road. But, importantly, agreement is not the only way to establish a convention (1969: 33–34). It might be that, as a matter of pure chance, the first handful of people on the road with their cars happened to share Carol’s unconditioned preference to drive on the right, and this effectively forced the latecomers to drive on the right in order to avoid the preexisting oncoming traffic.
In the spirit of the above considerations, Lewis ultimately settles on the following analysis of a convention:
A regularity R in the behavior of members of a population P when they are agents in a recurrent situation S is a convention if and only if it is true that, and it is common knowledge in P that, in almost any instance of S among members of P,
(1) almost everyone conforms to R;
(2) almost everyone expects almost everyone else to conform to R;
(3) almost everyone has approximately the same preferences regarding all possible combinations of actions;
(4) almost everyone prefers that any one conform to R, on condition that almost everyone conform to R;
(5) almost everyone would prefer that any one conform to R′, on condition that almost everyone conform to R′,
where R′ is some possible regularity in the behavior of members of P in S, such that no one in almost any instance of S among members of P could conform both to R and to R′. (1969: 78)
One aspect of this analysis worth noting immediately is its tolerance for a certain number of exceptions (embodied by the consistent appearance of occurrences of ‘almost’). This is to prevent the analysis from failing to count as a convention what we would think should be counted as one. Of course, from time to time, there are, unfortunately, those who drive on the wrong side of the road. But these isolated incidents should not preclude the existence of a convention in the population to which these individuals belong, even if it did not come about as a result of an agreement. Suppose that the convention to drive on the right side of the road in Carol’s group arose by chance as described above, with all later drivers conforming to the preference of the first few drivers to drive on the right-hand side of the road. After weeks of this, we would not expect a single individual driving a single time on the left side of the road, for whatever reason (whether the result of negligence or an intentional act of rebellion), to prevent the regularity that had emerged in the behavior of drivers in the group from being a convention. The convention is still there. It is just that this individual has failed, on this occasion, to act in accordance with it.
Another thing worth noting about Lewis’s analysis of convention is that, by ‘common knowledge that p’, Lewis does not require that p be true (1969: 52 ff.). Instead, it is enough that everyone has reason to believe that p, everyone has reason to believe that everyone has reason to believe that p, and so on. Whether or not anyone in fact believes that p, or in fact believes that everyone has reason to believe that p, and so on, is inconsequential to the analysis. This is why Lewis must specify separately that it is true that conditions (1)–(5) hold. Lewis adopts this characterization of common knowledge because he does not want to require, effectively, that, for a convention to hold, everyone believes that it holds. While he expects many people to be adept enough reasoners that they will come to believe the things they have reason to believe, he wants to allow for exceptions—individuals who never explicitly represent to themselves all of the various conditions which must hold for a convention to be present. But the presence of such individuals, of course, should not prevent a convention from being present (1969: 60 ff.).
Conditions (1) and (2) of Lewis’s analysis of convention are relatively straightforward, and they have been discussed above. Condition (4) is relatively straightforward as well. It requires, for example, that the vast majority of Carol’s group prefers that everyone in the group drive on the right-hand side of the road on the condition that almost everyone drives on the right-hand side of the road. If a substantial portion of the population did not desire that a convention be observed, the convention could easily collapse at any time, even if almost everyone had been observing it up to that time. This sort of situation is often exactly what is present just before a convention is abandoned. Consider public order—the tendency for people in many societies to act in an orderly and organized way while out in public. It is not implausible to say that public order is a convention which exists in these societies. And when it does, it is often, at least in part, the result of people wanting to live in a peaceful and orderly environment. But grievances can accumulate within a population to the point where people’s preference for those grievances to be addressed trumps their preference for a peaceful and orderly environment. In such circumstances, the convention of public order can disappear. Condition (5) is what distinguishes conventions from cases where only one coordination equilibrium is possible, as in the example with the farmers selling their coffee. In that case, there existed no other regularity in the behavior of the farmers other than selling their coffee in the price range between p and p′ that would have resulted in the best outcome for each of them.
Condition (3) is a bit trickier to understand. It is connected to formal issues in game theory, particularly the question of whether a coordination equilibrium is possible. The basic idea behind it can be illustrated with an example. For simplicity, suppose that Carol and Diane are the only people in the group. There are four possible combinations of actions in the coordination problem of which side of the road to drive on:
(a) Carol drives on the left and Diane drives on the left.
(b) Carol drives on the left and Diane drives on the right.
(c) Carol drives on the right and Diane drives on the left.
(d) Carol drives on the right and Diane drives on the right.
And there are, in principle, twenty-four (that is, 4!) possible ways for each of Carol and Diane to order these four combinations according to her preferences. By adopting condition (3), Lewis aims to ensure that there is enough agreement between the preferences of Carol and Diane to make a coordination equilibrium possible. If, for example, Carol prefers (d) to (a), and (a) to either (b) or (c), then an equilibrium will be unreachable if Diane prefers either of (b) and (c) to either of (a) and (d). (This is in part because Diane represents a significant portion of the group.)
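To see condition (3) at work, here is a minimal sketch in Python (a toy model, not anything in Lewis’s text) that enumerates the four combinations and tests each for being a coordination equilibrium: a combination that no one disprefers to any alternative reachable by the deviation of a single agent. The preference rankings are the hypothetical ones just described.

```python
from itertools import permutations

# The four combinations from the text, as (Carol's side, Diane's side).
outcomes = {
    'a': ('left', 'left'),
    'b': ('left', 'right'),
    'c': ('right', 'left'),
    'd': ('right', 'right'),
}

# Sanity check: there are indeed twenty-four strict orderings of four outcomes.
assert len(list(permutations('abcd'))) == 24

def rank(pref, o):
    """Position of outcome o in a preference ordering (0 = most preferred)."""
    return pref.index(o)

def unilateral_deviations(name):
    """Outcomes reachable from `name` by exactly one agent switching sides."""
    carol, diane = outcomes[name]
    flip = {'left': 'right', 'right': 'left'}
    for moved in ((flip[carol], diane), (carol, flip[diane])):
        yield next(k for k, v in outcomes.items() if v == moved)

def is_coordination_equilibrium(name, prefs):
    """Lewis-style: no one would have been better off had any one agent,
    herself or the other, acted otherwise."""
    return all(rank(p, name) <= rank(p, dev)
               for dev in unilateral_deviations(name)
               for p in prefs)

# The hypothetical rankings: Carol has d > a > b > c; Diane puts (b) and (c)
# above (a) and (d), say b > c > a > d.
carol = ['d', 'a', 'b', 'c']
diane = ['b', 'c', 'a', 'd']

print([o for o in outcomes if is_coordination_equilibrium(o, [carol, diane])])
# -> []: with preferences this much at odds, no coordination equilibrium exists.
```

If Diane instead shared Carol’s ranking, both (a) and (d) would pass the test. That is the situation condition (3) is meant to secure: enough agreement in preferences that coordination equilibria (and, given condition (5), more than one) are available.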
Now that Lewis’s analysis of convention has been introduced, one can appreciate how he employs it in his account of what it is for a group of individuals to speak a language. Lewis provides an in-depth discussion of what he takes a language to be (1969: 160 ff.). It should be noted that, for Lewis, a language is not just a collection of basic vocabulary items (a lexicon) and a set of rules for arranging them into more complex elements of the language, including sentences of arbitrary complexity (a grammar). It also includes an interpretation, that is, a function which assigns to each sentence of the language the conditions under which that sentence is true. (Technically, the function assigns truth conditions to each possible utterance of each sentence, since Lewis wants to accommodate the possibility of ambiguous sentences, which are standard features of natural languages. Lewis also makes allowance for imperative sentences, which are “true” just in case they are obeyed.) So, a language that is just like English except that ‘p or q’ is true iff p is true and q is true, and ‘p and q’ is true iff p is true or q is true, would not be English, but some other language. Though it consists of the same basic vocabulary items and grammar as English, and thus the same sentences, it supplies interpretations of some of those sentences that are different from those that English supplies. In particular, it switches the truth conditions of ‘and’ and ‘or’ in English. As a result of this conception of languages, a sentence can only be true or false in a language. Another language could also have that same sentence as one of its elements, but it could supply different truth conditions for it.
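Lewis’s point that truth is language-relative can be illustrated with a toy sketch (an illustrative propositional fragment, not Lewis’s own formalism; all names are made up for the example). The two ‘languages’ below share a lexicon and a grammar but differ only in their interpretation, one of them swapping the truth conditions English assigns to ‘and’ and ‘or’.

```python
# Sentences are atomic strings or nested tuples like ('and', p, q) / ('or', p, q).
# An "interpretation" here is fixed by the truth functions given to the connectives;
# a valuation supplies the truth-values of the atomic sentences.

def make_language(and_fn, or_fn):
    def truth_value(sentence, valuation):
        if isinstance(sentence, str):  # atomic sentence
            return valuation[sentence]
        connective, left, right = sentence
        fn = and_fn if connective == 'and' else or_fn
        return fn(truth_value(left, valuation), truth_value(right, valuation))
    return truth_value

english_like = make_language(lambda p, q: p and q, lambda p, q: p or q)
swapped      = make_language(lambda p, q: p or q, lambda p, q: p and q)

s = ('or', 'it-is-raining', 'it-is-snowing')
facts = {'it-is-raining': True, 'it-is-snowing': False}

print(english_like(s, facts))  # True: 'or' behaves like English disjunction
print(swapped(s, facts))       # False: the very same sentence, evaluated in the
                               # other language, gets a different truth-value
```

The same string of words comes out true in one language and false in the other, which is just the point that a sentence is only true or false in a language.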
For Lewis, what it is for a population P to use a language L is for there to be a convention in P of truthfulness in L, that is, a convention of uttering sentences of L only if one believes them to be true (1969: 177, cf. 1975: 7). That is, it is true that, and common knowledge in P that, in almost any instance of verbal communication among members of P:
(1) almost everyone is truthful in L;
(2) almost everyone expects almost everyone else to be truthful in L;
(3) almost everyone has approximately the same preferences regarding all possible combinations of utterances of L;
(4) almost everyone prefers that any one person be truthful in L, given that almost everyone else is truthful in L; and
(5) there is some other possible language L′ which almost everyone would prefer that any one be truthful in, on condition that almost everyone is truthful in L′.
But Lewis is careful to note that a person must occasionally use or respond appropriately to utterances of sentences of L in order to be a member of a population that uses L. If, at some point, she stops using and responding appropriately to such utterances, she will eventually not belong to any population that uses L (1969: 178).
9. Mind
There are two major respects in which Lewis contributes to the philosophy of mind. The first concerns his theory of mind, which is a version of the identity theory. The second is his theory of mental content, that is, an account of the contents of certain mental states like what is believed when one has a belief, and what is desired when one has a desire. This article discusses only the former (aside from the brief discussion of the latter included in section 2). As indicated in section 4, Lewis is a materialist insofar as he believes that everything in the actual world is material. As a result, he rejects idealism, that is, the view that everything is mental, and dualism, the view that there are fundamentally two different types of entity, mental and physical. Thus, he is a physicalist, and, as mentioned above, an identity theorist. He is a type-type identity theorist, and as such, identifies each type of mental state (each type of experience we can have) with a type of neurophysiological state. So, for example, for Lewis, pain is identical to, say, c-fiber firing. (C-fibers are nerve fibers in the human central nervous system, activation of which is responsible for certain types of pain.) Such views are typically contrasted with token-token identity theories, which say only that each token mental state is identical to some token physical state. A token-token identity theorist will reject the rather general identity between pain and c-fiber firing, though they will recognize an identity between, say, the specific token of pain that Ronald Reagan felt when he was struck by John Hinckley Jr.’s bullet on March 30, 1981 and the appropriate token neurophysiological event which occurred in Reagan’s brain and which was caused by his nerves firing as a result of the bullet strike.
Lewis’s commitment to his theory of mind can be found in his earliest published work, in ‘An Argument for the Identity Theory’ (1966). Given the title, the reader will not be surprised that his main argument for it can be found there too. He argues that because mental states are defined in terms of their causal roles, being caused by certain stimuli and causing certain behaviors, and because every physical phenomenon’s occurrence can be explained by appeal only to physical phenomena, the phenomena to which we appeal to explain our behaviors, which are usually rendered in the vocabulary of folk psychology (for example, Alice felt/believed x, so she did y), must themselves be physical phenomena. Folk psychology is the largely unscientific theory that each of us uses in order to explain and predict the behavior of others, by appealing to such things as pleasure, pain, beliefs, and desires. We are using folk psychology, for example, when we say that Alice screamed because she was in pain.
Concerning his first premise, Lewis thinks that, for instance, pain is defined by a set of pairs of causal inputs and behavioral outputs that is characteristic of it alone. That set might include, for example, the causal input of a live electrode being put into contact with a human being, and the behavioral output of that human being vocalizing loudly. If this sounds behaviorist, that is because the view has its roots in behaviorism. But, unlike the behaviorist, Lewis does not think that that is all there is to say about mentality. He thinks that each mental state must still be a physical entity. While each is definable in terms of causal roles, each is a neurophysiological state. Furthermore, Lewis thinks that the mental concepts afforded to us by folk psychology pick out real mental states—at least for the most part. Thus Lewis expects that, by and large at least, each mental state that is part of our folk psychological theory will be definable in terms of a unique set of causal inputs and outputs. This sets Lewis (and other reductionists about the mind) apart from eliminativists, who expect no such accuracy in our folk psychological theory, and, indeed, often argue against its adequacy (as in, for example, Churchland 1981).
Lewis’s second premise is that the physical world is explanatorily closed. For any (explicable) physical phenomenon, there are some phenomena in terms of which it can be explained that are themselves physical. (Lewis leaves room for physical phenomena that have no explanations because they depend on chance, such as why a particular atom of uranium-235 decayed at a particular time t.) What is important for Lewis’s project is that this means we will never have to appeal to any non-physical (read: mental) entity in order to explain any physical phenomenon. And, because the causes and effects in the characteristic set that defines any given mental state are always physical (things like the placement of live electrodes and vocalizations), we will never need to invoke mental phenomena in order to explain any of these phenomena. We will be able to find some physical phenomena in terms of which to do so.
Very often, token-token identity theorists are role functionalists, who identify each type of mental state with a type of functional role. This role can, in principle, be realized by more than one type of physical state. And hence each type of mental state can, in principle, be realized by more than one type of physical state. But, according to role functionalists, a mental state itself is not identical to any physical state. So, for example, a role functionalist might identify pain with the functional state of bodily damage detection. That functional state is (we are supposing) realized in humans by c-fiber firings. As a result, pain is realized in humans by c-fiber firings. But it is something more abstract than just c-fiber firings; it is just whatever plays the role of bodily damage detection. It just so happens that what plays that role in humans is (we are supposing) c-fiber firings. Lewis is not a role functionalist. As stated, he identifies each type of mental state with some type of physical state. So he identifies pain with c-fiber firings, rather than saying that the former is realized by the latter.
This opens Lewis’s view up to the problem of the multiple realizability of the mental. This is the idea that human beings (or, more generally, organisms in which the role of bodily damage detection is played by c-fibers) are presumably not the only sorts of creatures that can be in pain. There may be animals on earth which lack c-fibers but which, when subjected to an electric shock, behave in the sort of way human beings behave, vocalizing loudly, moving away from the source of the shock, and so on. And even if there are not, we can imagine beings, perhaps Martians, that meet these conditions. What of them? Presumably, they can be in pain. But if they do not have c-fibers, then Lewis is forced to say that they, in fact, cannot be in pain.
In ‘Mad Pain and Martian Pain’ (1980a), Lewis deals with this problem by essentially biting the bullet. He recognizes that there will be distinct mental states associated with similar causal roles like human pain, jellyfish pain, Martian pain, and so forth. But he does not think this is too big a bullet to bite. The debate is, ultimately, just one about which state—realizer or role—we refer to when we use our folk psychological terminology for mental states (such as ‘pleasure’, ‘pain’, ‘belief’, ‘desire’, and so on). But Lewis also thinks there is good reason to prefer his view. Remember that he identifies mental states by their causal roles. Pain is whatever both is caused by certain sorts of stimuli (electric shocks, pricks with a needle, and so forth) and causes certain sorts of behavior (vocalizing loudly, moving away from the stimulus, and so forth). But an abstract functional role is not apt to play this causal role. There must be something physical that does so—that is actually involved in the push-and-pull of each causal chain of physical events. On Lewis’s account, according to which each type of mental state is a type of physical state, and each token mental state is a token physical state, there is always a physical state to play the needed causal role, and, moreover, to play that role while keeping the world at large completely material. One cannot help but appreciate how neatly this reply is connected to the argument he originally gives for his identity theory in his 1966 paper.
Another problem Lewis addresses in ‘Mad Pain and Martian Pain’ is, in a certain sense, the reverse of the problem of the multiple realizability of the mental. He calls it ‘the problem of mad pain.’ The basic idea is that there can be individual human beings (and, as such, individuals we want to count as capable of being in human pain) who lack the behavioral outputs that are typically associated with certain environmental inputs among humans, or who have atypical behavioral outputs associated with certain environmental inputs. So, for example, when subjected to an electric shock, rather than screaming or moving away from its source, such an individual might sigh, relax her posture, and smile pleasantly. And when eating a piece of cake, she might scream and move away from it. Call such an individual a madman.
Even as early as his 1966 paper, Lewis is careful to characterize the causal role of a mental state as a set of typically associated environmental stimuli and behaviors (1966: 19–20). So the existence of a madman here or there does not cause problems for Lewis’s view. But, of course, one immediately wonders relative to which group these stimuli and behaviors count as typically associated. He says, of the group relative to which we should characterize ‘pain’:
Perhaps (1) it should be us; after all, it’s our concept and our word. On the other hand, if it’s X we’re talking about, perhaps (2) it should be a population that X himself belongs to, and (3) it should preferably be one in which X is not exceptional. Either way, (4) an appropriate population should be a natural kind—a species, perhaps. (1980a: 219–20)
In the case of representative individuals of a population, all four criteria pull together. In the case of the Martian, criterion (1) is outweighed by the other three (whether the characteristic set for pain in Martians is exactly the same as it is in humans or whether there are some differences between them). And in the case of the madman, it is criterion (3) that is outweighed by the other three. There will be certain cases with which Lewis’s account will have difficulties, to be sure. If a lightning strike hits a swamp and produces a one-off creature that is a member of no population apart from that consisting of just itself, Lewis’s account would provide no direction about how to regard a set of associated stimuli and behaviors which are correlated in the creature. That is, it would not tell us which mental state the set is associated with. But Lewis is prepared to live with such difficult cases, as he thinks our intuitions would not be reliable in such a situation anyway. As a result, he thinks that the fact that his theory provides no definitive answers in such cases is not a drawback of it, but, in fact, is in line with our pre-theoretic estimation of such cases.
A final issue worth mentioning is qualia—the subjective nature of an experience, for example, what it feels like to be in the sort of pain caused by a live electrode being put into contact with one’s left thumb. Identity theorists, and physicalists in general, often face the problem of qualia, that is, the allegation that their theory cannot make sense of the idea that there is something that it feels like to be in a particular mental state. One of the most famous statements of this problem is by Frank Jackson, in his paper ‘Epiphenomenal Qualia’ (1982). He asks us to consider an individual, Mary, who has spent her entire life in a black and white room, never seeing any color other than black and white. Nonetheless, she has devoted herself to learning everything she can about color from (black and white) textbooks, television programs, and so forth, and is, at this point, perfectly knowledgeable about the subject. We can suppose she knows every piece of physical information there is to know about electromagnetism, optics, physiology, neuroscience, and so forth, that is related to color and its perception. Jackson then asks us to imagine that one day, Mary steps outside for the first time and sees a red rose. He maintains that she learns something upon doing so that she did not know before, namely, what it is like to see red. Thus, Jackson concludes, not all information is physical information. This poses a problem for the physicalist because, according to the physicalist, this should not be possible: there is nothing to know about color and its perception outside of the complete collection of physical information associated with color and its perception.
Lewis’s response to the qualia problem can be found in his Postscript to ‘Mad Pain and Martian Pain’ (1983b: 130–32), ‘What Experience Teaches’ (1988c), ‘Reduction of Mind’ (1994b), and ‘Should a Materialist Believe in Qualia?’ (1995). He credits it to Laurence Nemirow (1979, 1980, and 1990), and, in short, it is the idea that when Mary exits the room and sees a rose, she does not learn a new piece of information; instead, she gains new abilities. In particular, she gains the ability to make certain comparisons and to imagine certain sorts of objects, abilities she lacked before. Now that she has seen the rose, she can go out into the world and distinguish between things that are the same color as the rose and those which are not. And she can imagine what a red car would look like, even if she has not seen one. These are things she was not able to do before. But they are not pieces of propositional knowledge, in the sense that they are not things that can be expressed by a sentence of a language.
10. Other Work and Legacy
There are numerous aspects of Lewis’s work which this article has not discussed. He has influential views about the nature of dispositions, a discussion of which can be found in ‘Finkish Dispositions’ (1997b). He writes on free will in ‘Are We Free to Break the Laws?’ (1981a). And his discussions of his theory of mental content can be found in, for example, ‘Attitudes De Dicto and De Se’ (1979a) and ‘Reduction of Mind’ (1994b: 421 ff.). In addition to metaphysics, the philosophy of language, and the philosophy of mind, Lewis contributed to other subfields, including epistemology and philosophy of mathematics. The reader can find what Lewis has to say about knowledge in ‘Elusive Knowledge’ (1996b). His main focus in the philosophy of mathematics is on squaring his materialistic commitments with his liberal use of set theory (in, for example, his theory of properties). After all, sets are, prima facie, abstract objects. Lewis’s strategy is to provide an analysis of set theory in mereological terms. The parthood relation does much of the work that the subclass relation does in set theory: a class is, for Lewis, the mereological sum of the singletons of its members, so that its subclasses are literally its parts. With this idea in place, Lewis can make sense of most set-theoretic talk in mereological terms, with the singleton relation left as the one remaining piece of distinctively set-theoretic ideology, which he addresses separately. The interested reader can find discussions of this issue in his book Parts of Classes (1991) and his articles ‘Nominalistic Set Theory’ (1970c) and ‘Mathematics is Megethology’ (1993b).
Lewis discusses central issues in the philosophy of religion, including the ontological argument in ‘Anselm and Actuality’ (1970a), and the problem of evil in ‘Evil for Freedom’s Sake’ (1993a) and the posthumous ‘Divine Evil’ (2007). In the philosophy of science, he discusses inter-theoretic reduction in ‘How to Define Theoretical Terms’ (1970b) and verificationism in ‘Statements Partly About Observation’ (1988b). Lewis also writes extensively on chance and probabilistic reasoning in, for example, ‘Prisoners’ Dilemma Is a Newcomb Problem’ (1979c), ‘A Subjectivist’s Guide to Objective Chance’ (1980b), ‘Causal Decision Theory’ (1981b), ‘Why Ain’cha Rich?’ (1981c), ‘Probabilities of Conditionals and Conditional Probabilities’ (1976a), ‘Probabilities of Conditionals and Conditional Probabilities II’ (1986d), ‘Humean Supervenience Debugged’ (1994a), and ‘Why Conditionalize?’ (1999b). And he discusses certain issues that fall at the intersection of probabilistic and practical reasoning in ‘Desire as Belief’ (1988a) and ‘Desire as Belief II’ (1996a).
Lewis makes contributions to deontic logic, the formal modal logic of obligation and permission, whose operators are interpreted to mean ‘it is obligatory that’ and ‘it is permissible that’; see, for example, ‘Semantic Analyses for Dyadic Deontic Logic’ (1974). Lewis also has well-developed views about ethics, metaethics, and applied ethics. In ‘Dispositional Theories of Value’ (1989d), Lewis develops a materialism-friendly theory of value in terms of things’ dispositions to affect us in appropriate ways (or to generate appropriate attitudes in us) in ideal conditions. These attitudes are certain (intrinsic, as opposed to instrumental) second-order desires. That is, one values something only if she desires that she desires it. As a result, Lewis is officially a subjectivist about value. But he thinks (or at least hopes) that there is enough commonality among moral agents that a more-or-less fixed set of values can be discerned. Lewis does not develop a systematic ethical system. But he delivers critiques of consequentialist ethical theories (according to which what makes an action right or wrong is determined by the nature of its consequences) like utilitarianism (according to which what makes an action right/wrong is that it maximizes/fails to maximize the benefit to the largest number of people). See, for example, ‘Reply to McMichael’ (1978), ‘Devil’s Bargains and the Real World’ (1984), and Plurality (1986b: 128). One general constraint Lewis does make explicit about his positive view is that an ethical theory should be compatible with there being multiple, potentially conflicting, moral values. Similarly, he thinks it might be impossible to provide a binary evaluation of someone’s character as good or bad, overall. It might be that we can only point to respects in which an individual has good or bad character. Nolan (2005: 189) takes it to be likely that Lewis’s positive ethical theory, to the extent it can be discerned in his writings, is a version of virtue ethics, and thus that he bases the rightness or wrongness of a particular act on whether a moral agent with appropriate virtues and in appropriate circumstances would perform it (see, for example, Lewis 1986b: 127). Lewis focuses on several issues in applied ethics, including punishment in ‘The Punishment that Leaves Something to Chance’ (1987) and ‘Do We Believe in Penal Substitution?’ (1997a), tolerance in ‘Academic Appointments: Why Ignore the Advantage of Being Right?’ (1989a) and ‘Mill and Milquetoast’ (1989c), and nuclear deterrence in ‘Devil’s Bargains and the Real World’ (1984), ‘Buy Like a MADman, Use Like a NUT’ (1986a), and ‘Finite Counterforce’ (1989b).
Truly, then, Lewis’s contributions to philosophy range much more widely than his best-known work. It is difficult to summarize Lewis’s legacy. He makes important contributions to understanding probability and probabilistic reasoning, and his work on conditionals—counterfactuals in particular—can only be described as foundational. His work on causation is very important as well. In particular, his move from a simpler counterfactual analysis of causation to one invoking the notion of influence is reflected in more recent interventionist accounts of causation, according to which a cause of an event E is something such that, by manipulating it in some way (for example, by slightly changing the time at which it occurs or the manner in which it occurs), one can modify E. And, as Woodward (2016, sec. 9) notes, interventionist accounts are ultimately counterfactual accounts, and so they are also in this way indebted to Lewis’s earlier work on causation as well as to his work on counterfactuals. While dualism about the mind is much more popular in the first two decades of the twenty-first century than in Lewis’s day, his argument for his identity theory, which appeals to the explanatory closure of the physical world, is an important foil for the dualists who emerged in the 1980s and 90s. And his and Nemirow’s response to the problem of qualia is likewise something those dualists have had to address.
Lewis’s discussion of time and perdurance in Plurality generated a large debate in that area, and to a great extent set its parameters. Recall (see section 4) that he sets out three ways of solving the problem of temporary intrinsics: treating intrinsic properties like shape as relations to times, presentism, and his own worm theory. A great deal of subsequent work has explored the tenability of each of these options, as well as other nearby ones. In addition, Lewis’s paper ‘The Paradoxes of Time Travel’ (1976b) is arguably responsible for an entire sub-literature on that topic.
Lewis’s metaphysics is, by and large, nominalist. But realism about universals is much more popular today than it was in the mid-20th century. As nominalistic as his views are, Lewis makes important moves away from the ideas which formed the environment in which his philosophical development took place. Quine, of course, believed that there is “no entity without identity” (for example, 1969: 23). What he intended by this is that we must have clear identity conditions for any entity whose existence we posit. This is one of the reasons why Quine was happy to recognize the existence of sets, which are individuated extensionally, that is, according to which members they have, but was skeptical of such things as properties. Lewis makes properties extensional by identifying them with sets, but goes a step further by allowing their extensions to range across all possibilia, rather than just actual entities. Lewis then goes even further in conceding, in ‘New Work for a Theory of Universals’ (1983a), that universals can do things which properties, as conceived by Lewis, cannot do. His basic distinction between properties which are perfectly natural and those which are not is rather anti-nominalistic, and this position can be understood as a bridge connecting the Quinean extensional picture of the world with the newer hyperintensional picture of it, which allows for distinctions among entities, such as properties or propositions, that are not only extensionally equivalent (they apply to the same things, or are all true or false, at the actual world) but also intensionally equivalent (they do so, or are so, at every possible world). An example is the pair of properties, mentioned in section 3, being a triangular polygon and being a trilateral (three-sided) polygon. Sider (2011) generalizes Lewis’s idea from properties, which are the worldly correlates of predicates, to other sorts of entities, including the worldly correlates of predicate modifiers, sentential connectives, and quantifiers. He ends up with a very general notion of joint-carvingness, which is a feature of certain of our linguistic expressions, and he uses the notion to characterize the notion of fundamentality, as Lewis does with naturalness (for Lewis, the perfectly natural properties are the fundamental properties, all other properties being definable in terms of them—see, for example, 1994a: 474). It is hard to say exactly what the philosophical world today would be like without Lewis. But we can be sure that it would be very different from the way it is.
11. References and Further Reading
Note: Many of the papers below have been reprinted, sometimes with postscripts, in one of the collections Lewis 1983b, 1986c, 1998, 1999a, and 2000b; below, only the first appearance is cited.
a. Primary Sources
Lewis, David K. 1966. An Argument for the Identity Theory. Journal of Philosophy 63, 17–25.
Lewis, David K. 1968. Counterpart Theory and Quantified Modal Logic. Journal of Philosophy 65, 113–26.
Lewis, David K. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Lewis, David K. 1970a. Anselm and Actuality. Noûs 4, 175–88.
Lewis, David K. 1970b. How to Define Theoretical Terms. Journal of Philosophy 67, 427–46.
Lewis, David K. 1970c. Nominalistic Set Theory. Noûs 4, 225–40. Reprinted in Lewis 1998, 186–202.
Lewis, David K. 1971. Counterparts of Persons and Their Bodies. Journal of Philosophy 68, 203–11.
Lewis, David K. 1973a. Causation. Journal of Philosophy 70, 556–67.
Lewis, David K. 1973b. Counterfactuals. Oxford: Blackwell.
Lewis, David K. 1974. Semantic Analyses for Dyadic Deontic Logic. In Sören Stenlund (ed.), Logical Theory and Semantic Analysis: Essays Dedicated to Stig Kanger on His Fiftieth Birthday. Dordrecht: Reidel.
Lewis, David K. 1975. Languages and Language. In Keith Gunderson (ed.), Minnesota Studies in the Philosophy of Science. University of Minnesota Press, 3–35.
Lewis, David K. 1976a. Probabilities of Conditionals and Conditional Probabilities. Philosophical Review 85, 297–315.
Lewis, David K. 1976b. The Paradoxes of Time Travel. American Philosophical Quarterly 13, 145–52.
Lewis, David K. 1978. Reply to McMichael. Analysis 38, 85–86.
Lewis, David K. 1979a. Attitudes De Dicto and De Se. The Philosophical Review 88, 513–43.
Lewis, David K. 1979b. Counterfactual Dependence and Time’s Arrow. Noûs 13, 455–76.
Lewis, David K. 1979c. Prisoners’ Dilemma Is a Newcomb Problem. Philosophy and Public Affairs 8, 235–40.
Lewis, David K. 1980a. Mad Pain and Martian Pain. In Ned Block (ed.), Readings in Philosophy of Psychology, Vol. 1. Cambridge, MA: Harvard University Press, 216–22.
Lewis, David K. 1980b. A Subjectivist’s Guide to Objective Chance. In Richard C. Jeffrey (ed.), Studies in Inductive Logic and Probability, Vol. II. Berkeley, CA: University of California Press, 263–93.
Lewis, David K. 1981a. Are We Free to Break the Laws? Theoria 47, 113–21.
Lewis, David K. 1981b. Causal Decision Theory. Australasian Journal of Philosophy 59, 5–30.
Lewis, David K. 1981c. Why Ain’cha Rich? Noûs 15, 377–80.
Lewis, David K. 1983a. New Work for a Theory of Universals. Australasian Journal of Philosophy 61, 343–77.
Lewis, David K. 1983b. Philosophical Papers, Vol. I. Oxford: Oxford University Press.
Lewis, David K. 1984. Devil’s Bargains and the Real World. In Douglas MacLean (ed.), The Security Gamble: Deterrence in the Nuclear Age. Totowa, NJ: Rowman and Allenheld, 141–154.
Lewis, David K. 1986a. Buy Like a MADman, Use Like a NUT. QQ 6, 5–8.
Lewis, David K. 1986b. On the Plurality of Worlds. Oxford: Blackwell.
Lewis, David K. 1986c. Philosophical Papers, Vol. II. Oxford: Oxford University Press.
Lewis, David K. 1986d. Probabilities of Conditionals and Conditional Probabilities II. Philosophical Review 95, 581–89.
Lewis, David K. 1987. The Punishment that Leaves Something to Chance. In Proceedings of the Russellian Society (University of Sydney) 12, 81–97. Also in Philosophy and Public Affairs 18, 53–67.
Lewis, David K. 1988a. Desire as Belief. Mind 97, 323–32.
Lewis, David K. 1988b. Statements Partly About Observation. Philosophical Papers 17, 1–31.
Lewis, David K. 1988c. What Experience Teaches. Proceedings of the Russellian Society (University of Sydney) 13, 29–57.
Lewis, David K. 1989a. Academic Appointments: Why Ignore the Advantage of Being Right? In Ormond Papers, Ormond College, University of Melbourne.
Lewis, David K. 1989b. Finite Counterforce. In Henry Shue (ed.), Nuclear Deterrence and Moral Restraint. Cambridge: Cambridge University Press, 51–114.
Lewis, David K. 1989c. Mill and Milquetoast. Australasian Journal of Philosophy 67, 152–71.
Lewis, David K. 1989d. Dispositional Theories of Value. Proceedings of the Aristotelian Society, Supplementary Volume 63, 113–37.
Lewis, David K. 1991. Parts of Classes. Oxford: Blackwell.
Lewis, David K. 1993a. Evil for Freedom’s Sake. Philosophical Papers 22, 149–72.
Lewis, David K. 1993b. Mathematics is Megethology. Philosophia Mathematica 3, 3–23.
Lewis, David K. 1994a. Humean Supervenience Debugged. Mind 103, 473–90.
Lewis, David K. 1994b. Reduction of Mind. In Samuel Guttenplan (ed.), A Companion to the Philosophy of Mind. Oxford: Blackwell, 412–31.
Lewis, David K. 1995. Should a Materialist Believe in Qualia? Australasian Journal of Philosophy 73, 140–44.
Lewis, David K. 1996a. Desire as Belief II. Mind 105, 303–13.
Lewis, David K. 1996b. Elusive Knowledge. Australasian Journal of Philosophy 74, 549–67.
Lewis, David K. 1997a. Do We Believe in Penal Substitution? Philosophical Papers 26, 203–09.
Lewis, David K. 1997b. Finkish Dispositions. The Philosophical Quarterly 47, 143–58.
Lewis, David K. 1998. Papers in Philosophical Logic. Cambridge: Cambridge University Press.
Lewis, David K. 1999a. Papers on Metaphysics and Epistemology. Cambridge: Cambridge University Press.
Lewis, David K. 1999b. Why Conditionalize? In Lewis 1999a. (Written in 1972.)
Lewis, David K. 2000a. Causation as Influence. Journal of Philosophy 97, 182–97.
Lewis, David K. 2000b. Papers in Ethics and Social Philosophy. Cambridge: Cambridge University Press.
Lewis, David K. 2002. Tensing the Copula. Mind 111, 1–13.
Lewis, David K. 2004. Causation as Influence (extended version). In John Collins, Ned Hall, and L. A. Paul (eds), Causation and Counterfactuals. Cambridge, MA: MIT Press, 75–106.
Lewis, David K. 2007. Divine Evil. In Louise M. Antony (ed.), Philosophers without Gods: Meditations on Atheism and the Secular Life. Oxford: Oxford University Press.
b. Secondary Sources
Armstrong, David M. 1978a. Universals and Scientific Realism, Vol. I: Nominalism and Realism. Cambridge: Cambridge University Press.
Armstrong, David M. 1978b. Universals and Scientific Realism, Vol. II: A Theory of Universals. Cambridge: Cambridge University Press.
Armstrong, David M. 1983. What Is a Law of Nature? Cambridge: Cambridge University Press.
Churchland, Paul. 1981. Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy 78, 67–90.
Fine, Kit. 1975. Critical Notice of Counterfactuals. Mind 84, 451–58.
Goodman, Nelson. 1947. The Problem of Counterfactual Conditionals. Journal of Philosophy 44, 113–28.
Goodman, Nelson. 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press.
Hall, Ned. 2004. Two Concepts of Causation. In John Collins, Ned Hall, and L.A. Paul (eds), Causation and Counterfactuals. Cambridge, MA: The MIT Press, 225–76.
Kripke, Saul A. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
Nemirow, Laurence. 1979. Functionalism and the Subjective Quality of Experience. Doctoral Dissertation, Stanford University.
Nemirow, Laurence. 1980. Review of Thomas Nagel, Mortal Questions. Philosophical Review 89, 475–76.
Nemirow, Laurence. 1990. Physicalism and the Cognitive Role of Acquaintance. In William G. Lycan (ed.), Mind and Cognition. Oxford: Blackwell.
Nolan, Daniel. 2005. David Lewis. Chesham: Acumen.
Quine, William Van Orman. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
Sider, Theodore. 1996. All the World’s a Stage. Australasian Journal of Philosophy 74, 433–53.
Sider, Theodore. 2001. Four-Dimensionalism: An Ontology of Persistence and Time. Oxford: Oxford University Press.
Sider, Theodore. 2011. Writing the Book of the World. Oxford: Oxford University Press.
Stalnaker, Robert C. 1968. A Theory of Conditionals. In Nicolas Rescher (ed.), Studies in Logical Theory, American Philosophical Quarterly Monograph Series, Vol. 2. Oxford: Blackwell, 98–112.
van Inwagen, Peter. 1990. Material Beings. Ithaca, NY: Cornell University Press.
Weatherson, Brian. 2016. David Lewis. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.
Woodward, James. 2016. Causation and Manipulability. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.
c. Further Reading
Nolan, Daniel. 2005. David Lewis. Chesham: Acumen.
Jackson, Frank and Graham Priest. 2004. Lewisian Themes: The Philosophy of David K. Lewis. Oxford: Oxford University Press.
Loewer, Barry and Jonathan Schaffer. 2015. A Companion to David Lewis. Oxford: Blackwell.
Weatherson, Brian. 2016. David Lewis. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.
Truth-conditional semantics explains meaning in terms of truth-conditions. The meaning of a sentence is given by the conditions that must obtain in order for the sentence to be true. The meaning of a word is given by its contribution to the truth-conditions of the sentences in which it occurs.
What a speaker says by the utterance of a sentence depends on the meaning of the uttered sentence. Call what a speaker says by the utterance of a sentence the content of the utterance. Natural languages contain many words whose contribution to the content of utterances varies depending on the contexts in which they are uttered. The typical example of words of this kind is the pronoun ‘I’. Utterances of the sentence ‘I am hungry’ change their contents depending on who the speaker is. If John is speaking, the content of his utterance is that John is hungry, but if Mary is speaking, the content of her utterance is that Mary is hungry.
The words whose contribution to the contents of utterances depends on the context in which the words are uttered are called context-sensitive. Their meanings guide speakers in using language in particular contexts to express particular contents.
This article presents the main theories in philosophy of language that address context-sensitivity. Section 1 presents the orthodox view in truth-conditional semantics. Section 2 presents linguistic pragmatism, also known as ‘contextualism’, which comprises a family of theories that converge on the claim that the orthodox view is inadequate to account for the complexity of the relations between meanings and contexts. Sections 3 and 4 present indexicalism and minimalism, which from two different perspectives try to resist the objections raised by linguistic pragmatism against the orthodox view. Section 5 presents relativism, which provides a newer conceptualization of the relations between meanings and contexts.
1. The Orthodox View in Truth-Conditional Semantics
a. Context-Sensitive Expressions and the Basic Set
The orthodox view in truth-conditional semantics maintains that the content (proposition, truth-condition) of an utterance of a sentence is the result of assigning contents, or semantic values, to the elements of the sentence uttered in accord with their meanings and combining them in accord with the syntactic structure of the sentence. The content of the utterance is determined by the conventional meanings of the words that occur in the sentence.
Conventional meanings are divided into two kinds. Meanings of the first kind determine semantic values that remain constant in all contexts of utterance. Meanings of the second kind provide guidance for the speaker to exploit information from the context of utterance to express semantic values. Linguistic expressions governed by meanings of the second kind are context-sensitive and can be used to express different semantic values in different contexts of utterance. The following is a list of some context-sensitive expressions (Donaldson and Lepore 2012):
Personal pronouns: I, you, she
Demonstratives: this, that
Adjectives: present, local, foreign
Adverbs: here, now, today
Nouns: enemy, foreigner, native
Cappelen and Lepore (2005) call the set of expressions that exhibit context-sensitivity in their conventional meaning the Basic Set. Compare the following pair of utterances:
(1) I am hungry (uttered by John).
(2) John is hungry (uttered by Mary).
Utterance (1) and utterance (2) have the same truth-conditional content. Both are true if and only if John is hungry. Yet, the sentence ‘I am hungry’ and the sentence ‘John is hungry’ have different meanings. The meaning of the first-person pronoun ‘I’ prescribes that only the speaker can utter it to refer to herself. Only John can utter the sentence ‘I am hungry’ to say that John is hungry. In a context where the speaker is not John, the sentence ‘I am hungry’ cannot be uttered to say that John is hungry. The meaning of the proper name ‘John’, instead, allows for speakers in different contexts of utterance to refer to John. In all contexts of utterance the sentence ‘John is hungry’ can be uttered to say that John is hungry.
b. Following Kaplan: Indexes and Characters
Since David Kaplan’s works (1989a, 1989b), formal semantics has treated the conventional meaning of a word as a function from an index, which represents features of the context of utterance, to a semantic value. The features of the context of utterance include who is speaking, when, where, the object the speaker refers to with a demonstrative, and the possible world where the utterance takes place. Adopting Kaplan’s terminology, philosophers call the function from indexes to semantic values character. The semantic values of the words in a sentence relative to an index are composed into a function that distributes truth-values at points of evaluation, that is, pairs of possible worlds and times. The formal semantic machinery determines the condition under which a sentence relative to a given index is true at a world and a time. For example, John’s utterance (1) is represented as the pair formed of the sentence ‘I am hungry’ and an index i that contains John as speaker. The semantic machinery determines the truth-condition of this pair so that the sentence ‘I am hungry’ at the index i is true at a possible world w and a time t if and only if the speaker of i is hungry in w at t; that is, if and only if John is hungry in w at time t. If Mary uttered the sentence ‘I am hungry’, another index i* with Mary as speaker would be needed to represent her utterance. The semantic machinery would ascribe to the sentence ‘I am hungry’ at the index i* the content that is true at a possible world w and a time t if and only if Mary is hungry in w at time t.
In formal semantics, then, context-sensitive meanings are characters that vary depending on the indexes that represent features of the contexts of utterance, where indexes are tuples of slots, or parameters, to be filled in in order for sentences at indexes to have a truth-conditional content. The meanings of context-insensitive expressions, instead, are characters that remain constant in all indexes. For example, the meaning of the proper name ‘John’ is a constant character that returns John as semantic value in all indexes. No matter who is speaking, when, or where, John is the semantic value of the proper name ‘John’, and the sentence ‘John is hungry’, relative to all indexes, is true at a world w and time t if and only if John is hungry in w at time t.
It is convenient here to introduce an aspect relevant to section 5. Since the indexes that are used to represent features of contexts of utterance contain possible worlds and times, the semantic machinery distributes unrelativised truth-values to index-sentence pairs. A sentence S at index i is true (simpliciter) if and only if S is true at the possible world of i and at the time of i; see Predelli (2005: 22). For example, if John utters the sentence ‘I am hungry’ at noon on 23 November 2019, the index that represents the features of John’s context of utterance contains the time noon on 23 November 2019 and the actual world. John’s utterance is true (simpliciter) if and only if John is hungry at noon on 23 November 2019 in our actual world.
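The character/index machinery of this subsection can be summarized in a small sketch (a toy model with record-like indexes and a stipulated set of facts; it is not Kaplan’s full apparatus, and all names are illustrative). The character of ‘I am hungry’ yields different contents at different indexes, the character of ‘John is hungry’ yields the same content at every index, and truth simpliciter evaluates a sentence at its own index’s world and time.

```python
from typing import Callable, NamedTuple

class Index(NamedTuple):  # selected features of the context of utterance
    speaker: str
    world: str
    time: str

# A content maps a point of evaluation (a world-time pair) to a truth-value.
Content = Callable[[str, str], bool]

# Stipulated facts: who is hungry at which world and time.
HUNGRY = {('John', 'w_actual', 'noon'), ('Mary', 'w_actual', 'midnight')}

def character_I_am_hungry(i: Index) -> Content:
    """Variable character: the semantic value of 'I' depends on the index."""
    return lambda w, t: (i.speaker, w, t) in HUNGRY

def character_John_is_hungry(i: Index) -> Content:
    """Constant character: 'John' gets the same semantic value at every index."""
    return lambda w, t: ('John', w, t) in HUNGRY

def true_simpliciter(character, i: Index) -> bool:
    """Truth at an index: evaluate the content at the index's own world and time."""
    return character(i)(i.world, i.time)

i_john = Index(speaker='John', world='w_actual', time='noon')
i_mary = Index(speaker='Mary', world='w_actual', time='noon')

print(character_I_am_hungry(i_john)('w_actual', 'noon'))     # True
print(character_I_am_hungry(i_mary)('w_actual', 'noon'))     # False: new index, new content
print(character_John_is_hungry(i_mary)('w_actual', 'noon'))  # True: same content at any index
print(true_simpliciter(character_I_am_hungry, i_john))       # True (simpliciter)
```

Saturation, discussed next, corresponds in this sketch to the step of building the Index record: the speaker parameter must be filled in before the character can return a content at all.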
c. Context-Sensitivity and Saturation
The orthodox truth-conditional view in semantics draws the distinction between the meaning of an expression type and the content of an utterance of the expression. The meaning of the expression type is the linguistic rule that governs the use of the expression. Context-insensitive expressions are governed by linguistic rules that determine their contents (semantic values), which remain invariant in all contexts of utterance. Context-sensitive expressions, instead, are governed by linguistic rules that prescribe how the speaker can use them to express contents in contexts of utterance.
The meanings of context-sensitive expressions specify what kinds of contextual factors play certain roles with respect to utterances. More precisely, the meanings of context-sensitive expressions fix the parameters that have to be filled in in order for utterances to have contents. Philosophers and linguists use the technical term saturation for what the speaker does by filling in the demanded parameters with values taken from contextual factors. Indexicals are typical examples of context-sensitive expressions. For example, the meaning of the pronoun ‘I’ establishes that an utterance of it refers to the agent that produces it. The meaning of the demonstrative ‘that’ establishes that an utterance of it refers to the object that plays the role of demonstratum in the context of utterance. Thus, the meaning of ‘I’ demands that the speaker fill in an individual, typically herself, as the value of the parameter speaker of the utterance. And the meaning of ‘that’ demands that the speaker fill in a particular object she has in mind as the value of the parameter demonstratum.
In formal semantics the parameters that are filled in with values are represented with indexes, and the meanings of expressions are functions—characters—from indexes to contents. The meanings of context-insensitive expressions are constant characters, while the meanings of context-sensitive expressions are variable characters. If a sentence contains no context-sensitive expressions, it can be uttered to express the same content in all contexts of utterance. On the contrary, if a sentence contains context-sensitive expressions, it can be used to express different contents in different contexts of utterances.
d. Grice on What is Said and the Syntactic Constraint
One of the main tenets of the orthodox truth-conditional view is that all context-sensitivity is linguistically triggered in sentences or in their logical forms. The presence of each component of the truth-conditional content of an utterance of a sentence is mandatorily governed by a linguistic element occurring in the uttered sentence or in its logical form. For this reason, some philosophers equate the distinction between the meanings of expression types and the contents of utterances with Paul Grice’s (1989) distinction between sentence meaning and what is said by an utterance of a sentence. The sentence meaning is given by the composition of the meanings of the words that occur in the sentence. What is said corresponds to the truth-conditional content that the speaker expresses by undertaking the processes of disambiguation, reference assignment, and saturation that are required by her linguistic and communicative intentions and by the meanings of the uttered words.
Grice held that what is said is part of the speaker’s meaning. The speaker’s meaning is the content that the speaker intends to communicate by an utterance of a sentence. In Grice’s view, the speaker’s meaning comprises two parts: What is said and what is implicated. What is said is the content that the speaker explicitly and directly communicates by the utterance of a sentence. What is implicated is the content the speaker intends to convey indirectly. Grice called the contents that are indirectly conveyed implicatures. Implicatures can be inferred from what is said and general principles governing communication: the cooperative principle and its maxims. To illustrate Grice’s distinctions, suppose that at a party A, pointing to Bob and speaking to B, utters the following sentence:
(3) That guest is thirsty.
Following Grice, the utterance of (3) can be analysed at three distinct levels. (i) The level of sentence meaning is given by the linguistic conventions that govern the use of the words in the sentence. Due to linguistic competence alone, the hearer B understands that A’s utterance is true if and only if the individual, to whom A refers with the complex demonstrative ‘that guest’, is thirsty. (ii) The second level is given by what A says, that is, the truth-conditional content A’s utterance expresses. What is said—the content of A’s utterance—is that Bob is thirsty. To understand this content, B must consider A’s expressive and communicative intentions. B must understand that A has Bob in mind and wants to refer to him. To do so, B needs to rely on his pragmatic competence and contextual information. Mere linguistic competence is not enough. (iii) Finally, there is the level of what is meant through a conversational implicature. A intends that B offer Bob some champagne. Grice’s idea is that to understand what A intended to communicate, B must first understand the content of what A said—that Bob is thirsty—and then understand the implicature that it would be nice to offer Bob some champagne.
One very important aspect of Grice’s view is that each element that enters the content of what is said corresponds to some linguistic expression in the sentence. Grice maintained that what is said is “closely related to conventional meanings of words” (1989: 25). Grice imposed a syntactic constraint on what is said, according to which each element of what is said must correspond to an element of the sentence uttered. Carston (2009) speaks of the ‘Isomorphism Principle’, which states that if an utterance of a sentence S expresses the propositional content P, then the constituents of P must correspond to the semantic values of some constituents of S or of its logical form.
e. Semantic Contents of Utterances
Some philosophers reject the equation of the notion of content of an utterance with Grice’s notion of what is said. For example, Korta and Perry (2007) maintain that the content of an utterance is determined by the conventional meanings of the words the speaker utters and by the fact that the speaker undertakes all the semantic burdens that are demanded by those meanings, in particular disambiguation, reference assignment, and saturation of context-sensitive expressions. Korta and Perry call the content of an utterance so determined locutionary content (see also Bach 2001) and argue that there are clear cases in which the locutionary content does not coincide with Grice’s what is said, which is always part of what the speaker intends to communicate, that is, the speaker’s meaning. Irony is a typical example of this distinction. When, pointing to X, a speaker utters the sentence:
(4) He is a fine friend
ironically, the speaker does not intend to communicate that X is a fine friend, but the opposite. Nonetheless, without identifying the referent of ‘he’ and the literal content of ‘is a fine friend’, that is, without understanding the locutionary content of (4), the hearer is not able to understand the speaker’s ironic comment.
A detailed survey of the debate over Grice’s notion of what is said is beyond the scope of this article. It is important to remark here that, according to the orthodox truth-conditional view—at least when speakers use language literally—what is said by an utterance of a sentence corresponds to the content that is determined by the conventional meanings of the words in the uttered sentence: The speaker undertakes all the semantic burdens that are demanded by those meanings, such as disambiguation, reference assignment, and saturation of context-sensitive expressions. When a speaker uses language literally, the content of an utterance of a sentence is what one gets by composing the semantic values of the expressions that occur in it, in accord with their conventional meanings and the syntactic structure of the sentence. This content is fully propositional, with a determinate truth-condition. This picture, which underlies the orthodox truth-conditional view in semantics, has been challenged by philosophers who call for a new theoretical approach, called linguistic pragmatism, which expands the truth-conditional role of pragmatics. The following section presents it.
2. Departing from the Orthodox View: Linguistic Pragmatism
a. Underdetermination of Semantic Contents
Neale (2004) coined the term ‘linguistic pragmatism’, though some philosophers and linguists prefer the term ‘contextualism’. Linguistic pragmatism comprises a family of theories (Bach 1994, 2001, Carston 2002, Recanati 2001, 2004, Sperber and Wilson 1986) that converge on one main thesis, that of semantic underdetermination. Linguistic pragmatists maintain that the meanings of most expressions—perhaps all, according to radical versions of linguistic pragmatism—underdetermine their contents in contexts, and pragmatic processes that are not linguistically governed are required to determine them. The main point of linguistic pragmatism is the distinction between semantic underdetermination and indexicality.
The orthodox view accepts that context-sensitivity is codified in the meanings of indexical expressions, which demand saturation processes. Linguistic pragmatists too accept this form of context-sensitivity, but according to them indexicality does not exhaust context-sensitivity. Linguistic pragmatists say that the variability of contents in contexts of many expressions is not codified in linguistic conventions. Rather, the variability of contents in contexts is due to the fact that the meanings of the expressions underdetermine their contents. Speakers must complete the meanings of the expressions with contents that are not determined by linguistic conventions codified in those meanings. The pragmatic operations that intervene in the process of completing the contents in context are not governed by conventions of the language, that is, by linguistic information, but work on more general contextual information.
Linguistic pragmatists make use of three kinds of arguments to support their view:
(i) Context-shifting arguments test people’s intuitions about the content of sentences in actual or hypothetical contexts of utterance. If people have the intuition that a sentence S expresses differing contents in different contexts, despite the fact that no overt context-sensitive expression occurs in S, it is evidence that some expression that occurs in S is semantically underdetermined. Consider the following example. Mark is 185 cm tall, and George utters the sentence:
(5) Mark is short
in a conversation about the average height of basketball players and then in a conversation about the average height of American citizens. People have the intuition that what George said in the first context is true while what he said in the second context is false. Linguistic pragmatists draw the conclusion that the content of (5) varies through contexts of utterance, despite the fact that the adjective ‘short’ is not an overt context-sensitive expression. They argue that the content of ‘short’ is underdetermined by its conventional meaning and explain the variation in content from context to context as a result of pragmatic processes that are not linguistically governed but nonetheless complete the meaning of ‘short’.
(ii) Incompleteness arguments too test people’s intuitions about the contents of sentences in context, pointing to people’s inability to evaluate the truth-value of a sentence without taking contextual information into account. Suppose George utters the sentence:
(6) Anna is ready.
People cannot say whether George’s utterance is true or false without considering what Anna is said to be ready for. The conclusion now is that (6) does not express a full propositional content with determinate truth-conditions. There is no such thing as Anna’s being ready simpliciter. The explanation is semantic underdetermination: The adjective ‘ready’ does not provide an invariant contribution to a full propositional content and it does not provide guidance to determine such a contribution either, because it is not an overt context-sensitive expression. The enrichment that is required to determine a full truth-conditional content is the result of a pragmatic process that is not governed by the meaning of ‘ready’.
(iii) Inappropriateness arguments spot the difference between the content that is strictly encoded in a sentence and the contents that are expressed by utterances of that sentence in different contexts. Suppose a math teacher utters the following sentence in the course of a conversation about her class:
(7) There are no French girls.
People usually understand the math teacher to say that there are no French girls attending the math class. Some philosophers say that in this case there is an invariant semantic content composed out of the meanings of the words in the sentence: French girls do not exist. However, it seems awkward both to claim that in uttering (7) the speaker says that French girls do not exist and to claim that hearers understand (7) as denying the existence of French girls in general. On the contrary, it seems natural to suppose that both speakers and hearers restrict the interpretation of (7) to a particular domain, such as the students attending the math class.
b. Completions and Expansions
The claim on which all versions of linguistic pragmatism agree is that very often the content of an utterance is richer than the content obtained by composing the semantic values of the expressions in the uttered sentence. Adopting terminology from Bach (1994), it is common to distinguish two cases of pragmatic enrichment: completions and expansions.
With completions, the content determined by the meanings of the expressions that occur in a sentence is incomplete because it lacks full truth-conditions. These cases often recur in context-shifting arguments and incompleteness arguments:
(5) Mark is short.
(6) Anna is ready.
People do not know what conditions a person must meet to be short or ready simpliciter, so it appears there are no determinate conditions making a person so. To obtain a truth-conditional content it is necessary to add elements that do not correspond to any expression in (5) and (6). Linguistic pragmatists maintain that what is said is obtained by completing the content composed from the meanings of the expressions in the sentence with material taken from the context. For instance, the contents of (5) and (6) could be completed in ways that might be expressed as follows:
(5*) Mark is short with respect to the average height of basketball players.
(6*) Anna is ready to climb Eiger’s North Face.
With expansions, the content of an utterance of a sentence is an enrichment of the literal content obtained by composing the semantic values of the expressions in the sentence. Some interesting cases of expansions are employed in inappropriateness arguments. Consider the following examples:
(8) All the students got an A.
(9) Anna has nothing to wear.
In these cases, there is a complete content that does not correspond to the content of the utterance. (8) expresses the content that all students in existence got an A, and (9) expresses the content that Anna has no clothes to wear at all. However, these sentences are usually used to express different contents. For example, (8) can be used by the logic professor to say that all students in her class got an A, and (9) can be used to say that Anna has no appropriate dress for a particular occasion.
c. Saturation and Modulation
Linguistic pragmatists maintain that completions and expansions are obtained through pragmatic processes that are not linguistically driven by conventional meanings. Recanati draws a distinction between saturation and modulation: Processes of saturation are mandatory pragmatic processes required to determine the semantic contents of linguistic expressions (bottom-up or linguistically driven processes). Processes of modulation are optional pragmatic processes that yield completions and expansions (top-down or ‘free’ processes). A toy sketch of this contrast is given at the end of this subsection.
Pragmatic processes of saturation are directed and governed by the linguistic meanings of context-sensitive expressions. For instance, the linguistic meaning of the demonstrative ‘that’ demands the selection of a salient object in the context of utterance to determine the referent of the demonstrative. In contrast, pragmatic processes of modulation are optional because they are not activated by linguistic meanings. They are not activated for the simple reason that the elements that form completions and expansions do not match any linguistic expression in the sentence. Recanati distinguishes three types of pragmatic processes of modulation:
(i) Free enrichment is a process that narrows the conditions of application of linguistic expressions. Some of the above examples are cases of free enrichment. In (8) the domain of the quantifier ‘all students’ is restricted to the logic class and in (9) the domain of ‘nothing to wear’ is restricted to appropriate dresses for a given occasion. In (5) the conditions of application of the adjective ‘short’ are restricted to people whose height is lower than that of the average basketball player. In (6) the conditions of application of the adjective ‘ready’ are restricted to people who have acquired the technical and physical ability to climb Eiger’s North Face.
(ii) Loosening is a process that widens the conditions of application of words, allowing for a degree of approximation. Here is one example used by Recanati:
(10) The ATM swallowed my credit card.
Literally speaking, an ATM cannot swallow anything because it does not have a digestive system. In this case, the conditions of application of the verb ‘swallow’ are made loose so as to include a wider range of actions. Another example of loosening is the following:
(11) France is hexagonal.
This sentence does not say that the borders of France trace a perfect hexagon, but that they do so approximately.
(iii) Semantic transfer is a process that maps the meaning of an expression onto another meaning. The following is an example of semantic transfer. Suppose a waiter in a bar says to his boss:
(12) The ham sandwich left without paying.
Through a process of modulation, the meaning of the phrase ‘the ham sandwich’ is mapped onto the meaning of the phrase ‘the customer who ordered the ham sandwich’.
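The saturation/modulation contrast announced above can be made vivid with a small programmatic sketch. The following Python fragment is only an illustration: the representation of contexts and the functions saturate_I and modulate_ready are invented stand-ins, not part of Recanati’s formal apparatus. Saturation is modeled as a mandatory step triggered by a word’s meaning; modulation as an optional transformation applied only when the context supplies suitable material.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Context:
    speaker: str
    salient_activity: Optional[str] = None  # wide, intention-dependent feature

# Saturation: mandatory and linguistically driven; the meaning of 'I'
# demands a contextual value before any content is available at all.
def saturate_I(ctx: Context) -> str:
    return ctx.speaker

# Modulation: optional and 'free'; nothing in the sentence triggers it.
# If the context makes an activity salient, free enrichment narrows
# the content of 'ready'; otherwise the content is left untouched.
def modulate_ready(content: str, ctx: Context) -> str:
    if ctx.salient_activity is not None:
        return content + " to " + ctx.salient_activity
    return content

ctx = Context(speaker="George", salient_activity="climb Eiger's North Face")
print(saturate_I(ctx))                       # George (mandatory)
print(modulate_ready("Anna is ready", ctx))  # enriched content (optional)
```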
d. Core Ideas and Differences among Linguistic Pragmatists
The orthodox truth-conditional view distinguishes two kinds of pragmatic processes, primary and secondary. Primary pragmatic processes contribute to determining the contents of utterances by supplying values for context-sensitive expressions. Secondary pragmatic processes contribute to conversational implicatures and are activated after the composition of the contents of utterances has been accomplished. The fundamental aspect of the orthodox view that linguistic pragmatists reject is the idea that primary pragmatic processes are only processes of saturation, which are activated and driven by conventional meanings of words. Linguistic pragmatists affirm that primary pragmatic processes also include processes of modulation that are not encoded in linguistic meanings. According to linguistic pragmatism, the process of truth-conditional composition that gives the contents of utterances is systematically underdetermined by linguistic meanings.
The different versions of linguistic pragmatism are unified by their criticism of the orthodox view. Recanati calls the content of an utterance in the pragmatist conception ‘pragmatic truth-conditions’, Bach speaks of ‘implicitures’, Carston of ‘explicatures’. There are important and substantive differences among these notions. For Bach an impliciture is a pragmatic enrichment of the strict semantic content that is determined by linguistic meanings alone and can be truth-conditionally incomplete. The strict semantic content is like a template that needs to be filled. Recanati argues that Bach’s strict semantic content is only a theoretical abstraction that does not perform any proper role in the computation of what is said. Carston and relevance theorists like Sperber and Wilson adopt a similar view, but—in contrast with Recanati—they affirm that primary and secondary pragmatic processes are, from a cognitive point of view, processes of the same kind, explained by the principle of relevance, according to which one accepts the interpretation that satisfies the expectation of relevance with the least effort.
However, there is something on which Bach, Recanati, Carston, Sperber and Wilson all agree: Very often, semantic interpretation alone gives at most semantic schemata, and only with the help of pragmatic processes of modulation can a complete propositional content be obtained.
Finally, the most radical views of Searle (1978), Travis (2008), and Unnsteinsson (2014) claim that conventional meanings do not exist. Speakers rely upon models of past applications of words and any new interpretation of a word arises from a process of modulation from one of its past applications. The latest works by Carston (2019) tend to develop a similar view. Radical linguistic pragmatists reject even the idea that semantics provides schemata to be pragmatically enriched by modulation processes. In their view, the difficulty is to explain what such an incomplete semantic content might be for many expressions. Think, for example, of ‘red’. It is difficult to individuate a semantic content, no matter how incomplete, that is shared in ‘red car’, ‘red hair’, ‘red foliage’, ‘red rashes’, ‘red light’, ‘red apple’, etc. It is even more difficult to explain how this alleged incomplete content could be enriched into the contents that people convey with those expressions.
The next section is devoted to indexicalism, a family of theories that react against linguistic pragmatism.
3. Defending the Orthodox View: Indexicalism
a. Extending Indexicality and Polysemy
Indexicalists attempt to recover the orthodox truth-conditional approach in semantics from the charge of semantic underdetermination raised by linguistic pragmatists. Indexicalists reject the thesis of semantic underdetermination and explain the variability of utterances’ contents in contexts with the resources of the orthodox truth-conditional view, mainly by enlarging the range of indexicality and the range of polysemy. The typical examples of variability of contents in contexts invoked by linguistic pragmatists are the following:
(13) John is tall.
(14) Mary is ready.
(15) It is raining.
(16) Everybody got an A.
(17) Mary and John got married and had a child.
In the course of a conversation about basketball players, an utterance of (13) might express the content that John is tall with respect to the average height of basketball players. In the course of a conversation about the next logic exam, an utterance of (14) might express the content that Mary is ready to take the logic exam. If Mary utters (15) while in Rome, her utterance might express the content that it is raining in Rome at the time of the utterance. If the professor of logic utters (16), her utterance might express the content that all the students in her class got an A. Mostly, if a speaker utters (17), she expresses the content that Mary and John got married before having a child.
Linguistic pragmatists argue that, in order for utterances of sentences like (13)-(17) to express those contents, the conventional meanings encoded in the sentences are not sufficient. Linguistic pragmatists hold that the presence in the content expressed of a comparison class for ‘tall’, of a course of action for ‘ready’, of a location for weather reports, of a restricted domain for quantified noun phrases, and of the temporal/causal order for ‘and’ is not the result of a process that is governed by a semantic convention. Linguistic pragmatists generalize this claim and argue that what is true of expressions like ‘tall’, ‘ready’, ‘it rains’, ‘everybody’, and ‘and’, is true of nearly all expressions in natural languages. According to linguistic pragmatists, semantic conventions provide at most propositional schemata—propositional radicals—that lack determinate truth-conditions.
The indexicalists’ strategy for resisting the call for a new theoretical approach raised by linguistic pragmatists is to enlarge both the range of indexicality, thought of as the result of linguistically governed processes of saturation, and the range of polysemy. As Michael Devitt puts it, there is more linguistically governed context-sensitivity and polysemy in our language than linguistic pragmatists think. Indexicalists try to explain examples like (13)-(16) by conventions of saturation: It is by linguistic conventions codified in language that people use ‘tall’ having in mind a class of comparison, ‘ready’ a course of action, ‘it rains’ a location, and ‘everyone’ a domain of quantification. Some indexicalists explain examples like (17) by polysemy: ‘And’ is a polysemous word having multiple meanings, one for the truth-functional conjunction and one for the temporally/causally ordered conjunction.
Indexicalism too comprises a family of theories, and there are deep and fundamental differences among them. As said, on an orthodox semantic theory the meaning of a context-sensitive expression sets up the parameters, or slots, that must be loaded with contextual values. Sometimes the parameters are explicitly expressed in the sentence, as with indexicals. Sometimes, instead, the parameters do not figure at the level of surface syntax. Philosophers and linguists disagree on where the parameters that do not show up at the level of surface syntax are hidden. Some (Stanley 2005a, Stanley and Williamson 1995, Szabo 2001, Szabo 2006) hold that such parameters are associated with elements that occur in the logical form. Taylor (2003) advances a different theory and argues that hidden parameters are represented in the syntactic basement of the lexicon. They are constituents not of sentences but of words. On Taylor’s view, the lexical representations of words specify the parameters that must be filled in with contextual values in order for utterances of sentences to have determinate truth-conditions. In a different version of indexicalism, some authors (Rothschild and Segal 2009) argue that the expressions that are regularly used to express different contents in different contexts ought to be treated as ordinary context-sensitive expressions and added to the Basic Set.
What all indexicalist theories have in common is the view that the variability of contents in contexts is always linguistically governed by conventional meanings of expressions. In all versions of indexicalism the phenomenon of semantic underdetermination is denied: The presence of each component of the content of an utterance of a sentence is mandatorily governed by a linguistic element occurring in the sentence either at the level of surface syntax or at the level of logical form.
b. Two Objections to Linguistic Pragmatism: Overgeneration and Normativity
There are two connected motivations that underlie the indexicalists’ defence of the orthodox view. One is a problem with overgeneration, the other is a problem with the normativity of meaning.
Linguistic pragmatists aim at keeping in place the distinctions among the level of linguistic meaning, the level of the contents of utterances, and the level of what speakers communicate indirectly by means of implicatures. To this end, linguistic pragmatists need a principled way to distinguish the contents of utterances (Sperber and Wilson’s and Carston’s explicatures, Bach’s implicitures, Recanati’s pragmatic truth-conditions) from implicatures. The canonical definition of explicature—and from now on this article adopts the term ‘explicature’ for pragmatically enriched contents of utterances—is the following:
An explicature is a pragmatically inferred development of logical form, where implicatures are held to be purely pragmatically inferred—that is, unconstrained by logical form.
The difficulty arises because explicatures are taken to be pragmatic developments of logical forms, but not all pragmatic developments of logical forms count as explicatures. Linguistic pragmatists need to keep developments of logical forms that are explicatures apart from developments of logical forms that are not. Since explicatures result from pragmatic processes that are not linguistically driven, a problem of overgeneration arises. As Stanley points out, if explicatures are linguistically unconstrained, then there is no explanation of why an utterance of sentence (18) can never have the same content as an utterance of sentence (19), or why an utterance of sentence (20) can have the same content as an utterance of sentence (21) but never the same content as an utterance of sentence (22):
(18) Everyone likes Sally.
(19) Everyone likes Sally and her mother.
(20) Every Frenchman is seated.
(21) Every Frenchman in the classroom is seated.
(22) Every Frenchman or Dutchman in the classroom is seated.
Carston and Hall (2012) try to answer Stanley’s objection of overgeneration from within the camp of linguistic pragmatists. For an assessment and criticism of their attempts, see Borg (2016). However, the point of Stanley’s objection of overgeneration is clear: Once pragmatic processes are allowed to contribute to direct contents of utterances in ways that are not linguistically governed by conventional meanings, it is difficult to draw the distinction between what speakers directly say and what they indirectly convey, so that the distinction between explicatures and implicatures collapses.
The other objection against linguistic pragmatism concerns the normativity of meaning. According to indexicalists, the explanation of contents of utterances supplied by semantics in the orthodox approach is superior to the explanation supplied by linguistic pragmatism because the former accounts for the normative aspect of meaning while the latter does not. Normativity is constitutive of the notion of meaning. If there are meanings, there must be such things as going right and going wrong with the use of language. The use of an expression is right if it conforms with its meaning, and wrong otherwise. If literal contents of speech acts are thought of in truth-conditional terms, conformity with meaning amounts to constraints on truth-conditions. In cases of expressions with one meaning the speaker undertakes the semantic burden of using them for expressing their conventional semantic values. In cases of polysemy the speaker undertakes the semantic burden of selecting a convention that fixes a determinate contribution to the truth-conditional contents expressed by utterances of sentences. In cases of expressions governed by conventions of saturation, the speaker undertakes the semantic burden of loading the demanded parameters with contextual values. Whenever the speaker fulfils these semantic burdens, she goes right with her use of language; otherwise she goes wrong, unless she is speaking figuratively. As said above, the speaker who utters sentences (13)-(16) undertakes the semantic burden of loading a comparison class for ‘tall’, a course of action for ‘ready’, a location for ‘it rains’, a restricted domain of quantification for ‘everybody’. And a speaker who utters (17) undertakes the semantic burden of selecting the convention for ‘and’ that fixes the truth-functional conjunction or the convention that fixes the temporally/causally ordered conjunction.
Indexicalists say that the problem for linguistic pragmatism is to provide an account of how the meanings of expressions constrain truth-conditional contents of utterances, if the composition of truth-conditions is not governed by linguistic conventions, and how, lacking such an explanation, linguistic pragmatism can preserve the distinction between going right and going wrong with the use of language.
The remainder of this section gives a short illustration of the version of indexicalism that tries to explain the variability of contents in contexts by adding hidden variables in the logical form of sentences. The next two sections introduce some technicalities, and the reader who is content with a general introduction to context-sensitivity can skip to section 4.
c. Hidden Variables and the Binding Argument
Some indexicalists (Stanley, Szabo, Williamson) reinstate the Gricean syntactic constraint, rejected by linguistic pragmatists, at the level of logical form. They maintain that every pragmatic process that contributes to the determination of the truth-conditional content of a speech act is a process of saturation that is always activated by the linguistic meaning of an expression. If there is no trace of such an expression in the surface syntactic structure, then there must be an element in the logical form that triggers a saturation process. The variables in the logical form work as indexicals that require contextual assignments of values. The pragmatic processes that assign the values of those variables are governed by linguistic rules; they are not optional.
Here are some examples, with some simplifications, given that a correct rendering of the logical form would require more technicalities. Suppose that, while on the phone to Mary on 25 November 2019, answering a question about the weather in London, George says:
(15) It’s raining.
People tend to agree that George said that it is raining in London on that date. Linguistic pragmatists concede that the reference to the day is due to the present tense of the verb, which works as an indexical expression that refers to the time of the utterance. However, the reference to the place, the city of London, is given by free enrichment. For linguistic pragmatists (15) can be represented as follows:
(15*) It’s raining (t).
The variable ‘t’ corresponds to the present tense of the verb. In the logical form there is no variable taking London as value. On the contrary, indexicalists claim that (15) can be represented as follows:
(15**) It’s raining (t, l).
In (15**) the variable ‘l’ takes London as a value. The process that assigns London to the variable ‘l’ is of the same kind as the process that assigns a referent to the indexical ‘here’ and it is linguistically driven because it is activated by an element of the logical form.
The variables that indexicalists insert in logical forms have a more complex structure. In (15**) the variable ‘t’ has the structure ‘ƒ(x)’ and the variable ‘l’ has the structure ‘ƒ*(y)’. ‘x’ is a variable that takes contextually salient entities as values and ‘ƒ’ is a variable that ranges over functions from entities to temporal intervals. The variable ‘y’ also takes contextually salient entities as values, and ‘ƒ*’ ranges over functions from entities to locations. The reason for this complexity will be explained below. For now, it suffices to note that in simple cases like (15**), ‘x’ takes instants as values and ‘ƒ’ takes the identity function, so that ƒ(x) = x. Likewise, ‘y’ takes locations as values and ‘ƒ*’ takes the identity function, so that ƒ*(y) = y.
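A minimal sketch may help to fix ideas about (15**). The Python fragment below is an invented toy model, not the indexicalists’ formalism: the covert variables are rendered as function-argument pairs and, as in the simple case just described, both ‘ƒ’ and ‘ƒ*’ are the identity function.

```python
import datetime

RAIN_FACTS = {(datetime.date(2019, 11, 25), "London")}  # invented data

def rains(t, l):
    """Toy extension of 'it rains' at a time-location pair."""
    return (t, l) in RAIN_FACTS

identity = lambda v: v

f, x = identity, datetime.date(2019, 11, 25)  # 't' has the form f(x)
f_star, y = identity, "London"                # 'l' has the form f*(y)

print(rains(f(x), f_star(y)))  # True: it is raining in London on that day
```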
Here is another example. Consider Mark, the player whose coach makes the following assertion:
(5) Mark is short.
The coach said that Mark is short with respect to the average height of basketball players. Indexicalists explain this case by inserting a variable in (5):
(5*) Mark is short (h).
‘h’ is a variable that takes standards of height as values. The variable ‘h’ too has a structure of the kind ‘ƒ(x)’, where ‘x’ ranges over contextually salient entities (for example, the set of basketball players) and ‘ƒ’ over functions that map the salient entities onto standards of height (for instance, the function mapping the set of basketball players onto the average height of its members).
Here is an example with quantifiers. Consider the following sentence, asserted by the professor of logic:
(8) All students got an A.
The professor said that all students that took the logic class got an A. Indexicalists claim that in the quantifier ‘all students’ there is a variable whose value restricts the domain of quantification:
(8*) [all x: student x ∧ x ∈ ƒ(y)] (x got an A).
In this example the value of the variable ‘y’ is the professor of logic and the value of ‘ƒ’ is a function that maps y onto the set of students who took the logic class taught by y. This set becomes the domain of the quantifier ‘all students’.
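The role of the hidden domain variable in (8*) can likewise be given a toy rendering. In the following sketch the data and names are invented: ‘ƒ’ maps the contextually salient value of ‘y’ (the professor) onto the set of students in her logic class, which then serves as the restricted domain of the quantifier.

```python
# Invented toy data: each student with their course and grade.
students = {"Ann": ("logic", "A"), "Bob": ("logic", "A"),
            "Carl": ("ethics", "B")}

y = "the logic professor"  # contextually salient entity
# 'f' maps the professor onto the set of students in her logic class.
f = lambda prof: {s for s, (course, _) in students.items() if course == "logic"}

domain = f(y)  # restricted domain contributed by the covert variable
print(all(students[s][1] == "A" for s in domain))  # True on the restricted reading
```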
Stanley and Szabo present a strategy for justifying the insertion of hidden variables in logical forms, the so-called binding argument: to show that an element of the truth-conditional content of an utterance of a sentence is the result of a process of saturation, it is enough to show that it can vary in accordance with the values of a variable bound by a quantifier.
Consider the following sentence:
(23) Whenever Bob lights a cigarette, it rains.
An interpretation of (23) is the following: Whenever Bob lights a cigarette, it rains where Bob lights it. In this interpretation, the location where it rains varies in relation to the time when Bob lights a cigarette. Therefore, the value of the variable ‘l’ in ‘it rains (t, l)’ depends on the value of the variable ‘t’ that is bound by a quantifier that ranges over times. This interpretation can be obtained if (23) is represented as follows:
(23*) [every t: temporal interval t ∧ Bob lights a cigarette at t] (it rains (ƒ(t), ƒ*(t))).
The value of ‘ƒ’ is the identity function so that ƒ(t) = t, and the value of ‘ƒ*’ is a function that assigns to the time that is the value of ‘t’ the location where Bob lights a cigarette at that time.
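The bound reading of (23*) can be given the same kind of toy rendering. In the sketch below, which uses invented data, the location argument is not fixed once and for all by the context but co-varies with the quantified time through ‘ƒ*’.

```python
# Invented toy data: where Bob lights a cigarette at each time, and
# the time-location pairs at which it rains.
lighting_location = {"t1": "Paris", "t2": "Rome"}
RAIN_FACTS = {("t1", "Paris"), ("t2", "Rome")}

f = lambda t: t                          # identity on times
f_star = lambda t: lighting_location[t]  # where Bob lights up at t

# (23*): for every lighting time t, it rains at (f(t), f*(t)).
print(all((f(t), f_star(t)) in RAIN_FACTS for t in lighting_location))  # True
```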
d. Objections to the Binding Argument
Some philosophers (Cappelen and Lepore 2002, Breheny 2004) raise an objection of overgeneration against the binding argument. In their view, the binding argument forces the introduction of too many hidden variables, even when there is no need for them. The strongest objection against the binding argument has been raised by Recanati (2004: 110), who argues that the binding argument is fallacious. Recanati summarizes the binding argument as follows:
1. Linguistic pragmatism maintains that in ‘it rains’ the implicit reference to the location is the result of a process of modulation that does not require any covert variable.
2. In the sentence ‘whenever Bob lights a cigarette, it rains’, the reference to the location varies according to the value of the variable bound by the quantifier ‘whenever Bob lights a cigarette’.
3. There can be no binding without a variable in the logical form.
4. In the logical form of ‘it rains’ there is a variable for locations, although phonologically not realized.
Therefore:
5. Linguistic pragmatism is wrong: In ‘it rains’, the reference to the location is mandatory, because it is articulated in the logical form.
Recanati argues that this argument is fallacious because of an ambiguity in premise 4, where the sentence ‘it rains’ can be intended either in isolation or as a part of compound phrases. According to Recanati, the sentence ‘it rains’ contains a covert variable when it occurs as a part of the compound sentence ‘whenever Bob lights a cigarette, it rains’, but it does not contain any variable when it occurs alone.
Recanati proposes a theory that admits that binding requires variables in the logical form while rejecting indexicalism. Recanati makes use of expressions that modify predicates. Given an n-place predicate, a modifier can form an n+1 place or an n-1 place predicate. A modifier expresses a function from properties/relations to other properties/relations. For example, Recanati says that ‘it rains’ expresses the property of raining, which is predicated of temporal intervals. Expressions like ‘at’, ‘in’, and so forth, transform the predicate ‘it rains (t)’ from a one-place predicate into a two-place predicate: ‘it rains (t, l)’. Expressions like ‘here’ or ‘in London’ are special modifiers that transform the predicate ‘it rains’ from a one-place predicate into a two-place predicate and also provide a value for the new argument place. Recanati argues that expressions like ‘whenever Bob lights a cigarette’ are modifiers of the same kind as ‘here’ and ‘in London’. They change the number of predicate places and provide a value for the new argument through the value of the variable they bind. Recanati’s conclusion is that although binding requires variables in the logical form of compound sentences, there is no need to insert covert variables in sub-sentential expressions or sentences in isolation.
The next section presents a different approach to semantics, one that distinguishes between semantic contents and speech act contents.
4. Defending the Autonomy of Semantics: Minimalism
a. Distinguishing Semantic Content from Speech Act Content
Indexicalists and linguistic pragmatists share the view that the goal of semantics is to explain the explicit contents of speech acts performed by utterances of sentences. They both agree that there must be a close explanatory connection between the meaning encoded in a sentence S and the contents of speech acts performed by utterances of S. One important corollary of this conception is that if a sentence S is systematically uttered to perform speech acts with different contents at different contexts, this phenomenon calls for an explanation on the part of semantics. The point of disagreement between indexicalists and linguistic pragmatists is that the former think that semantics can provide such an explanation while the latter think that semantics alone is not sufficient and a new theoretical model is needed, one in which pragmatic processes, semantically unconstrained, contribute to determine the contents of speech acts. As said above, indexicalists explain the variability of contents in contexts in terms of context-sensitivity by enlarging the range of indexicality and polysemy, whereas linguistic pragmatists explain it in terms of semantic underdetermination. The debate between indexicalists and linguistic pragmatists starts by taking for granted the explanatory connection between semantics and contents of speech acts.
Minimalism in semantics is a family of theories that reject the explanatory connection between semantics and contents of speech acts. Minimalists (Borg 2004, 2012, Cappelen and Lepore 2005, Soames 2002) maintain that semantics is not in the business of explaining the contents of speech acts performed by utterances of sentences. Minimalists work with a notion of semantic content that does not play the role of speech act content. According to them the semantic content of a sentence is a full truth-conditional content that is obtained compositionally from the syntactic structure of the sentence and the semantic values of the expressions in the sentence, which are fixed by conventional meanings. Moreover, they claim that the Basic Set of genuinely context-sensitive expressions, which are governed by conventions of saturation, comprises only overt indexicals (pronouns, demonstratives, tense markers, and a few other words). Minimalists call the semantic content of a sentence its minimal content.
The above statement that minimal contents are not contents of speech acts requires qualification. Indeed, Cappelen and Lepore argue for speech act pluralism: speech acts have a plurality of contents, and the minimal content of a sentence is always one of the many contents that its utterances express. In order to protect speech act pluralism from the objection that speakers are very often unaware of having made an assertion with the minimal content and, if asked, would deny having made one, Cappelen and Lepore argue that speakers can sincerely assert a content without believing it and without knowing they have asserted it. For example, if Mary looks into the refrigerator and says ‘there are no beers’, Mary herself would deny that she asserted that there are no beers in existence and deny that she believes that there are no beers in existence, although that there are no beers in existence is the minimal content that the sentence ‘there are no beers’ semantically expresses.
The main line of the minimalists’ attack on indexicalism and linguistic pragmatism is methodological. Minimalists argue that both indexicalists and linguistic pragmatists adhere to the methodological principle that says that a semantic theory is adequate just in case it accounts for the intuitions people have about what speakers say, assert, claim, and state by uttering sentences. Minimalists claim that this principle is mistaken just because it conflates semantic contents and contents of speech acts. Semantics is the study of the semantic values of the lexical items and their contribution to the semantic contents of complex expressions. Contents of speech acts, instead, are contents that can be used to describe what people say by uttering sentences in particular contexts of utterance.
b. Rebutting the Arguments for Linguistic Pragmatism
Minimalists dismiss context-shifting arguments and inappropriateness arguments just on the grounds that they conflate intuitions about semantic contents of sentences and intuitions about contents of speech acts. Incompleteness arguments are a subtler matter and require more articulated responses. Cappelen and Lepore’s (2005) response and Borg’s (2012) response are presented below. An incompleteness argument aims at showing that there is no invariant content that a sentence S expresses in all contexts of utterance. For example, with respect to:
(14) Mary is ready,
an incompleteness argument starts from the observation that if (14) is taken separately from contextual information specifying what Mary is ready for, people are unable to evaluate it as true or false. This evidence leads to the conclusion that there is no minimal content—that Mary is ready (simpliciter)—that is invariant and semantically expressed by (14) in all contexts of utterance. In general, then, the conclusions of incompleteness arguments are that minimal contents do not exist: without pragmatic processes, many sentences in our language do not express full propositional contents with determinate truth-conditions.
Cappelen and Lepore accept the premises of incompleteness arguments, that is, that people are unable to truth-evaluate certain sentences, but they argue that from these premises it does not follow that minimal contents do not exist. Borg adopts a different strategy. Borg tries to block incompleteness arguments by rejecting their premises and explaining away people’s inability to truth-evaluate certain sentences.
Cappelen and Lepore raise the objection that incompleteness arguments try to establish metaphysical conclusions, for example about the existence of the property of being ready (simpliciter) as a building block of the minimal content that Mary is ready (simpliciter), from premises that concern psychological facts regarding people’s ability to evaluate sentences as true or false. They point out that psychological data are not relevant in metaphysical matters. The data about people’s dispositions to evaluate sentences might reveal important facts about psychology and communication but have no weight at all in metaphysics. Cappelen and Lepore say that people’s inability to evaluate sentences like (14) as true or false independently of contextual information does not provide evidence against the claim that the property of being ready exists and is the semantic content of the adjective ‘ready’. On the one hand, they acknowledge that the problem of giving the analysis of the property of being ready is a difficult one, but it is for metaphysicians, not for semanticists. On the other hand, they argue that semanticists have no difficulty at all in stating what invariant minimal content is semantically encoded in (14). Sentence (14) semantically expresses the minimal content that Mary is ready. There is no difficulty in determining its truth-conditions either: ‘Mary is ready’ is true if and only if Mary is ready.
Cappelen and Lepore address the immediate objection that if the truth-condition of (14) is represented by a disquotational principle like the one reported above, then nobody is able to verify whether such truth-condition is satisfied or not. This fact is witnessed by people’s inability to evaluate (14) as true or false independently of information specifying what Mary is ready for. Cappelen and Lepore respond that it is not a task for semantics to ascertain how things are in the world. For example, it is not a task for semantics to say whether (14) is true or false. That a semantic theory for a language L does not provide speakers with a method of verifying sentences of L is not a defect of that semantic theory. Cappelen and Lepore say that those theorists who think otherwise indulge in verificationism. For an objection to Cappelen and Lepore see Recanati (2004), Clapp (2007), Penco and Vignolo (2019).
In Pursuing Meaning Borg offers a different strategy for blocking incompleteness arguments. Borg’s strategy is to explain away the intuitions of incompleteness. Borg agrees that speakers have intuitions of incompleteness with respect to sentences like ‘Mary is ready’, but she argues that intuitions of incompleteness emerge from some overlooked covert and context-insensitive syntactic structure. Borg says that ‘ready’ is lexically marked as an expression with two argument places. On Borg’s view ‘ready’ always denotes the same relation, the relation of readiness, which holds between a subject and the thing for which they are held to be ready. When only one argument place is filled at the surface level, the other is marked by an existentially bound variable in the logical form. Thereby ‘ready’ makes exactly the same contribution in any context of utterance to any propositional content literally expressed. For example, Borg says that in a context where what is salient is the property of being ready to join the fire service, the sentence ‘Mary is ready’ literally expresses the minimal content that Mary is ready for something, not that Mary is ready to join the fire service. As Borg points out, the minimal content that Mary is ready for something is almost trivially true. Yet, Borg warns against conflating intuitions about the informativeness of a propositional content with intuitions about its semantic completeness.
Borg’s explanation of the intuitions of incompleteness is that speakers are aware of the need for the two arguments, which is in tension with the phonetic delivery of only one argument. Speakers find it hard to truth-evaluate sentences like ‘Mary is ready’ not because the sentence is semantically incomplete and lacks a truth-condition, but because their expectation that the second argument be expressed is frustrated and the minimal content that is semantically expressed, when the argument role corresponding to the direct object is not filled at the surface level, is barely informative. For a critical assessment of Borg’s strategy, see Clapp (2007) and Penco and Vignolo (2019).
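Borg’s proposal can also be illustrated schematically. In the following sketch, which uses an invented domain of activities, ‘ready’ is modeled as a two-place relation and the unvoiced argument place is existentially closed, yielding the nearly trivial minimal content that Mary is ready for something.

```python
# Invented toy data for the two-place relation of readiness.
READY_FACTS = {("Mary", "sit the logic exam")}
activities = ["climb Eiger's North Face", "sit the logic exam"]

def ready(person: str, activity: str) -> bool:
    return (person, activity) in READY_FACTS

# 'Mary is ready' on Borg's analysis: the second argument place is
# existentially bound, so the minimal content is that Mary is ready
# for something.
print(any(ready("Mary", a) for a in activities))  # True, and barely informative
```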
The following subsection illustrates the tenets that characterise minimalism and the central motivation for it.
c. Motivation and Tenets of Minimalism
Minimalism is characterised by four main theses (Borg 2007) and one main motivation. The first thesis is propositionalism. Propositionalism states that sentence types, relativized to indexes representing contexts of utterance, express full propositional contents with determinate truth-conditions. These semantic contents are the minimal ones, which are invariant through contexts of utterance when sentence types do not contain overt context-sensitive expressions. Propositionalism distinguishes minimalism from radical minimalism, a philosophical view defended by Bach (2007). Bach acknowledges the existence of semantic contents of sentence types, but he rejects the view that such contents are always fully propositional with determinate truth-conditions. According to Bach, most semantic contents are propositional radicals. As Borg points out, although Bach insists that he is not a linguistic pragmatist, it is not easy to spot substantial differences between Bach’s view and linguistic pragmatism. Although Bach’s semantically incomplete sentences are not context-sensitive unless they contain overt context-sensitive expressions, linguistic pragmatists need not deny that semantic theories are possible. They simply maintain that in most cases semantic theories deal with sub-propositional contents. Bach and linguistic pragmatists agree that, in many if not most cases, in order to reach full propositional contents theorists need to focus on speech acts and not on sentence types.
The second important thesis of minimalism is the Basic Set assumption. The Basic Set assumption states that the only genuine context-sensitive expressions that trigger and drive pragmatic processes for the determination of semantic values are those that are listed in the Basic Set, that is, overt indexicals like ‘I’, ‘here’, ‘now’, ‘that’, plus or minus a bit. Expressions like ‘ready’, ‘tall’, ‘green’, quantified noun phrases, and so on, are not context-sensitive.
The third tenet of minimalism is the distinction between semantic contents and speech act contents: Semantic contents are not what speakers intend to explicitly and directly communicate. The contents explicitly communicated are pragmatic developments of semantic contents. As said, this move serves to disarm batteries of arguments advanced by indexicalists and linguistic pragmatists. Even if in almost all cases semantic contents are not the contents of speech acts, they nonetheless play an important theoretical role in communication. Semantic contents are fallback contents that people are able to understand on the sole basis of their linguistic competence when they ignore or mistake the intentions of the speakers and the contextual information needed for understanding what speakers are trying to communicate. Minimal contents can play this role in communication just because they can be grasped simply in virtue of linguistic competence alone.
The fourth and last thesis of minimalism is a commitment to formalism. Formalism is the view that the processes that compute the truth-conditional contents of sentence types are entirely formal and computational. There is an algorithmic route to the semantic content of each sentence (relative to an index representing contextual features), and all contextual contributions to semantic contents are formally tractable. More precisely, all those contextual contributions that depend on speakers’ intentions must be kept apart from semantic contents. This last claim puts a further constraint on context-sensitive expressions, which ought to be responsive only to objective aspects of contexts of utterance, like who is speaking, when, and where. These are the features that Bach (1994, 1999, 2001) and Perry (2001) termed narrow features of context; narrow features play a semantic role, as opposed to wide features, which depend on speakers’ intentions and play a pragmatic role. This claim also relates to Kaplan’s distinction between pure (automatic) indexicals, which refer semantically by picking out objective features of the context of utterance, and intentional indexicals, which refer pragmatically in virtue of intentions of speakers (Kaplan 1989a, Perry 2001).
Formalism is related to one of the main motivations for minimalism. Minimalism is compatible with a modular account of meaning understanding. The modularity theory of mind is the view that the mind is constituted of discrete and relatively autonomous modules, each of which is dedicated to the computation of particular cognitive functions. A module possesses a specific body of information and specific rules working computationally on that body of information. Among such modules there is one, the module of the faculty of language, which is dedicated to the comprehension of literal contents of sentences. This module includes phonetic/orthographic information and related rules, syntactic information and related rules, and semantic information and related rules.
A minimalist semantics fits well as part of the language module since it derives the truth-conditional contents of sentences, relative to indexes, in a computational way operating on representations of semantic properties of the lexicon and with formal rules working on such representations. Thus, if linguistic comprehension is modular, minimalism offers a theory that is consistent with the existence of the language module.
The following data are often invoked as evidence that linguistic comprehension is modular. The understanding of literal meanings of sentences seems to be the result of domain-specific and encapsulated processes. Linguistically competent people understand the literal meaning of a sentence even when they ignore salient aspects of the context of utterance and the communicative intentions of the speaker. Moreover, the understanding of literal meaning is carried out independently of any sort of encyclopaedic information. The processes that yield literal truth-conditional contents of sentences are mandatory, very fast, and mostly unavailable to consciousness. People cannot help reading certain signs and hearing certain sounds as utterances of sentences in languages they understand. Competent speakers interpret straightforwardly and very quickly those signs and sounds as sentences with literal contents without being aware of the information and the rules operating on it that yield such an understanding. Finally, linguistic understanding is associated with localized neuronal structures that undergo regularities in acquisition and development, and regularities of breakdown due to neuronal damage. In conclusion, for those who believe that this is good evidence that comprehension of literal meaning is modular, minimalism offers a semantic theory that can be coherently taken to be part of the language faculty module.
The presentation of minimalism closes with the discussion of the tests that Cappelen and Lepore propose in order to select the only context-sensitive expressions that go into the Basic Set. The following subsection contains some technicalities. The reader who is mainly interested in an overview of context-sensitivity can skip to section 5.
d. Testing Context-Sensitivity
Cappelen and Lepore propose different tests for distinguishing the expressions in the Basic Set that are genuinely context-sensitive from those that are not. Here only one of their tests is illustrated, but it is sufficient to give a hint of their work.
Test of inter-contextual disquotational indirect reports: Suppose that Anna, who had planned to climb Eiger’s North Face on July 1 but cancelled, utters the following sentence on July 2:
(24) Yesterday I was not ready.
Suppose that on July 3 Mary indirectly reports what Anna said on July 2. Mary cannot use the same words as Anna used. If she did, she would make the following report:
(25) Anna said that yesterday I was not ready.
From this example it is clear that context-sensitive expressions like ‘I’ and ‘yesterday’ generate inter-contextual disquotational indirect reports that are false or inappropriate.
Cappelen and Lepore say that it is possible to make inter-contextual disquotational indirect reports with the adjective ‘ready’, and this fact provides evidence that ‘ready’ is not context-sensitive. Assume that on July 5 Mary utters the following sentence:
(26) On July 1 Anna was not ready.
Then, on July 6 George might report what Mary said with the utterance of the following sentence:
(27) Mary said that on July 1 Anna was not ready.
These results generalize to all expressions that do not belong to the Basic Set.
Another case is the following. Suppose Mary utters ‘Anna is ready’ in a context C1 to say that Anna is ready to climb Eiger’s North Face and makes a second utterance of it in a context C2 to say that Anna is ready to sit her logic exam. Cappelen and Lepore argue that in a context C3 the following reports are true:
(28) Mary said that Anna is ready (with respect to the utterance in C1).
(29) Mary said that Anna is ready (with respect to the utterance in C2).
(30) In C1 and C2 Mary said that Anna is ready.
Cappelen and Lepore say that linguistic pragmatism and indexicalism have difficulty explaining the truth of the above inter-contextual disquotational indirect reports. It is not obvious, however, that the difficulty Cappelen and Lepore propose is insurmountable. The context C3 might differ from C1 and C2 because the speaker, the time, and the place of the utterance are different, but the same contextual information might be available in C3 and be relevant for the interpretation of the utterance in C1 or C2. In C3 the speaker (and the audience too) might be aware that Mary was talking about alpinism in C1 and of logic exams in C2.
According to a suggestion by Stanley (2005b), and Cappelen and Hawthorne (2009), sentence (30) might be represented as follows:
(30*) C1 and C2 λx (in x Mary said that Anna is ready_ƒ(x)).
Here the variable ‘x’ takes contexts as values and the variable ‘ƒ’ takes a function that maps contexts to kinds of actions or activities salient in those contexts. This analysis yields the interpretation that the report (30) is true if and only if in C1 Mary said that Anna is ready to climb Eiger’s North Face and in C2 Mary said that Anna is ready to take her logic exam. On the other hand, if one supposes that the speaker in C3 has the erroneous belief that Mary was talking about Anna’s readiness to go out with friends, linguistic pragmatists and indexicalists will doubt the truth of the reports (28)-(30) and reduce the debate to a conflict of intuitions.
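A toy rendering of (30*) may clarify how the covert function is supposed to work. In the sketch below, with invented data, ‘ƒ’ maps each reported context onto the activity salient in it, and the collective report comes out true just in case Mary’s utterance in each context concerned the activity salient there.

```python
# Invented toy data: the activity salient in each context, and what
# Mary said (about Anna's readiness) in each context.
salient_activity = {"C1": "climb Eiger's North Face",
                    "C2": "sit the logic exam"}
SAID = {("C1", "climb Eiger's North Face"), ("C2", "sit the logic exam")}

f = lambda ctx: salient_activity[ctx]  # the covert function in (30*)

# (30) is true iff, in each context, Mary said that Anna is ready
# for the activity that f assigns to that context.
print(all((c, f(c)) in SAID for c in ("C1", "C2")))  # True
```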
The test of inter-contextual disquotational indirect reports and the other tests that Cappelen and Lepore present, such as the test of inter-contextual disquotation and the test of collective descriptions, raised an intense debate. For critical assessments of these tests, see Leslie (2007) and Taylor (2007). Cappelen and Hawthorne (2009) present the test of agreement, while Donaldson and Lepore (2012) add the test of collective reports. Limits of space prevent a more detailed treatment of the debate on tests for context-sensitivity. The foregoing suffices to give an idea of the kind of arguments that philosophers involved in that debate deal with.
While minimalism is a strong alternative to linguistic pragmatism and indexicalism, another approach develops the idea of invariant semantic contents in a new way: relativism. The next section presents relativism, which reconceptualises the relations between meaning and context.
5. Relativism
a. Indexicality, Context-Sensitivity, and Assessment-Sensitivity
Relativism in semantics provides a new conceptualization of context dependence. Relativists (Kolbel 2002, MacFarlane 2014, Richard 2008) recover invariant semantic contents and explain some forms of context dependence not in terms of variability of contents in contexts of utterance but in terms of variability of extensions in contexts of assessment. A context of utterance is a possible situation in which a sentence might be uttered and a context of assessment is a possible situation in which a sentence might be evaluated as true or false.
As said in section 1b, Kaplan represents meanings as functions that return contents in contexts of utterance. Contents are functions that distribute extensions in circumstances of evaluation. The content of a sentence in a context of utterance is a function that returns truth-values at standard circumstances of evaluation composed of a possible world and a time. MacFarlane shows that the technical machinery of Kaplan’s semantics is apt to draw conceptual distinctions among what he calls indexicality, context-sensitivity, and assessment-sensitivity. MacFarlane’s notion of indexicality covers the standard variability of contents in contexts. His notions of context-sensitivity and assessment-sensitivity cover new semantic phenomena, according to which expressions might change extensions while maintaining the same contents. MacFarlane’s notions are defined as follows:
Indexicality:
An expression E is indexical if and only if its content at a context of utterance depends on features of the context of utterance.
Context-sensitivity:
An expression E is context-sensitive if and only if its extension at a context of utterance depends on features of the context of utterance.
Assessment-sensitivity:
An expression E is assessment-sensitive if and only if its extension at a context of utterance depends on features of a context of assessment.
For example, consider two utterances of (5): a true utterance in a conversation about basketball players and a false utterance in a conversation about the general population.
(5) Mark is short.
Indexicality: The standard account in terms of indexicality affirms that the two utterances have different contents because the adjective ‘short’ is treated as an expression that expresses different contents in different contexts of utterance. According to indexicalism, the meaning of ‘short’ demands that the speaker fill in a standard of height that is operative in the context of utterance in order to determine the content of the utterance. Thus, the speaker in the first conversation expresses a different content than that expressed in the second conversation. Since the difference in truth-values between the two utterances is explained in terms of a difference in contents, the context of utterance—in our example, the speaker’s intention to invoke different standards of height—has a content-determinative role.
Context-sensitivity: Context-sensitivity, in MacFarlane’s sense, explains the difference in truth-values in terms of a difference in the circumstance of evaluation. The circumstance of evaluation is enriched with non-standard parameters. In our example, the circumstance of evaluation is enriched with a parameter concerning the standard of height. The meaning of ‘short’ returns the same content in all contexts of utterance. The content of ‘short’ is invariant across contexts of utterance, but it returns different extensions in circumstances of evaluation that comprise a possible world, a time, and a standard of height. The standard of height that is operative in the first conversation enters the circumstance of evaluation with respect to which ‘short’ has an extension in that context of utterance. According to that standard of height, Mark does belong to the extension of ‘short’. The standard of height that is operative in the second conversation enters the circumstance of evaluation with respect to which ‘short’ has an extension in that context of utterance. According to that standard of height, Mark does not belong to the extension of ‘short’. With context-sensitivity (in MacFarlane’s sense) the context of utterance has a circumstance-determinative role, since it fixes the non-standard parameters that enter the circumstance of evaluation with respect to which expressions have extensions at the context of utterance.
Context-sensitivity so defined is not relativism. For any context of utterance, expressions have just one, if any, extension at that context. In particular, sentences in contexts have absolute truth-values. Truth for sentences in contexts is defined as follows:
A sentence S at a context of utterance i is true if and only if S is true in i_w at i_t and with respect to i_h1, …, i_hn, where i_w and i_t are the world and the time of the context of utterance i, and i_h1, …, i_hn are all the non-standard parameters, demanded by the expressions in S, which are operative in i (in the above example the standard of height demanded by ‘short’, that is, the average height of basketball players in the first context and the average height of American citizens in the second context).
On the contrary, relativism holds that the extensions of expressions at contexts of utterance are relative to contexts of assessment. So, if contexts of assessment change, extensions too might change. In particular, sentences are true or false at contexts of utterance relative to contexts of assessment. Relative truth is defined as follows:
A sentence S at a context of utterance i is true relative to a context of assessment a if and only if S is true in i_w at i_t and with respect to a_h1…a_hn, where i_w and i_t are the world and the time of the context of utterance i, and a_h1…a_hn are all the non-standard parameters, demanded by the expressions in S, that are operative in the context of assessment a.
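The contrast between the two definitions can be displayed schematically. The following sketch is illustrative only: representing contexts as dictionaries and evaluating sentence (5) with an assumed height for Mark are simplifications, not part of the definitions themselves.

```python
# A schematic contrast between truth at a context and relative truth.
# Contexts are toy dictionaries; this representation is an assumption of
# the sketch, not of the definitions.

def true_at_context(sentence_eval, i):
    """Truth simpliciter: world, time, AND the non-standard parameter
    all come from the context of utterance i."""
    return sentence_eval(i["world"], i["time"], i["param"])

def true_relative(sentence_eval, i, a):
    """Relative truth: world and time still come from the context of
    utterance i, but the non-standard parameter comes from the
    context of assessment a."""
    return sentence_eval(i["world"], i["time"], a["param"])

# Toy evaluation of (5), assuming Mark is 185 cm tall.
ev5 = lambda world, time, standard: 185 < standard

utterance  = {"world": "@", "time": "t", "param": 200}   # basketball conversation
assessment = {"world": "@", "time": "t'", "param": 175}  # a stricter assessor

print(true_at_context(ev5, utterance))            # True
print(true_relative(ev5, utterance, assessment))  # False: same utterance, different assessor
print(true_relative(ev5, utterance, utterance))   # True: with a = i, the two notions coincide
```

The last line makes vivid a point taken up below: when the context of assessment is identified with the context of utterance itself, relative truth issues the same verdicts as truth simpliciter.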
Relativism requires small revisions of the technical machinery of standard truth-conditional semantics in order to define the notion of relative truth, but it provides a radical reconceptualization of the ways in which meaning, contents, and extensions are context-dependent. Different authors apply relativism to different parts of language. MacFarlane (2014) presents a relativistic semantics for predicates of taste, knowledge attributions, epistemic modals, deontic modals, and future contingents. Kompa (2002) and Richard (2008) offer a relativist treatment of comparative adjectives like ‘short’. Predelli (2005) suggests a view close to relativism for colour words like ‘green’.
The major difficulty for relativists is not technical but conceptual. Relativism must explain what it is for a sentence at a context of utterance to be true relative to a context of assessment. The next subsection presents MacFarlane’s attempt to answer this conceptual difficulty. The final subsection discusses the case of faultless disagreement, which many advocates of relativism employ to show that it is superior to rival semantic theories.
b. The Intelligibility of Assessment-Sensitivity
Many philosophers, following Dummett, hold that our conceptual grasp of the notion of truth comes from a clarification of its role in the overall theory of language. In particular, the notion of truth has been clarified by its connection with the notion of assertion. One way to get this explication is to take the norm of truth as constitutive of assertion. The norm of truth can be stated as follows:
Norm of truth: Given a context of utterance C and a sentence S, an agent is permitted to assert that S at C only if S is true. (Remember that a sentence S at a context of utterance C is true if and only if S is true in the world of C at the time of C.)
Relativism needs to provide the explication of what it is for a sentence at a context of utterance to be true relative to a context of assessment. If the clarification of the notion of relative truth is to proceed along with its connection to the notion of assertion, what is needed is a norm of relative truth that relates the notion of assertion to the notion of relative truth. It would seem intuitive to employ the following norm of relative truth that privileges the context of utterance and selects it as the context of assessment:
Norm of relative truth: Given a context of utterance C and a sentence S, an agent is permitted to assert that S at C only if S at context C is true as assessed from context C itself.
The problem, as MacFarlane points out, is that if the adoption of the norm of relative truth is all that can be said in order to explicate the notion of relative truth, then assessment-sensitivity is an idle wheel with no substantive theoretical role. Relativism becomes a notational variant of standard truth-conditional semantics. The point is that when the definition of relative truth is combined with the norm of relative truth, which picks out the context of utterance and makes it the context of assessment, relativism has the same prescriptions for the correctness of assertions as standard truth-conditional semantics, which works with the definition of truth (simpliciter) combined with the norm of truth.
MacFarlane argues that in order to clarify the notion of relative truth, the norm of relative truth is necessary but not sufficient. In order to gain a full explication of the notion of relative truth, a norm for retraction of assertions must be added to the norm of relative truth. MacFarlane presents the norm for retraction as follows:
Norm for retraction: An agent at a context of assessment C2 must retract an assertion of the sentence S, uttered at a context of utterance C1, if S uttered at C1 is not true as assessed from C2.
Relativism together with the norm of relative truth and the norm for retraction predicts cases of retraction of assertions that other semantic theories are not able to predict. Consider the following example: Let C1 be the context of utterance consisting of John, a time t in the year 1982, and the actual world @; let C2 be the context of utterance consisting of John, a time t´ in 2019, and the actual world @. Let C3 be the context of assessment in which John’s taste in 1982 is operative and C4 the context of assessment in which John’s taste in 2019 is operative.
Suppose John did not like green tea in 1982, when he was ten years old, but he likes green tea a lot in 2019, when he is forty-seven years old. Green tea is not in the extension of ‘tasty’ at C1 as assessed from C3 but it is in the extension of ‘tasty’ at C2 as assessed from C4. Suppose John utters:
(31) ‘Green tea is not tasty’ at C1
and
(32) ‘Green tea is tasty’ at C2.
Relativism predicts that both assertions are correct. John does not violate the norm of relative truth. However, relativism also predicts that in 2019 John must retract the assertion he made in 1982, because in 1982 John uttered a sentence that is false as assessed from C4.
Notice that John’s retraction of his assertion made in 1982 is predicted only by relativism, which treats the adjective ‘tasty’ as assessment-sensitive. If ‘tasty’ is treated as an indexical expression, then John’s assertions in 1982 and in 2019 have two distinct contents, and there is no reason why in 2019 John ought to retract his assertion made in 1982, because his assertion made in 1982 is true. There is no reason why John ought to retract his assertion if ‘tasty’ is treated as a context-sensitive expression. In this case too John’s assertion made in 1982 is true, because the circumstance of evaluation of his 1982 assertion contains the taste that is operative for John in 1982. Retraction is made possible only if ‘tasty’ is assessment-sensitive, making it possible to assess an assertion made in a context of utterance with respect to parameters that are operative in another context (the context of assessment).
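The retraction prediction can be rendered as a small sketch. The toy extensions for ‘tasty’ at the two contexts of assessment are assumptions made for illustration; only the shape of the norm for retraction is drawn from the discussion above.

```python
# A sketch of the retraction prediction for the green-tea case.
# John's tastes at the two contexts of assessment are assumed toy values.

TASTY = {
    "C3": set(),          # taste operative in 1982: green tea not tasty (assumed)
    "C4": {"green tea"},  # taste operative in 2019: green tea tasty (assumed)
}

def sentence_31(assessment_context):
    """(31) 'Green tea is not tasty', evaluated with the standard of taste
    operative in the given context of assessment."""
    return "green tea" not in TASTY[assessment_context]

def must_retract(sentence, current_assessment_context):
    """Norm for retraction: an assertion must be retracted if the asserted
    sentence is not true as assessed from the current context."""
    return not sentence(current_assessment_context)

print(must_retract(sentence_31, "C3"))  # False: in 1982 the assertion stands
print(must_retract(sentence_31, "C4"))  # True: in 2019 John must retract (31)
```

Because ‘tasty’ is treated here as assessment-sensitive, the same asserted sentence receives different verdicts from different contexts of assessment, which is exactly what the norm for retraction needs in order to apply.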
c. Faultless Disagreement
Even if one accepts MacFarlane’s explanation of the intelligibility of relativism, it remains an open question whether languages contain assessment-sensitive expressions. It is important, then, to clarify whether there are linguistic phenomena that relativism explains better than linguistic pragmatism, indexicalism, or minimalism. Relativists address a representative phenomenon: faultless disagreement. In a pre-theoretic sense there is faultless disagreement between two parties when they disagree about a speech act or an attitude and neither of them violates any epistemic or constitutive norm governing speech acts or attitudes.
Faultless disagreement is very helpful for modeling disputes about non-objective matters, for instance, disputes over aesthetic values and matters of taste. Such disputes show the distinctive linguistic traits of genuine disagreement when the parties involved say ‘No, that is false’, ‘What you are saying is false’, ‘You are wrong, I disagree with you’, and so on. However, many philosophers feel compelled to avoid the account of disagreement that characterizes matters of objective fact, since in subjective areas of discourse it would impute implausible cognitive errors and chauvinism to the parties in disagreement.
First, it is important to identify what kinds of disagreement are made intelligible in different semantic theories. Then, given an area of discourse, one must ask which of these kinds of disagreement can be found in it. Thus, semantic theories can be assessed on the basis of which of them predicts the kind of disagreement that is present in that area of discourse.
By employing the notion of relative truth, MacFarlane defines the following notion of accuracy for attitudes and speech acts:
Assessment-sensitive accuracy: An attitude or speech act occurring at a context of utterance C1 is accurate, as assessed from a context of assessment C2, if and only if its content at C1 is true as assessed from C2.
Based on the notion of assessment-sensitive accuracy, MacFarlane defines the following notion of disagreement:
Preclusion of joint accuracy: Agent A disagrees with agent B if and only if the accuracy of the attitudes or speech acts of A, as assessed from a given context, precludes the accuracy of the attitudes or speech acts of B, as assessed from the same context.
There are also different senses in which an attitude or speech act can be faultless. One of them is the absence of violation of constitutive norms governing attitudes or speech acts. According to MacFarlane, the kind of faultless disagreement given by preclusion of joint accuracy together with absence of violation of constitutive norms of attitudes or speech acts is typical of disputes in non-objective matters like taste.
Consider the sentence ‘Green tea is tasty’. Relativism accommodates the idea that its truth depends on the subjective taste of the assessor. Whether green tea is tasty is not an objective state of affairs. Suppose John utters the sentence ‘Green tea is tasty’ and George utters the sentence ‘Green tea is not tasty’. John and George disagree to the extent that there is no context of assessment from which both John’s and George’s assertions are accurate, but neither of them violates the norm of relative truth or the norm for retraction. John’s assertion is accurate if assessed from John’s context of assessment, where John’s standard of taste is operative. George’s assertion is accurate if assessed from George’s context of assessment, where George’s standard of taste is operative. They are both faultless. Moreover, George will acknowledge that ‘Green tea is tasty’ is true if assessed from John’s standard of taste, and vice versa. Finally, suppose that after trying green tea several times, George starts appreciating it. George now says:
(33) Green tea is tasty.
George must retract his previous assertion and say:
(34) What I said (about green tea) is false.
Relativism predicts this pattern of linguistic uses of the adjective ‘tasty’. On the contrary, other semantic theories cannot describe the dispute between John and George as a case of faultless disagreement defined as preclusion of joint accuracy and absence of violation of constitutive norms governing attitudes/speech acts.
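The structure of the prediction can be checked with a short sketch of preclusion of joint accuracy. The two standards of taste below are assumed toy values; only the definitions of accuracy and disagreement come from the text.

```python
# A sketch of faultless disagreement as preclusion of joint accuracy.
# The two standards of taste are assumed for illustration.

STANDARDS = {"John": {"green tea"}, "George": set()}  # what counts as tasty

def accurate(assertion, assessor):
    """An assertion is accurate, as assessed from a context, iff its
    content is true relative to the standard operative in that context."""
    item, is_tasty_claimed = assertion
    return (item in STANDARDS[assessor]) == is_tasty_claimed

john_asserts   = ("green tea", True)   # 'Green tea is tasty'
george_asserts = ("green tea", False)  # 'Green tea is not tasty'

# No single context of assessment makes both assertions accurate:
print(any(accurate(john_asserts, a) and accurate(george_asserts, a)
          for a in STANDARDS))  # False: joint accuracy is precluded

# Yet each assertion is accurate from its own context, so neither party
# violates a constitutive norm: the disagreement is faultless.
print(accurate(john_asserts, "John"))      # True
print(accurate(george_asserts, "George"))  # True
```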
Linguistic pragmatism and indexicalism affirm that John’s and George’s tastes have a content-determinative role. Uttered by John, ‘tasty’ means tasty in relation to John’s standard of taste, and uttered by George it means tasty in relation to George’s standard of taste. Therefore, the sentence ‘green tea is tasty’ has a different content in John’s context of utterance than in George’s, with the consequence that disagreement is lost.
Minimalism says that the content of ‘tasty’, the objective property of tastiness, is invariant through all contexts of utterance and its extension in a given possible world is invariant through all contexts of assessment. Therefore, either green tea is in the extension of ‘tasty’ or is not. In this case, John and George are in disagreement but at least one of them is at fault.
6. References and Further Reading
a. References
Bach, Kent, 1994. ‘Conversational Impliciture’, Mind and Language, 9: 124-162.
Bach, Kent, 1999. ‘The Semantics-Pragmatics Distinction: What It Is and Why It Matters’, in K. Turner (ed.), The Semantics-Pragmatics Interface from Different Points of View, Oxford: Elsevier, pp. 65-84.
Bach, Kent, 2007. ‘The Excluded Middle: Minimal Semantics without Minimal Propositions’, Philosophy and Phenomenological Research, 73: 435-442.
Borg, Emma, 2004. Minimal Semantics, Oxford: Oxford University Press.
Borg, Emma, 2007. ‘Minimalism versus Contextualism in Semantics’, in Preyer & Peter (2007), pp. 339-359.
Borg, Emma, 2012. Pursuing Meaning, Oxford: Oxford University Press.
Borg, Emma, 2016. ‘Exploding Explicatures’, Mind and Language, 31(3): 335-355.
Breheny, Richard, 2004. ‘A Lexical Account of Implicit (Bound) Contextual Dependence’, in R. Young, and Y. Zhou (eds.), Semantics and Linguistic Theory (SALT) 13, pp. 55-72.
Cappelen, Herman, and Lepore, Ernest, 2002. ‘Indexicality, Binding, Anaphora and A Priori Truth’, Analysis, 62, 4: 271-81.
Cappelen, Herman, and Lepore, Ernest, 2005. Insensitive Semantics: A Defence of Semantic Minimalism and Speech Act Pluralism, Oxford: Blackwell.
Cappelen, Herman, and Hawthorne, John, 2009. Relativism and Monadic Truth, Oxford: Oxford University Press.
Carston, Robyn, 2002. Thoughts and Utterances: The Pragmatics of Explicit Communication, Oxford: Blackwell.
Carston, Robyn, 2009. ‘Relevance Theory: Contextualism or Pragmaticism?’, UCL Working Papers in Linguistics 21: 19-26.
Carston, Robyn, 2019. ‘Ad Hoc Concepts, Polysemy and the Lexicon’, in K. Scott, R. Carston, and B. Clark (eds.), Relevance, Pragmatics and Interpretation, Cambridge: Cambridge University Press, pp. 150-162.
Carston, Robyn, and Hall, Alison, 2012. ‘Implicature and Explicature’, in H. J. Schmid and D. Geeraerts (eds.), Cognitive Pragmatics, Vol. 4. Berlin: Mouton de Gruyter, 47–84.
Clapp, Lenny, 2007. ‘Minimal (Disagreement about) Semantics’, in Preyer & Peter (2007), pp. 251-277.
Donaldson, Tom, and Lepore, Ernest, 2012. ‘Context-Sensitivity’, in D. G. Fara and G. Russell (eds.), 2012, pp. 116-131.
Grice, Herbert Paul, 1989. Studies in the Way of Words, Cambridge, MA: Harvard University Press.
Kaplan, David, 1989a. ‘Demonstratives’, in J. Almog, J. Perry, and H. Wettstein (eds.), Themes from Kaplan, Oxford: Oxford University Press, pp. 481-563.
Kaplan, David, 1989b. ‘Afterthoughts’, in J. Almog, J. Perry, and H. Wettstein (eds.), Themes from Kaplan, Oxford: Oxford University Press, pp. 565-614.
Korta, Kepa, and Perry, John, 2007. ‘Radical Minimalism, Moderate Contextualism’, in Preyer & Peter (2007), pp. 94-111.
Kölbel, Max, 2002. Truth without Objectivity, London: Routledge.
Leslie, Sarah-Jane, 2007. ‘How and Why to be a Moderate Contextualist’, in Preyer & Peter (2007), pp. 133-168.
MacFarlane, John, 2014. Assessment Sensitivity, Oxford: Oxford University Press.
Neale, Stephen, 2004. ‘This, That, and the Other’, in A. Bezuidenhout, and M. Reimer (eds.), Descriptions and Beyond, Oxford: Oxford University Press, pp. 68-182.
Penco, Carlo, and Vignolo, Massimiliano, 2019. ‘Some Reflexions on Conventions’, Croatian Journal of Philosophy, Vol. XIX, No. 57: 375-402.
Perry, John, 2001. Reference and Reflexivity, Stanford: CSLI Publications.
Predelli, Stefano, 2005. Contexts: Meaning, Truth, and the Use of Language, Oxford: Oxford University Press.
Recanati, François, 2004. Literal Meaning, New York: Cambridge University Press.
Soames, Scott, 2002. Beyond Rigidity: The Unfinished Semantic Agenda of Naming and Necessity, Oxford: Oxford University Press.
Sperber, Dan, and Wilson, Deirdre, 1986. Relevance: Communication and Cognition, Oxford: Blackwell.
Stanley, Jason, 2005a. Language in Context, Oxford: Oxford University Press.
Stanley, Jason, 2005b. Knowledge and Practical Interests, Oxford: Oxford University Press.
Stanley, Jason, and Williamson, Timothy, 1995. ‘Quantifiers and Context-Dependence’, Analysis, 55: 291-295.
Szabo, Zoltan Gendler, 2001. ‘Adjectives in context’, in I. Kenesei, and R. Harnish (eds.), Perspectives on Semantics, Pragmatics, and Discourse, Amsterdam: John Benjamins, pp. 119-146.
Szabo, Zoltan Gendler, 2006. ‘Sensitivity Training’, Mind and Language, 21: 31-38.
Taylor, Kenneth, 2003. Reference and the Rational Mind, Stanford, CA: CSLI Publications.
Taylor, Kenneth, 2007. ‘A Little Sensitivity Goes a Long Way’, in Preyer & Peter (2007), pp. 63-92.
Travis, Charles, 2008. Occasion-Sensitivity: Selected Essays, Oxford: Oxford University Press.
Unnsteinsson, Elmar Geir, 2014. ‘Compositionality and Sandbag Semantics’, Synthese, 191: 3329–3350.
b. Further Reading
Bianchi, Claudia (ed.), 2004. The Semantic/Pragmatic Distinction, Stanford: CSLI.
A collection on context-sensitivity.
Domaneschi, Filippo, and Penco, Carlo (eds.), 2013. What is Said and What is Not, Stanford: CSLI.
A collection on context-sensitivity.
Fara, Delia Graff, and Russell, Gillian (eds.), 2012. The Routledge Companion to Philosophy of Language, New York: Routledge.
A companion to the philosophy of language that covers many of the topics that are discussed in this encyclopedia article.
García-Carpintero, Manuel, and Kölbel, Max (eds.), 2008. Relative Truth, Oxford: Oxford University Press.
A collection on relativism.
Preyer, Gerhard, and Peter, George (eds.), 2007. Context-Sensitivity and Semantic Minimalism: New Essays on Semantics and Pragmatics, Oxford: Oxford University Press.
A collection on minimalism.
Recanati, François, Stojanovic, Isidora, and Villanueva, Neftali (eds.), 2010. Context Dependence, Perspective, and Relativity, Berlin: De Gruyter.
A collection on context-sensitivity.
Szabo, Zoltan Gendler (ed.), 2004. Semantics versus Pragmatics, Oxford: Oxford University Press.
A collection on context-sensitivity.
Author Information
Carlo Penco
Email: penco@unige.it
University of Genoa
Italy
Although there is no canonical view of “Constructivism” within analytic metaphysics, here is a good starting definition:
Constructivism: Some existing entities are constructed by us in that they depend substantively on us.
Constructivism is a broad view with many, more specific, iterations. Versions of Constructivism will vary depending on who does the constructing (for example, all humans, an ideal subject, certain groups). It will also vary depending on what is constructed (for example, concrete objects, abstract objects, facts), and what the constructed entity is constructed out of (for example, natural objects, nonmodal stuff, concepts). Most Constructivists take the constructing relation to be constitutive, that is, it is part of the very nature of constructed objects that they depend substantively on humans. Some, however, take the constructing relation to be merely causal. Some versions of Constructivism are relativistic; others are not. Another key difference between versions of Constructivism concerns whether they take the constructing relation to be global in scope (so everything, or at least every object we have epistemic access to, is a constructed object) or local (so there are unconstructed objects, as well as constructed ones).
Given the many dimensions along which versions of Constructivism differ, one might wonder what unites them—what, that is, do all versions of Constructivism have in common that marks them out as versions of Constructivism? Constructivists are united first in their opposition to certain forms of Realism—namely, those that claim that x exists and is suitably independent of us. Constructivists about x agree that x exists, but they deny that it is suitably independent of us. Constructivism is distinguished from other versions of anti-Realism by the emphasis it places on the constructing relation. Constructivists are united by all being anti-Realists about x and by believing this is due to x’s being, in some way, constructed by us.
There is no canonical definition of “Constructivism” within philosophy. The following, however, can serve as a good starting point definition for understanding constructivism:
Constructivism: Some extant entities are constructed by us in that they depend substantively on us. (Exactly what it is for an entity to “depend substantively on us” varies between views.)
Constructivism can be further elucidated by noting that constructing is a three-place relation Cxyz (x constructs y out of z) which involves a constructor x (generally taken to be humans), a constructed entity y, and a more basic entity z which serves as a building block for the constructed entity. (Some would take constructing to be a four-place relation Cxyzt, that is, x constructs y out of z at time t. To simplify, the time variable is left out of the relation. It is straightforward to add it in.) Each of the terms that are related is examined below before the examination of the constructing relation itself.
Regarding x, who does the constructing? There is no orthodox view regarding which humans do the constructing; different constructivists give different answers. Constructivists frequently (though not always) emphasize the role language and concepts play in constructing entities. Since language and concepts both arise at the level of the group, rather than the level of the individual, it is generally the group (for example, of language speakers or concept users) rather than the individual which is taken to be the constructor. (Lynne Rudder Baker, for example, is typical of Constructivists when she argues that constructed objects rely on our societal conventions as a whole, rather than on the views of any lone individual: “I would not have brought into existence a new thing, a bojangle; our conventions and practices do not have a place for bojangles. It is not just thinking that brings things into existence” (Baker 2007, 44). See also Thomasson (2003, 2007) and Remhof (2017).) Some Constructivists (for example, Kant) take the constructor to be all human beings; other Constructivists (for example, Goodman, Putnam) take the constructor to be a subset of all human beings (for example, society A, society B). There are some versions of Constructivism which take it to be individuals, rather than groups, which do the constructing. (See Goswick, 2018a, 2018b.) These views are more likely to rely on overt responses (for example, how Sally responds when presented with some atoms arranged rockwise) than on language and concepts.
Regarding y, what is constructed? Versions of Constructivism within analytic philosophy can be distinguished based on which entities they focus on. Constructivism in the philosophy of science, for instance, tends to focus on the construction of scientific knowledge. (Scientific “Constructivists maintain that … scientific knowledge is ‘produced’ primarily by scientists and only to a lesser extent determined by fixed structures in the world” (Downes 1-2). See also Kuhn (1996) and Feyerabend (2010).) Constructivism in aesthetics focuses on the construction of an artwork’s meaning and/or on the construction of aesthetic properties more generally. (Aesthetic Constructivists argue that “rather than uncovering the meaning or representational properties of an artwork, an interpretation instead generates an artwork’s meaning” (Alward 247). See also Werner (2015).) Constructivism in the philosophy of mathematics focuses on mathematical objects. (Mathematical Constructivists argue that, when we claim a mathematical object exists, we mean that we can construct a proof of its existence (Bridges and Palmgren 2018).) Constructivism within ethics concerns the origin and nature of our ethical judgments and of ethical properties. (Ethical Constructivists argue that “the correctness of our judgments about what we ought to do is determined by facts about what we believe, or desire, or choose and not, as Realism would have it, by facts about a prior and independent normative reality” (Jezzi 1). Ethical Constructivism has been defended by Korsgaard, Scanlon, and Rawls. For an explication of their views, see Jezzi (2019) and Street (2008, 2010).) Social Constructivism focuses on the construction of distinctly social categories such as race, gender, and sexuality. (See Hacking (1986, 1992, 1999) and Haslanger (1995, 2003, 2012).) Constructivism in metaphysics focuses on the construction of physical objects. (See, for example, Baker (2004, 2007), Goodman (1978, 1980), Putnam (1982, 1987), Thomasson (2003, 2007).)
Regarding z, what is the more basic entity that serves as a building block for the constructed entity? There is no general answer to this question, as different versions of Constructivism give different answers. Some Constructivists (for example, Goswick) take physical stuff to be the basic building blocks of constructed entities. Goswick argues that modal objects are composite objects which have physical stuff and sort-properties as their parts (Goswick 2018b). Some Constructivists (for example, Goodman) take worlds to be the basic building blocks of constructed entities. Goodman argues that it is constructivism all the way down, so each world we construct is itself built out of other worlds.
Regarding C, what is the relation of constructing? Constructivists vary widely regarding the exact details of the constructing relation. In particular, versions of Constructivism vary with regard to whether the constructing relation is (1) global or local, (2) causal or constitutive, (3) temporally and counterfactually robust or not, and (4) relative or absolute. Each of these dimensions of difference is examined in turn.
Regarding 1, is the constructing relation global or local? Historically, the term “constructivism” has been associated with the global claim that every entity to which we have epistemic access is constructed. (Ant Eagle (personal correspondence) points out that there could be an even more global form of Constructivism which claims that all entities, even those to which we do not have epistemic access, are constructed. This is an intriguing view. However, since it has not yet been defended in analytic metaphysics, it is not discussed here.) Kant held this view, as did the main 20th-century proponents of Constructivism (Goodman and Putnam). In the 21st century, philosophers have explored a more local constructing relation in which only some of the entities we have epistemic access to are constructed. Searle, for instance, argues that social objects (for example, money, bathtubs) are constructed but natural objects (for example, trees, rocks) are not. Einheuser argues that modal objects are constructed but nonmodal stuff is not.
Regarding 2, is the constructing relation causal or constitutive? For example, when an author claims that we construct money does she mean that we bear a causal relation to money (that is, we play a causal role in bringing about the existence of money or in money’s having the nature it has) or does she mean that we bear a constitutive relation to money (that is, part of what it is for money to exist or for money to have the nature it has is for us to bear the constitutor-of relation to it)? We can define the distinction as follows: (See also Haslanger (2003, pp. 317-318) and Mallon (2019, p. 4).)
y is causally constructed by x iff x caused y to exist or to have the nature it has.
For example, we caused that $20 bill to come into existence when we printed it at the National Mint and we caused that $20 bill to have the nature it has when we embedded it in the American currency system.
y is constitutively constructed by x iff what it is for y to exist is for x to F or what it is for y to have the nature it has is for x to F.
For example, what it is for a stop sign to exist is for something with physical features P1–Pn to play role r in a human society, and what it is for y to have stop-sign-nature is, in part, for humans to treat y as a stop sign.
Some Constructivists (for example, Goodman, Putnam) do not discuss whether they intend their constructing to be causal or constitutive. (Presumably because the central aims they intend to accomplish by endorsing Constructivism can be satisfied via either a causal or a constitutive version. We can easily modify their views to be explicitly causal or explicitly constitutive. For a Constructivism that is causal, endorse the standard Goodman/Putnam line and add to it that the constructing is to be taken causally. For a Constructivism that is constitutive, endorse the standard Goodman/Putnam line and add to it that the constructing is to be taken constitutively.) Other Constructivists are explicit about whether the constructing relation they utilize is causal or constitutive. Thomasson, for example, notes that
The sort of dependence relevant to [Constructivism] is logical dependence, i.e. dependence which is knowable a priori by analyzing the relevant concepts, not a mere causal or nomological dependence. The very idea of an object’s being money presupposes collective agreement about what counts as money. The very idea of something being an artifact requires that it have been produced by a subject with certain intentions. (Thomasson 2003, 580)
Remhof argues that an object is constructed “iff the identity conditions of the object essentially depend on (i.e., are partly constituted by) our intentional activities” (Remhof 2014, 2). And Searle notes that “part of being a cocktail party is being thought to be a cocktail party; part of being a war is being thought to be a war. This is a remarkable feature of social facts; it has no analogue among physical facts” (Searle 33-34). (For more on constitutive versions of Constructivism, see Haslanger (2003) and Baker (2007, p. 12). For examples of Constructivisms which are causal, see Hacking (1999) and Goswick (2018b). Regarding Hacking, Haslanger notes: “The basis of Hacking’s social constructivism is the historical [constructivist] who claims that, ‘Contrary to what is usually believed, x is the contingent result of historical events and forces, therefore x need not have existed, is not determined by the nature of things, etc.’ … He says explicitly that construction stories are histories and the point, as he sees it, is to argue for the contingency or alterability of the phenomenon by noting its social or historical origins” (Haslanger 2003, 303).)
Regarding 3, is the constructing relation temporally and counterfactually robust or not? Temporal robustness concerns whether constructed entity e exists and has the nature it has prior to and posterior to our constructing it. If yes, then e is temporally robust; otherwise, e is not temporally robust. Counterfactual robustness concerns whether constructed entity e would exist and have the nature it has if certain things were different than they actually are, for example, if we had never existed or had had different conventions/responses/intentions/systems of classification than we actually have. If it would, then the constructing relation is counterfactually robust; otherwise, it is not. Some Constructivists (for example, Putnam, Goodman) deny that the constructing relation is temporally/counterfactually robust. They believe that before we existed there were no stars and that, if we employed different systems of classification, there would be no stars. Other Constructivists take the constructing relation to be temporally/counterfactually robust. Remhof, for instance, argues that even “if there had been no people there would still have been stars and dinosaurs; there would still have been things that would be constructed by humans were they around” (Remhof 2014, 3). Schwartz adds that:
In the process of fashioning classificatory schemes and theoretical frameworks, we organize our world with a past, as well as a future, and provide for there being objects or states of affairs that predate us. Although these facts may be about distant earlier times, they are themselves retrospective facts, not readymade or built into the eternal order. (Schwartz 1986, 436)
An advantage of taking the constructing relation to be temporally/counterfactually robust is that many find it difficult to believe that, for example, there were no stars before there were people or that there would not have been stars had people employed different systems of classification. A disadvantage of endorsing a temporally/counterfactually robust Constructivism is that it is difficult to give an account which is temporally/counterfactually robust but still respects the genuine role Constructivists take humans to play in constructing. After all, if the stars would have been there even if we never existed, why think we play any substantial role in constructing them? At the very least, any role we do play must be non-essential.
Regarding 4, is the constructing relation relative or absolute? Some philosophers (for example, Kant) take the constructing relation to be absolute. Kant thought that all humans, by virtue of being human, employed the same categories and thus created the same phenomena. Other philosophers (for example, Goodman and Putnam) take the constructing relation to be relative. Both argued that worlds exist only relative to a conceptual scheme. Although relativism is often associated with Constructivism (presumably because the most prominent Constructivists of the 20th century also happened to be relativists), the two views are orthogonal. There are relativist and absolutist versions of Constructivism. Moreover, it is easy to slightly tweak relativist views to make them absolutist, or to slightly tweak absolutist views to make them relativist.
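With all four dimensions now in view, they can be gathered into a toy bookkeeping scheme. The sketch below is merely illustrative: the sample classifications are simplified readings of the discussion above, and None marks what that discussion leaves open.

```python
# A toy bookkeeping structure for the four dimensions along which
# constructing relations differ. The sample entries are simplified
# readings of this section; None marks what the text leaves unspecified.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ConstructingRelation:
    scope: Optional[str]      # "global" or "local"
    mode: Optional[str]       # "causal" or "constitutive"
    robust: Optional[bool]    # temporally/counterfactually robust?
    relative: Optional[bool]  # relative to a conceptual scheme?

goodman = ConstructingRelation(scope="global", mode=None, robust=False, relative=True)
kant    = ConstructingRelation(scope="global", mode=None, robust=None, relative=False)
searle  = ConstructingRelation(scope="local", mode="constitutive", robust=None, relative=None)

print(goodman)
# ConstructingRelation(scope='global', mode=None, robust=False, relative=True)
```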
At this point, four ways in which constructing relations can differ from one another have been examined: with regard to whether they are (1) global or local, (2) causal or constitutive, (3) temporally/counterfactually robust or not, and (4) relativistic or absolute. The starting point definition of Constructivism is:
Constructivism: Some extant entities are constructed by us in that they depend substantively on us.
Exactly what it is for an entity to “depend substantively on us” varies between views. This definition holds up well to scrutiny. It captures the commonalities one finds across a wide swath of views across sub-disciplines of philosophy (for example, the philosophy of mathematics, aesthetics, metaphysics) and is general enough to accommodate the many differences between views (for example, some Constructivists take constructing to be constitutive, others take it to be merely causal; some Constructivists take the scope of Constructivism to be global, others take it to be very limited in scope and claim there are very few constructed entities). There is some worry, however, that—being so general—the given definition is too broad: are there any views that do not fall under the Constructivist umbrella?
Constructivism has historically been developed in opposition to Realism; and examining the tension between Constructivism and Realism can help us further understand Constructivism. Although the word “realism” is used widely within philosophy and different philosophers take it to mean different things, several fairly canonical uses have evolved: (i) the linguistic understanding of Realism advocated by Dummett which sees the question of Realism as concerning whether sentences have evidence-transcendent truth conditions or verificationist truth conditions, (ii) an understanding of Realism developed within the philosophy of science which centers on whether the aim of scientific theories is truth understood as correspondence to an external world, and (iii) an understanding of Realism developed within metaphysics which centers on whether x exists and is suitably independent of humans. The understanding of Realism relevant to elucidating Constructivism is this final one:
Ontological Realism (about x): x exists and is suitably independent of us.
Constructivism (about x) stands in opposition to Ontological Realism (about x). The Ontological Realist takes x to be “suitably independent of us,” whereas the Constructivist takes x to “depend substantively on us for either its existence or its nature.” Whatever suitable independence is, it rules out depending substantively on us. Although one does still hear philosophers talk simply of “Realism,” it has become far more common, within analytic metaphysics, to talk of “Realism about x” and to take Realism to be a first-order metaphysical view concerning the existence and/or human independence of specific types of entities (for example, properties, social objects, numbers, ordinary objects) rather than a general stance one has (concerning, for example, the purpose of philosophical investigation). Following this trend in the literature on Realism (that is, the move away from talking about Realism and anti-Realism in general to talking specifically of Realism about x) can help us make more precise the definition of Constructivism.
Constructivism (about x): x exists and depends substantively on us for either its existence or its nature.
This definition of Constructivism is still very general (that is, because it does not spell out what “depends substantively on” entails/requires). However, given that it is standard within the literature on Realism to give a definition which is general enough to encompass many different understandings of “suitably independent of” and that Constructivism has historically been developed in opposition to Realism, it makes sense to mimic this level of generality in defining Constructivism.
One last precisification is in order before we move on to discussing the details of specific versions of Constructivism. A wide array of differences tracks whether the constructing relation is taken to be global or local. Global and local versions of Constructivism differ with regard to when they were/are endorsed (global: in the 20th century versus local: subsequently), why they are endorsed (global: thinks Realism itself is somehow defective versus local: likes Realism in general but thinks there is at least one sort of object it cannot account for), and what the best objections to the view are (global: general objections to constructing versus local: specific objections regarding whether some x really is constructed). Given this, it is useful to separate our discussion of Constructivism into Global Constructivism and Local Constructivism.
Global Constructivism: For all existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.
Local Constructivism: For only some existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.
2. 20th-Century Global Constructivism in Analytic Metaphysics
Who are the global constructivists? Who is it, that is, who argues that
[All physical objects we have epistemic access to are] constructed in a way that reflects our contingent needs and interests. [Global Constructivists think that we] can only make sense of there being a fact of the matter about the world after we have agreed to employ some descriptions of it as opposed to others, that prior to the use of those descriptions, there can be no sense to the idea that there is a fact of the matter “out there” constraining which of our descriptions are true and which false. (Boghossian 25, 32)
The number of Global Constructivists within analytic metaphysics is small. (Constructivism has a long and healthy history within Continental philosophy and is still much more widely discussed within contemporary Continental metaphysics than it is within contemporary analytic metaphysics. See Kant (1965), Foucault (1970), and Remhof (2017).) Scouring the literature will yield only a handful. The best-known proponents are Goodman and Putnam. Schwartz supported Goodman’s view in the 1980s and most recently wrote an article supporting the view in 2000. Kant (late 1700s) and James (early 1900s) were early proponents of the view. Rorty and Dummett each endorse the view in passing. These seven authors exhaust the list of analytic Global Constructivists. (Al Wilson (personal communication) suggests this list might be expanded to include Rudolf Carnap, Simon Blackburn, and Huw Price.) Their motivation for endorsing Global Constructivism stems from worries about the cogency of Realism. They think that, if Realism were true, we would have no way to denote objects or to know about them. Since we can denote objects and do have knowledge of them, Realism must not be the correct account of them. The correct account is, rather, Constructivism. Although their number is small, their influence (especially that of Goodman and Putnam) has reverberated within analytic metaphysics. The remainder of this section examines the views of each of the central defenders of Global Constructivism.
Goodman defended Global Constructivism in a series of articles and books clustering around the 1980s: Ways of Worldmaking (1978), “On Starmaking” (1980), “Notes on the Well-Made World” (1983), “On Some Worldly Worries” (1993). Goodman himself described his view as “a radical relativism under rigorous restraints, that eventuates in something akin to irrealism” (1978 x). He believed that there were many right worlds, that these worlds exist only relative to a set of concepts, and that the building blocks of constructed objects are other constructed objects: “Worldmaking as we know it always starts from worlds already on hand; the making is a remaking” (1978 6-7). Goodman thought that there is “no sharp line to be drawn between the character of the experience and the description given by the subject” (Putnam 1979, 604). Goodman is perhaps the most earnest and sincere defender of the global scope of Constructivism. Whereas others tend to find the idea that we construct, for example, stars nearly incoherent, Goodman finds the idea that we did not construct the stars nearly incoherent:
Scheffler contends that we cannot have made the stars. I ask him which features of the stars we did not make, and challenge him to state how these differ from features clearly dependent on discourse. … We make a star as we make a constellation, by putting its parts together and marking off its boundaries. … The worldmaking mainly in question here is making not with hands but with minds, or rather with languages or other symbol systems. Yet when I say that worlds are made, I mean it literally. … That we can make the stars dance, as Galileo and Bruno made the earth move and the sun stop, not by physical force but by verbal invention, is plain enough. (Goodman 1980 213 and 1983 103)
Goodman takes the constructors of reality to be societies (rather than lone individuals). He takes constructing to be relative, so, for example, society A constructs books and plants, whereas, faced with the same circumstances, society B constructs food and fuel (Goodman 1983, 103). He does not comment on whether the constructing relation is causal or constitutive. Like all relativistic versions of Constructivism, his view is not temporally/counterfactually robust. Goodman’s motivation for endorsing Global Constructivism is that he thinks it is clear that we can denote and know about, for example, stars and he thinks we would not be able to do this were Realism true.
Schwartz defends Goodmanian Global Constructivism in two articles: “I’m Going to Make You a Star” (1986) and “Starting from Scratch: Making Worlds” (2000). Since Goodman’s writings on constructivism can often be difficult to understand, examining Schwartz’s writings can serve to give us further insight into Goodman’s view. Schwartz writes that:
In shaping the concepts and classification schemes we employ in describing our world, we do take part in constituting what that reality is. Whether there are stars, and what they are like, … are facts that are carved out in the very process of devising perspicuous theories to aid in understanding our world. … Until we fashion star concepts and related categories, and integrate them into ongoing theories and speculations, there is no interesting sense in which the facts about stars are really one way rather than another. (Schwartz 1986, 429)
Schwartz emphasizes the role we play in making it the case that certain properties are instantiated and, thus, in drawing out ordinary objects from the mass of undifferentiated stuff which exists independently of people:
In natura rerum there are no inherent facts about the properties [x] has. It is no more a star, than it is a Big Dipper star and belongs to a constellation. … From the worldmaker’s perspective, the unmade world is a world without determinate qualities and shape. Pure substance, thisness, or Being may abound, but there is nothing to give IT specific character. (Schwartz 2000, 156)
Schwartz notes that, “no argument is needed to show that we do have some power to create by conceptualization and symbolic activity. Poems, promises, and predictions are a few obvious examples” (Schwartz 1986, 428). For example, it is uncontroversial that part of what it is to be a Scrabble joker (one of those blank pieces of wood that you can use as any letter when playing the game of Scrabble) is to be embedded in a certain human context: “These bits of wooden reality could no more be Scrabble jokers without the cognitive carving out of the features and dimensions of the concept, than they could be Scrabble jokers had they never been carved from the tree” (Schwartz 1986, 430-431). Schwartz, and Global Constructivists in general, differ from non-constructivists in that they think all ordinary objects (and, in fact, all the objects we have epistemic access to) are like Scrabble jokers. Of course, there is something that exists independently of us. But this something is amorphous, undefined, and plays no role in our epistemic lives. What we are aware of is the objects we create out of this mass by the (often unconscious) imposition of our concepts.
The other key defender of Global Constructivism is Putnam. Like Goodman, Putnam defended Global Constructivism in a series of articles and books which cluster around the 1980s; see, for example, “Reflections on Goodman’s Ways of Worldmaking” (1979), Reason, Truth, and History (1981), “Why There Isn’t a Ready-Made World” (1982), and The Many Faces of Realism (1987). Putnam thinks philosophy should look to science, and he shares the Positivists’ skepticism about traditional metaphysics:
There is … nothing in the history of science to suggest that it either aims at or should aim at one single absolute version of “the world”. On the contrary, such an aim, which would require science itself to decide which of the empirically equivalent successful theories in any given context was “really true”, is contrary to the whole spirit of an enterprise whose strategy from the first has been to confine itself to claims with clear empirical significance. … Metaphysics, or the enterprise of describing the “furniture of the world”, the “things in themselves” apart from our conceptual imposition, has been rejected by many analytic philosophers. … apart from relics, it is virtually only materialists [i.e. physicalists] who continue the traditional enterprise. (Putnam 1982 144 and 164)
Contrary to Putnam’s hopes, in the twenty-first century the materialists have won, and most metaphysicians recognize the sharp subject/object divide that Putnam rejected. Putnam argues that objects “do not exist independently of conceptual schemes. We cut up the world into objects when we introduce one scheme or another” (Cortens 41). Putnam takes the constructors of reality to be societies, the constructing to be relative, and does not comment on whether the constructing relation is causal or constitutive. Like all relativistic versions of Constructivism, his view is not temporally/counterfactually robust. Putnam’s motivation for endorsing Global Constructivism is that he rejects the sharp division between object and subject which Realism presupposes. He thinks analytic philosophy erred when it responded to 17th-century science by introducing a distinction between primary and secondary qualities (Putnam 1987). He argues that we should instead have taken everything that exists to be a muddled combination of the objective and subjective; there is no way to neatly separate out the two. By recognizing the role we play in constructing objects, Global Constructivism pays homage to this lack of separation; Realism does not. Thus, Putnam prefers Global Constructivism to Realism. (See Hale and Wright (2017) for further discussion of Putnam’s rejection of Realism.)
Other adherents of Global Constructivism include Kant, James, Rorty, and Dummett. (See Kant (1965), James (1907), Rorty (1972), and Dummett (1993).) In “The World Well Lost” (1972), Rorty argues that “the realist true believer’s notion of the world is an obsession rather than an intuition” (Rorty 661). He endorses an account of alternative conceptual frameworks which draws heavily on continental philosophers (Hegel, Kant, Heidegger), as well as on Dewey. Ultimately, he concludes that we should stop focusing on trying to find an independent world that is not there and should recognize the role we play in constructing the world. In Frege: Philosopher of Language (1993), Dummett argues that the “picture of reality as an amorphous lump, not yet articulated into discrete objects, thus proves to be a correct one. [The world does not present] itself to us as already dissected into discrete objects” (Dummett 577). Rather, in the process of developing language, we develop the criterion of identity associated with each term and then, with this in place, the world is individuated into distinct objects.
The heyday of analytic Global Constructivism was the 1980s. No one in analytic metaphysics has defended the view since Schwartz’s defense in 2000. The view has now more or less been abandoned. Remhof discussed the view in 2014, but he did not endorse it. However, Global Constructivism continues to be influential in discussions, where it serves primarily as a rallying point for the Realists who argue against it; see, for example, Devitt (1997) and Boghossian (2006). Although there are no contemporary Global Constructivists, Local Constructivism, which is an heir to Global Constructivism, is alive and well. The next section examines the many versions of Local Constructivism which proliferate in the twenty-first century.
3. 21st-Century Local Constructivism in Analytic Metaphysics
You will not find the term “constructivism” bandied about within contemporary analytic metaphysics with anything approaching the frequency with which the term is used in other sub-disciplines of analytic philosophy or within Continental philosophy. (Why is the term “constructivism” not used more frequently in contemporary analytic metaphysics? The reluctance to use the term probably stems from the current sociology of analytic metaphysics. Realism has a strong grip on analytic metaphysics. Moreover, many anti-Realist metaphysics writings are strikingly bad, and most philosophers currently working within analytic philosophy can easily recall the criticism that was directed toward Global Constructivism: “Barring a kind of anti-realism that none of us should tolerate” (Hawthorne 2006, 109). “[Constructivism] is such a bizarre view that it is hard to believe that anyone actually endorses it” (Boghossian 25). “We should not close our eyes to the fact that Constructivism is prima facie absurd, a truly bizarre doctrine” (Devitt 2010, 105). These factors conspire to make contemporary analytic metaphysics a particularly unappealing place to launch any theory which might smell of anti-Realism, and to be a Constructivist about x is to be an anti-Realist about x.) However, if one looks at the content of views within analytic metaphysics rather than at what the views are labeled, it quickly becomes apparent that many of them meet the definition of Local Constructivism.
Local Constructivism: For only some existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.
Although they may be Realists about many kinds of entities (and may self-identify as “Realists”), many metaphysicians of the twenty-first century are Constructivists about at least some kinds of entities. (See, for example, Baker (2004 and 2007), Einheuser (2011), Evnine (2016), Goswick (2018a), Kriegel (2008), Searle (1995), Sidelle (1989), Thomasson (2003 and 2007), Varzi (2011).) Let’s consider the views of several of these metaphysicians. In particular, let’s look at Local Constructivism with regard to vague objects (Heller), modal objects (Sidelle, Einheuser, Goswick), composite objects (Kriegel), artifacts (Searle, Thomasson, Baker, Devitt), and objects with conventional boundaries (Varzi).
Although not himself a Constructivist, in The Ontology of Physical Objects (1990) Heller presents a view which is a close ancestor of contemporary Local Constructivism. Since a minor tweak turns his view into Local Constructivism, since he was one of the first in the general field of Local Constructivism, and since his work has been so influential on contemporary Local Constructivists, it is worth taking a quick look at exactly what Heller says and why Local Constructivists have found inspiration in his book. Heller distinguishes between what he calls “real objects” and what he calls “conventional objects.” Real objects are four-dimensional hunks of matter which have precise spatiotemporal boundaries; we generally do not talk or think about real objects (since we tend not to individuate so finely as to denote objects with precise spatiotemporal boundaries). “Conventional object” is the name Heller gives to objects which we think exist, but do not really (due to the fact that, if they did exist they would have vague spatiotemporal boundaries and nothing that exists has vague spatiotemporal boundaries) (Heller 47). For example, Heller thinks there is no statue and no lump of clay:
The [purported] difference [between the statue and the clay] is a matter of convention. … This difference cannot reflect a real difference in the objects. There is only one object in the spatiotemporal region claimed to be occupied by both the statue and the lump of clay. There … are no coincident entities; there are just … different conventions applicable to a single physical object. (Heller 32)
What really exists (in the rough vicinity we intuitively think contains the statue) are many precise hunks of matter. None of these hunks is a statue or a lump of clay (because “statue” and “lump of clay” are both ordinary language terms which are not precise enough to distinguish between, for example, two hunks of matter which differ only with regard to the fact that one includes, and the other excludes, atom a), but we mistakenly think there is a statue (where really there are just these various hunks of matter). Heller is an Eliminativist about conventional objects: there are none. However, it is a short step from Heller’s Eliminativism about vague objects to Constructivism about vague objects. The framework is in place; Heller has already provided a thorough account of the difference between nonconventional objects (hunks of matter) and conventional objects (objects—such as rocks, dogs, mountains, and necklaces—which have vague spatiotemporal boundaries) and of how our causal interaction with nonconventional objects gives rise to our belief that there are conventional objects. To be a Constructivist rather than an Eliminativist about Heller’s conventional objects, one need only argue, contra Heller, that our conventions in fact bring new objects—objects which are constructed out of hunks of matter and our conventions—into existence. (Just to re-iterate, Heller is opposed to this: “There are other alternatives that can be quickly discounted. For instance, the claim that we somehow create a new physical object by passing legislation involves the absurd idea that without manipulating or creating any matter we can create a physical object” (Heller 36). However, by so thoroughly examining nonconventional objects, conventional objects, and the relationship between them, he laid the groundwork for the Local Constructivists that would come after him.)
Local Constructivists about modal objects share Heller’s skepticism about the ability of Realism to account for ordinary objects. Heller worries that ordinary objects have vague spatiotemporal boundaries while all objects that really exist have precise spatiotemporal boundaries, and he resolves this worry by being an Eliminativist about ordinary objects. Local Constructivists about modal objects, by contrast, worry that ordinary objects have “deep” modal properties while all objects that Realism is true of have at most “shallow” modal properties. (Where a “deep” modal property is any de re necessity or de re possibility which is non-trivial and a “shallow” modal property is any modal property which is not “deep.” See Goswick (2018b) for a more detailed discussion.) Rather than being Eliminativists about ordinary objects, they resolve this worry by endorsing Local Constructivism about objects which have at least one “deep” modal property (henceforth, such objects will be referred to as “modal objects”).
Sidelle and Einheuser both defend Local Constructivism about modal objects. Sidelle’s goal in his (1989) is to defend a conventionalist account of modality. He argues that conventionalism about modality requires Constructivism about modal objects (1989 77). He relies on (nonmodal) stuff as the basic building block out of which modal objects are constructed: “[The] conventionalist should … say that what is primitively ostended is ‘stuff’, stuff looking, of course, just as the world looks, but devoid of modal properties, identity conditions, and all that imports. For a slogan, one might say that stuff is preobjectual” (1989 54-55). Modal objects come to exist when humans provide individuating conditions. It is because we respond to stuff s as if it is a chair and apply the label “chair” to it that there is a chair with persistence conditions c rather than just some stuff. Einheuser’s goal in her (2011) is to ground modality. She argues that the best way to do this is to endorse a conceptualist account of modality and that so doing requires endorsing Constructivism about modal objects. Like Sidelle, she endorses preobjectual stuff: “the content of the spatio-temporal region of the world occupied by an object [is] the stuff of the object” (Einheuser 303). She argues that this stuff “does not contain … built-in persistence criteria. … It is ‘objectually inarticulate’” (Einheuser 303). Modal objects are created out of such mere stuff by the imposition of our concepts:
Concepts like statue and piece of alloy impose persistence criteria on portions of material stuff and thereby “configure” objects. That is, they induce objects governed by these persistence criteria. Our concept statue is associated with one set of persistence criteria. Applied to a suitable portion of stuff, the concept statue configures an object governed by these criteria. (Einheuser 302)
Einheuser emphasizes that what we are doing is creating a new object (a piece of alloy) rather than adding modal properties to pre-existing stuff. (Einheuser on why we must be Local Constructivists about modal objects rather than Local Constructivists about only modal properties: “There is the view that our concepts project modal properties onto otherwise modally unvested objects. This view appears to imply that objects have their modal properties merely contingently. [The piece of alloy may be necessarily physical] but that is just a contingent fact about [it] for our concepts might have projected a different modal property [onto it]. That seems tantamount to giving up on the idea of de re necessity. … The conceptualist considered here maintains conceptualism not merely about modal properties but about objects: Concepts don’t project modal properties onto objects. Objects themselves are, in a sense to be clarified, projections of concepts” (302).)
Kriegel endorses Local Constructivism about composite objects. He takes Realism to be true of non-composite objects and uses them as the basic building blocks of his composite objects. He worries that, given Realism, there is simply no fact of the matter regarding whether the xs compose an o (Kriegel 2008). He argues that we should be conventionalists about composition: “the xs compose an o iff the xs are such as to produce the response that the xs compose an o in normal intuiters under normal forced-choice conditions” (Kriegel 10). A side effect of this conventionalism about composition is Local Constructivism about composite objects: Kriegel is a Realist about some physical entities r (the non-composite objects) to which we have epistemic access, and he thinks that by acting in some specified way (having the composition intuition) with regard to these physical entities we thereby bring new physical objects (the composite ones) into existence.
Local Constructivism about artifacts is the most widespread form of Local Constructivism. It is endorsed by Searle, Thomasson, Baker, and Devitt, among others. (See also Evnine (2016).) Searle is a Realist about natural objects such as Mt. Everest, bits of metal, land, stones, water, and trees (Searle 153, 191, 4). He is a Constructivist about artifactual objects such as money, cars, bathtubs, restaurants, and schools (Searle xi, 4). He takes the natural objects to be the basic building blocks of the artifactual ones:
[The] ontological subjectivity of the socially constructed reality requires an ontologically objective reality out of which it is constructed, because there has to be something for the construction to be constructed out of. To construct money, property, and language, for example, there have to be the raw materials of bits of metal, paper, land, sounds, and marks. And the raw materials cannot in turn be socially constructed without presupposing some even rawer materials out of which they are constructed, until eventually we reach a bedrock of brute physical phenomena independent of all representations. (Searle 191)
Thomasson’s Local Constructivism about artifacts arises from her easy ontology. She claims that terms have application and co-application conditions and that, when these conditions are satisfied, the term denotes an object of kind k (Thomasson 2007). Although humans set the application and co-application conditions for natural kind terms such as “rock,” humans play no role in making it the case that these conditions are satisfied. Thus, Realism about natural objects is true. However, with regard to artifactual kind terms such as “money,” humans both set the application and co-application conditions for the term and play a role in making it the case that these conditions are satisfied: “The very idea of something being an artifact requires that it have been produced by a subject with certain intentions” (Thomasson 2003, 580). Intentions alone, however, are not enough:
Although artifacts depend on human beliefs and intentions regarding their nature and their existence, the way they are also partially depends on real acts, e.g. of manipulating things in the environment. Many of the properties of artifacts are determined by physical aspects of the artifacts without regard for our beliefs about them. (Thomasson 2003, 581)
Every concrete artifact includes unconstructed properties which serve as the basis for the object’s constructed properties.
Baker distinguishes between what she calls “ID objects” and non-ID objects. ID objects are objects—such as stop signs, tables, houses, driver’s licenses, and hammocks—that could not exist in a world lacking beings with beliefs, desires, and intentions (Baker 2007, 12). Non-ID objects are objects which could exist in a world which lacked such beliefs, desires, and intentions, for example, dinosaurs, planets, rocks, trees, dogs. Artifacts are ID objects. They are constructed out of our doing certain things to and having certain attitudes toward non-ID objects.
When a thing of one primary kind is in certain circumstances, a thing of another primary kind—a new thing, with new causal powers—comes to exist. [Sometimes this new thing is an ID object.] For example, when an octagonal piece of metal is in circumstances of being painted red with white marks of the shape S-T-O-P, and is in an environment that has certain conventions and laws, a new thing—a traffic sign—comes into existence. (Baker 2007, 13)
Baker advocates a constitution theory according to which coinciding objects stand in a hierarchical relation of constitution. Aggregates are fundamental non-ID objects and serve as the ground-level building blocks out of which all ID objects, including artifacts, are built: “Although … thought and talk make an essential contribution to the existence of certain objects [e.g., artifacts], … thought and talk alone [do not] bring into existence any physical objects: conventions, practices, and pre-existing materials [i.e., non-ID aggregates] are also required” (Baker 2007, 46). (Unlike nearly all the other advocates of Local Constructivism about artifacts, Baker does not take constructed objects to be inferior to non-constructed ones: “An artifact has as great a claim as a natural object to be a genuine substance. This is so because artifactual kinds are primary kinds. Their functions are their essences” (Baker 2004, 104).)
Devitt is another defender of Local Constructivism about artifacts. He distinguishes between artifactual objects whose “natures are functions that involve the purposes of agents” (Devitt 1997, 247) and natural objects whose nature is not such a function: “A hammer is a hammer in virtue of its function for hammerers. A tree is not a tree in virtue of its function” (Devitt 1997, 247). Devitt argues that every constructed artifact can also be described as a natural object which is not constructed: “Everything that is [an artifact] is also a [natural object]; thus, a fence may also be a row of trees” (Devitt 1997, 248). He is at pains to distance his Local Constructivism from Global Constructivism and emphasizes the role unconstructed objects play in bringing about the existence of constructed objects:
No amount of thinking about something as, say, a hammer is enough to make it a hammer. … Neither designing something to hammer nor using it to hammer is sufficient to make it a hammer. [Only] things of certain physical types could be [hammers]. In this way [artifacts] are directly dependent on the [unconstructed] world. (Devitt 1997, 248-249)
The final version of Local Constructivism to be examined is Varzi’s Local Constructivism about objects with conventional boundaries. Varzi distinguishes between objects with natural boundaries and those with conventional boundaries. He argues that, “If a certain entity enjoys natural boundaries, it is reasonable to suppose that its identity and survival conditions do not depend on us; it is a bona fide entity of its own” (Varzi 137). On the other hand, if an entity’s “boundaries are artificial—if they reflect the articulation of reality that is effected through human cognition and social practices—then the entity itself is to some degree a fiat entity, a product of our world-making” (Varzi 137). Varzi is quick to point to the role objects with natural boundaries play in our construction of objects with conventional boundaries: “the parts of the dough [the objects with natural boundaries] provide the appropriate real basis for our fiat acts. [They] are whatever they are [independently of us] and the relevant mereology is a genuine piece of metaphysics” (Varzi 145). Varzi also emphasizes the compatibility of Local Constructivism with a generally Realist picture:
It is worth emphasizing that even a radical [constructivist] stance need not yield the nihilist apocalypse heralded by postmodern propaganda. [Constructed objects] lack autonomous metaphysical thickness. But other individuals may present themselves. For instance, on a Quinean metaphysics, there is an individual corresponding to “the material content, however heterogeneous, of some portion of space-time, however disconnected and gerrymandered”. … Such individuals are perfectly nonconventional, yet the overall [Quinean] picture is one that a [constructivist] is free to endorse. (Varzi 147-148)
Having examined five versions of Local Constructivism—constructivism about vague objects, modal objects, composite objects, artifacts, and objects with conventional boundaries—I turn now to describing what all these views have in common that marks them out as constructivist views. Taking note of what each view takes to be unconstructed and what each view takes to be constructed can provide insight into what all the views have in common:
Author                        | Unconstructed Entities | Constructed Entities
neo-Hellerian                 | 4D hunks of matter     | vague objects
Sidelle/Einheuser/Goswick     | nonmodal stuff         | modal objects
Kriegel                       | simple objects         | composite objects
Searle/Thomasson/Baker/Devitt | natural objects        | artifactual objects
Varzi                         | natural boundaries     | conventional boundaries
What each version of Local Constructivism has in common, and what makes it a Local Constructivist view, is that (i) each takes there to be something unconstructed to which we have epistemic access, and (ii) each thinks that by acting in some specified way with regard to these unconstructed entities we thereby bring new physical objects (the constructed ones) into existence. The views differ with regard to what they think the unconstructed entities are and with regard to what they think we have to do in order to utilize these unconstructed entities to construct new entities, but they are all alike in endorsing (i) and (ii). This is what marks them out as local and constructivist. They are local—rather than global—in scope because they all think only some of the entities that we have epistemic access to are constructed. They are Constructivist—rather than Realist—about vague objects or modal objects or … objects because they take these entities to depend substantially (either causally or constitutively) on us for either their existence or nature.
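Put schematically (this symbolization is an illustrative gloss, not a formula any of the authors above endorse), with $\mathrm{U}(r)$ for “$r$ is unconstructed,” $\mathrm{A}(r)$ for “we have epistemic access to $r$,” $\mathrm{R}_w(r)$ for “we act in the specified way $w$ with regard to $r$,” and $\mathrm{C}(o)$ for “$o$ is constructed,” the two shared commitments read:

\[
\text{(i)}\ \exists r\,\big(\mathrm{U}(r) \wedge \mathrm{A}(r)\big) \qquad \text{(ii)}\ \forall r\,\Big(\big(\mathrm{U}(r) \wedge \mathrm{R}_w(r)\big) \rightarrow \exists o\,\big(\mathrm{C}(o) \wedge \mathrm{From}(o, r)\big)\Big)
\]

where $\mathrm{From}(o, r)$ says that $o$ is a new physical object brought into existence out of $r$. Each view supplies its own interpretation of $\mathrm{U}$ and $\mathrm{R}_w$ (hunks of matter and our vague sortal conventions, nonmodal stuff and our sort-responses, and so on); the schema itself is what they share.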
Broadly speaking, all Local Constructivists share the same motivation for endorsing Constructivism—namely, they think that although Realism is generally a good theory there are little bits of the world that it cannot account for. Although Local Constructivists tend to be fond of Realism, they are even fonder of certain entities which they take Realism to be unable to accommodate. They resolve this tension (that is, between the desire to be Realists and the desire to have entities e in their ontology) by endorsing Local Constructivism about entities e. The appeal of Local Constructivism springs from an inherent tension between naturalism and Realism. Most analytic metaphysicians of the twenty-first century are naturalists: they think that metaphysics should be compatible with our best science, that philosophy has much to learn from studying the methods used in science, and that, at root, the basic entities philosophy puts in its ontology had better be ones that are scientifically respectable (quarks, leptons, and forces are in; God, dormitive powers, and Berkeleyan ideas are out). It is not obvious, however, that there is a place within our best science for the ordinary objects we know and love. (“We have already seen that ordinary material objects tend to dissolve as soon as we acknowledge their microscopic structure: this apple is just a smudgy bunch of hadrons and leptons whose exact shape and properties are no more settled than those of a school of fish” (Varzi 140).) Metaphysicians’ naturalism inclines them to be Realists only about those entities our best science countenances. (Searle, for example, wonders how there can “be an objective world of money, property, marriage, governments, elections, football games, cocktail parties, and law courts in a world that consists entirely of physical particles in fields of force” (Searle xi).) They worry that there is no room within this naturalistic picture of the world for, for example, modal objects, composite objects, or artifacts. This places them in a bind: they do not want to abandon naturalism or Realism, but they also do not want to exclude entities e (whose existence/nature is not countenanced by naturalistic Realism) from their ontology. Given this underlying situation, analytic metaphysicians often end up endorsing Local Constructivism about some entities: doing so allows them to include such objects in their ontology whilst recognizing that these objects are defective in a way many other objects in their ontology are not (that is, their existence or nature depends on us in a way the existence/nature of other objects does not). (This discussion of Local Constructivism has focused on concrete objects. There is also a literature concerning the construction of abstract objects. See, for example, Levinson (1980), Thomasson (1999), Irmak (2019), Korman (2019).)
4. Criticisms of Constructivism in Analytic Metaphysics
The previous two sections examined two central versions of Constructivism within analytic metaphysics and provided overviews of the works of their most prominent adherents. The article concludes by asking what—all things considered—we should make of Constructivism in analytic metaphysics. Before this question can be answered, the central criticisms of Constructivism must be examined. These criticisms can be divided into two main sorts: (1) coherence criticisms—which argue that Constructivism is in some way internally flawed to the extent that we cannot form coherent, evaluable versions of the view, and (2) substantive criticisms—which take Constructivism to be coherent and evaluable, but argue that we have good reason to think it is false.
a. Coherence Criticisms
Consider these four coherence criticisms: (i) Constructivism is not a distinct view, (ii) The term “constructivism” is too over-used to be valuable, (iii) Constructivism is too metaphorical, and (iv) Constructivism is incoherent.
Consider, first, whether Constructivism is a distinct view within the anti-Realist family of metaphysical views. Meta-ethicists, for instance, sometimes worry about whether Ethical Constructivism is sufficiently distinct from other views (for example, emotivism or response-dependence) within ethics. (See, for example, Jezzi (2019) and Street (2008 and 2010).) Does a similar worry arise with regard to Constructivism in analytic metaphysics? It does not. Constructivism is a broad view within anti-Realism; there are many more specific versions of it, but Constructivism is sufficiently distinct from other anti-Realist views. It is not, for example, Berkeleyan Idealism (that is, because Berkeleyan Idealism requires that God play a central role in determining what exists and Constructivism has no such reliance on God) or Eliminativism (that is, because Eliminativists about x deny that x exists, whereas Constructivists about x claim that x exists).
Consider, next, whether the term “constructivism” is too over-used to be valuable. Haslanger notes that, “The term ‘social construction’ has become commonplace in the humanities. [The] variety of different uses of the term has made it increasingly difficult to determine what claim authors are using it to assert or deny” (Haslanger 2003, 301-302). The term “constructivism” certainly is not over-used within analytic metaphysics. If anything, it is underused; authors only very rarely use the term “constructivism” to refer to their own views. We need not fear that the variety of uses which plagues the humanities in general will be an issue in analytic metaphysics. The term is uncommon within analytic metaphysics, and there is value in introducing the label—as such labels serve to emphasize the similarity both in content and in underlying motivation between views whose authors use quite disparate terms to identify their own views.
Consider, third, whether “constructivism,” as used in analytic metaphysics, is too metaphorical. This criticism has been directed primarily at Global Constructivism. Understandably when, for instance, Goodman writes, “The worldmaking mainly in question here is making not with hands but with minds, or rather with languages or other symbol systems. Yet when I say that worlds are made, I mean it literally” (Goodman 1980 213), we want to know exactly what it is to literally make a world with words—it is difficult to parse this phrase if we do not take either the making or the world to be metaphorical. Global Constructivists, themselves, often stress—as Goodman does in the above passage—that they mean their views to be taken non-metaphorically: we really do construct the stars, the planets, and the rocks. Critics of Global Constructivism, however, often find it almost irresistible to take the writings of Global Constructivists to be metaphorical: “The anti-realist [Constructivist] is of course speaking in metaphor. If we took him to be speaking literally, what he says would be wildly false—so much so that we would question his sanity” (Devitt 2010, 237—quoting Wolterstorff). There is something to the worry that what Global Constructivists say is just so radical (and frequently, so convoluted) that the only way we can make any sense of it at all is to take it metaphorically (regardless of whether its proponents intend us to take it this way).
A final coherence criticism is that Constructivism is simply incoherent: we cannot make enough sense of what the view is to be in a position to evaluate it. This criticism takes various forms, including that Constructivism (a) is incompatible with what we know about our terms; (b) relies on a notion of a conceptual scheme which is, itself, incoherent; (c) requires unconstructed entities of a sort Global Constructivism cannot accept; (d) relies on a notion of unconstructed objects which is itself contradictory; and (e) allows for the construction of incompatible objects.
Consider, first, the claim that Constructivism is incompatible with what we know about our terms. Boghossian, for example, writes:
Isn’t it part of the very concept of an electron, or of a mountain, that these things were not constructed by us? Take electrons, for example. Is it not part of the very purpose of having such a concept that it is to designate things that are independent of us? If we insist on saying that they were constructed by our descriptions of them, don’t we run the risk of saying something not merely false but conceptually incoherent, as if we hadn’t quite grasped what an electron was supposed to be? (Boghossian 39)
The idea behind Boghossian’s worry is that linguistic and conceptual competence reveal to us that the term “electron” and the concept electron denote something which is independent of us. If so, then any theory that proposes that electrons depend on us is simply confused about the meaning of the term “electron” or, more seriously, about the nature of electrons. There are a variety of ways one can address this concern. One could argue that externalism is true and, thus, that competent users can be radically mistaken about what their terms refer to and still successfully refer. Historically, we have often been mistaken both about what exists and about what the nature of existing objects is. We were able to successfully refer to water even when we thought it was a basic substance (rather than the compound H2O) and we can refer successfully to electrons even if we are deeply mistaken about their nature, that is, we think they are independent entities when they are really dependent entities. The more serious version of Boghossian’s worry casts it as a worry about changing the subject matter rather than as a worry about reference. It may be that electrons-which-depend-on-us are so radically different from what we originally thought electrons were that Constructivists (who claim electrons so depend) are (i) proposing Eliminativism about electrons-which-are-independent-of-us, and (ii) introducing an entirely new ontology, namely electrons-which-depend-on-us. (See Evnine (2016) for arguments that taking electrons to depend on humans changes the subject matter so radically that Eliminativism is preferable.) The critic could press this point, but it is not very convincing. To see this, hold a rock in your hand. On the most reasonable way of casting the debate, the Realist to your right and the Constructivist to your left can both point to the rock and utter, “we have different accounts of the existence and nature of that rock.” It is uncharitable to interpret them as talking about different objects, rather than as having different views about the same object. Boghossian overestimates the extent of our knowledge of, for example, the term “electron,” the concept electron, and the objects electrons. We are not so infallible with regard to such terms, concepts, and objects that views which dissent from the mainstream Realist position are simply incoherent.
Consider, next, the criticism that Constructivism relies on a notion of a conceptual scheme which is, itself, incoherent. Goodman and Putnam both endorsed relativistic versions of Global Constructivism which rely on different cultures having different conceptual schemes and on the idea that truth can be relative to a conceptual scheme. Davidson (1974) attacks the intelligibility of truth relative to a conceptual scheme. Cortens (2002) argues that, “Many relativists run into serious trouble on this score; rarely do they provide a satisfactory explanation of just what sort of thing a conceptual scheme is” (Cortens 46). Although there are responses to this criticism, they are not presented here. (See the entries for Goodman, Putnam, and Schwartz in the bibliography.) Goodman/Putnam’s Global Constructivism is a dated view, and contemporary versions of Constructivism do not utilize the old-fashioned notion of a conceptual scheme or of truth relative to a conceptual scheme.
Another criticism which attacks the coherence of Constructivism is the claim that Constructivism requires unconstructed entities of a sort Global Constructivism cannot accept. Boghossian (2006) and Scheffler (2009) argue that Constructivism presupposes the existence of at least some unconstructed objects which we have epistemic access to. If this is correct, then Global Constructivism is contradictory, that is, since it would require unconstructed objects we have epistemic access to (to serve as the basis of our constructing) whilst also claiming that all objects we have epistemic access to are constructed:
If our concepts are cutting lines into some basic worldly dough and thus imbuing it with a structure it would not otherwise possess, doesn’t there have to be some worldly dough for them to work on, and mustn’t the basic properties of that dough be determined independently of all this [constructivist] activity? (Boghossian 2006, 35)
There are various answers Constructivists can give to this worry. Goodman, for instance, insists that everything is constructed:
The many stuffs—matter, energy, waves, phenomena—that worlds are made of are made along with the worlds. But made from what? Not from nothing, after all, but from other worlds. Worldmaking as we know it always starts from worlds already on hand; the making is a remaking. (Goodman 1978, 6-7)
Goodman’s view may be hard to swallow, but it is not internally inconsistent. Another approach is to argue that although all objects are constructed, there are other types of entities (for example, Sidelle’s nonmodal stuff, Kant’s noumena) which are not constructed. (See also Remhof (2014).)
A fourth incoherence criticism is that Constructivism relies on a notion of unconstructed objects which is itself (at worst) contradictory or (at best) underexplained. How serious this worry is depends on what a particular version of Constructivism takes to be unconstructed. Kriegel’s Local Constructivism about composite objects, for instance, allows that all mereologically simple objects are unconstructed—such simples provide a rich building base for his constructivism. Similarly, Local Constructivists about artifacts claim that natural objects are unconstructed. They are, that is, Realists about all the objects Realists typically give as paradigms. This, too, provides a rich and uncontroversially non-contradictory building base for their constructed objects. Other views—such as Global Constructivism and Local Constructivism about modal objects—do face a difficulty regarding how to allow unconstructed entities to have enough structure that we can grasp what they are, without claiming they have so much structure that they become constructed entities. Wieland and Elder give voice to this common Realist complaint against Constructivism:
When it comes to [the question of what unconstructed entities are], those who are sympathetic to [Constructivism] are remarkably vague. … The problem [is that constructivists] want to reconcile our freedom of carving with serious, natural constraints. … [The] issue is about the elusive nature of non-perspectival facts in a world full of facts which do depend on our perspective. (Wieland 22)
[Constructivists] are generally quite willing to characterize the world as it exists independently of our exercise of our conceptual scheme. It is simply much stuff, proponents say, across which a play of properties occurs. … But just which properties is it that get instantiated in the world as it mind-independently exists? (Elder 14)
Global Constructivists are quite perplexing when they try to explain how they can construct in the absence of any unconstructed entities to which we have epistemic access. This is a central problem with Global Constructivism and one reason it lacks contemporary adherents. The situation is different with Local Constructivism. Local Constructivists are vocal about the fact that they endorse the existence of unconstructed entities to which we have epistemic access and that such entities play a crucial role in our constructing. (Baker, for example, notes that, “I do not hold that thought and talk alone bring into existence any physical objects … pre-existing materials are also required” (2007 46). Devitt argues that, “Neither designing something to hammer nor using it to hammer is sufficient to make it a hammer … only things of certain physical types could be [hammers]” (1997, 248). Einheuser emphasizes that the application of our concepts to stuff is only object-creating when our concepts are directed at independently existing stuff which has the right nonmodal properties (Einheuser 2011).) Local Constructivists—even those such as Sidelle who think unconstructed entities have no “deep” modal properties—can provide an account of unconstructed entities which is coherent. There are a variety of ways to do this. (See, for example, Sidelle (1989), Goswick (2015, 2018a, 2018b), Remhof (2014).) Rather than presenting any one of them, here are a few general points which should enable the reader to see for herself that Local Constructivists about modal objects can provide a coherent view of unconstructed entities. The easiest way to see this is to note two things: (1) The Local Constructivist about modal objects does not think that every entity which has a modal property is constructed; they only think that objects which have “deep” modal properties are constructed. So, for example, arguments such as the following will not work: Let F denote some property that a purportedly unconstructed entity e has. Every entity that is actually F is possibly F. So, e is possibly F. Thus, e has a modal property—which contradicts the Local Constructivists’ claim that unconstructed entities do not have modal properties. But, of course, Local Constructivists are happy for unconstructed objects to have a plethora of modal properties, so long as they are “shallow” modal properties. (A “deep” modal property, remember, is any de re necessity or de re possibility which is non-trivial. A “shallow” modal property is any modal property which is not “deep.”) (2) Most of us have no trouble understanding Quine when he defines objects as “the material content of a region of spacetime, however heterogeneous or gerrymandered” (Quine 171). But, of course, Quine rejected “deep” modality. The Local Constructivist about modal objects can simply point to Quine’s view and use Quine’s objects as their unconstructed entities. (See Blackson (1992) and Goswick (2018c).)
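For clarity, the argument dismissed under point (1) above can be laid out explicitly (the symbolization is an illustrative reconstruction, not drawn from Goswick or the other authors discussed):

\[
\begin{aligned}
&1.\ \mathrm{F}(e) &&\text{premise: the purportedly unconstructed entity } e \text{ is actually } \mathrm{F}\\
&2.\ \forall x\,\big(\mathrm{F}(x) \rightarrow \Diamond\,\mathrm{F}(x)\big) &&\text{premise: whatever is actually the case is possibly the case}\\
&3.\ \Diamond\,\mathrm{F}(e) &&\text{from 1 and 2}
\end{aligned}
\]

The conclusion at step 3 ascribes to $e$ only a trivial de re possibility—a “shallow” modal property—so it does not touch the Local Constructivist’s claim that unconstructed entities lack “deep” modal properties.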
A final coherence criticism of Constructivism is the claim that Constructivism licenses the construction of incompatible objects, for example, society A constructs object o (which entails the non-existence of object o*), whilst society B constructs object o* (which entails the non-existence of object o). (Suppose, for example, that there are no coinciding objects, so at most one object occupies region r. Then, society A’s constructing a statue (at region r) rules out the existence of a mere-lump (at region r) and society B’s constructing a mere-lump (at region r) rules out the existence of a statue (at region r).) What, then, are we to say with regard to the existence of o and o*? Do both exist, neither, one but not the other? Boghossian puts the worry this way:
[How could] it be the case both that the world is flat (the fact constructed by pre-Aristotelian Greeks) and that it is round (the fact constructed by us)? [Constructivism faces] a problem about how we are to accommodate the possible simultaneous construction of logically incompatible facts. (Boghossian 39-40)
Different versions of Constructivism will have different responses to this worry, but every version is able to give a response that dissolves the worry. Relativists will say that o exists only relative to society A, whereas o* exists only relative to society B. Constructivists who are not relativists will pick some subject to privilege, for example, society A gets to do the constructing, so what they say goes—o exists and o* does not.
b. Substantive Criticisms
Now that Constructivism has been shown to satisfactorily respond to the coherence criticisms, let’s turn to presenting and evaluating the eight main substantive criticisms of Constructivism: (i) If Constructivism were true, then multiple systems of classification would be equally good, but they are not, (ii) Constructivism is under-motivated, (iii) Constructivism is incompatible with naturalism, (iv) Constructivism should be rejected outright because Realism is so obviously true, (v) Constructivism requires constitutive dependence, but really, insofar as objects do depend on us, they depend on us only causally, (vi) Constructivism is not appropriately constrained, (vii) Constructivism is crazy, and (viii) Constructivism conflicts with obvious empirical facts.
Consider, first, the criticism that if Constructivism were true, then multiple systems of classification would be equally good; but they are not, so Constructivism is not true. The main proponent of this criticism is Elder. He expresses the concern in the following way:
If there were something particularly … unobjective about sameness in natural kind, one might expect that we could prosper just as well as we do even if we wielded quite different sortals for nature’s kinds. (Elder 10)
The basic idea is that, as a matter of fact, dividing up the world into rocks and non-rocks works better for us than does dividing up the world into dry-rocks, wet-rocks, and non-rocks: the sortal rock is better than the alternative sortals dry-rock and wet-rock. Why is this? Elder’s explanation is that rock is a natural kind sortal which traces the existence of real objects. Dry-rock and wet-rock do not work as well as rock because there are rocks and there are not dry-rocks and wet-rocks. Since we cannot empirically distinguish between a rock that is (accidentally) dry and an (essentially dry) dry-rock or between a rock that is (accidentally) wet and an (essentially wet) wet-rock, Elder provides no empirical basis for his claim. The Constructivist will point out that she is not arguing that any set of constructed objects is as good as any other. It may very well be the case that rock works better for us than do dry-rock and wet-rock. The Constructivist attributes this to contingent facts about us (for example, our biology and social history) rather than to its being the case that Realism is true of rocks and false of dry-rocks and wet-rocks. Nothing Elder says blocks this way of describing the facts. Pending some argument showing that the only way (or, at least, the best way) we can explain the fact that rock works better for us than do dry-rock and wet-rock is if Realism is true of rocks, Elder has no argument against the Constructivist.
Another argument one sometimes hears is that Constructivism is undermotivated. Global Constructivism is seen as an overly radical metaphysical response to minor semantic and epistemic problems with Realism. (See, for example, Devitt (1997) and Wieland (2012).) How good a criticism this is depends on how minor the semantic and epistemic problems with Realism are and how available a non-metaphysical solution to them is. This issue is not explored further here because this sort of criticism cannot be evaluated in general but must be looked at with regard to each individual view, for example, is Goodman’s Global Constructivism undermotivated, is Sidelle’s Local Constructivism about modal objects undermotivated, is Thomasson’s Local Constructivism about artifacts undermotivated? Whether the criticism is convincing will depend on how well each view does at showing there is a real problem with Realism and that their own preferred way of resolving the problem is compelling. If Sidelle is really correct that the naturalist/empiricist stance most analytic philosophers embrace in the twenty-first century is incompatible with the existence of ordinary objects with “deep” modal properties, then we should be strongly motivated to seek a non-Realist account of ordinary objects. If Thomasson is really right that existence is easy and that some terms really are such that anything that satisfies them depends constitutively on humans, then we should be strongly motivated to seek a non-Realist account of the referents of such terms.
Another argument one sometimes hears is that Constructivism is incompatible with the naturalized metaphysics which is in vogue. Most contemporary metaphysicians are heavily influenced by Lewisian naturalized metaphysics: they believe that there is an objective reality, that science has been fairly successful in examining this reality, that the target of metaphysical inquiry is this objective reality, and that our metaphysical theorizing should be in line with what our best science tells us about reality. If Constructivism really is incompatible with naturalized metaphysics it will ipso facto be unattractive to most contemporary metaphysicians. However, although one frequently hears this criticism, upon closer examination it is seen to lack teeth. The crucial issue—with regard to compatibility with naturalistic metaphysics—is whether one’s view is adequately constrained by an independent, objective reality that is open to scientific investigation. All versions of Realism are so constrained, so Realism wears its compatibility with naturalistic metaphysics on its sleeve. Not all versions of Constructivism are so constrained, for example, Goodman and Putnam’s Global Constructivisms are not. But it would be overly hasty to throw out all of Constructivism simply because some versions of Constructivism are incompatible with naturalistic metaphysics. Some versions of Constructivism are more compatible with naturalized metaphysics than is Realism. Suppose Ladyman and Ross are correct when they say our best science shows there are no ordinary objects (2007). Suppose Einheuser is correct when she says our best science shows there are no objects with modal properties (2011). Suppose, however, that in daily human life we presuppose (as we seem to) the existence of ordinary objects with modal properties. Then, Local Constructivism about ordinary objects is motivated from within the perspective of naturalistic metaphysics. One’s naturalism prevents one from being a Realist about ordinary objects, because all the subject-independent world contains is ontic structure (if Ladyman and Ross are correct) or nonmodal stuff (if Einheuser is correct). One’s desire to account for human behavior prevents one from being an Eliminativist about ordinary objects. A constructivism which builds ordinary objects out of human responses to ontic structure/nonmodal stuff is the natural position to take. Although some versions of Constructivism (for example, Global Constructivism) may be incompatible with naturalistic metaphysics, there is no argument from naturalized metaphysics against Constructivism per se.
A fourth substantive criticism levied against Constructivism is that it should be rejected outright because Realism is so obviously true:
A certain knee-jerk realism is an unargued presupposition of this book. (Sider 2011, 18)
Realism is much more firmly based than these speculations that are thought to undermine it. We have started the argument in the wrong place: rather than using the speculations as evidence against Realism, we should use Realism as evidence against the speculations. We should “put metaphysics first.” (Devitt 2010, 109)
[Which] organisms and other natural objects there are is entirely independent of our beliefs about the world. If indeed there are trees, this is not because we believe in trees or because we have experiences as of trees. (Korman 92)
For example, facts about mountains, dinosaurs or electrons seem not to be description-dependent. Why should we think otherwise? What mistake in our ordinary, naive realism about the world has the [Constructivist] uncovered? What positive reason is there to take such a prima facie counterintuitive view seriously? (Boghossian 28)
All that the Constructivist can say in response to this criticism—which is not an argument against Constructivism but rather a sharing of the various authors’ inclinations—is that she does not think Realism is so obviously true. She can, perhaps, motivate others to see it as less obviously true by casting the debate not as a global choice between adopting a Global Constructivist or a Global Realist stance toward the world, but rather as a more local debate concerning the ontological status of, for example, tables, rocks, money, and dogs. We are no longer playing a global game; one can be an anti-Realist about, for example, money without thereby embracing global anti-Realism.
Another criticism of Constructivism is that Constructivism is only true if objects constitutively depend on us, but really, insofar as objects do depend on us, they depend on us only causally. As this article has defined “Constructivism,” it has room for both causal versions and constitutive versions. (Hacking (1999) and Goswick (2018b) present causal versions of Constructivism. Baker (2007) and Thomasson (2007) present constitutive versions of Constructivism.) One could, instead, define “Constructivism” more narrowly so that it only included constitutive accounts. This would be a mistake. Consider a (purported) causal version of Local Constructivism about modal objects: Jane is a Realist about nonmodal stuff and claims we have epistemic access to it. She thinks that when we respond to rock-appropriate nonmodal stuff s with the rock-response we bring a new object into existence: a rock. Jane does not think that rocks depend constitutively on us—it is not part of what it is to be a rock that we have to F in order for rocks to exist. But we do play a causal role in bringing about the existence of rocks. If there were some modal magic, then rocks could have existed without us (nothing about the nature of rocks bars this from being the case); but there is no modal magic, so all the rocks that exist do causally depend on us. Now consider a (purported) constitutive version of Local Constructivism about modal objects: James is a Realist about nonmodal stuff and claims we have epistemic access to it. He thinks that when we respond to rock-appropriate nonmodal stuff s with the rock-response we bring a new object into existence: a rock. James thinks that rocks depend constitutively on us—it is part of what it is to be a rock that we have to F in order for rocks to exist. Even if there were modal magic, rocks could not have existed without us. Do Jane and James’ views differ to the extent that one of them deserves the label “Constructivist” and the other does not? Their views are very similar—after all, they both take rocks to be composite objects which come to exist when we F in circumstances c, that is, they tell the same origin story for rocks. What they differ over is the nature of rocks: is their dependence on us constitutive of what it is to be a rock (as James says) or is it just a feature that all rocks in fact have (as Jane says)? Jane and James’ views are so similar (and the objections that will be levied against them are so similar) that taking both to be versions of the same general view (that is, Constructivism) is more perspicuous than not so doing. More generally, causal constructivism is similar enough to constitutive constructivism that defining “constructivism” in such a way that it excludes the former would be a mistake.
A sixth substantive criticism of Constructivism is that it is not appropriately constrained.
Putnam does talk, in a Kantian way, of the noumenal world and of things-in-themselves [but] he seems ultimately to regard this talk as “nonsense” … This avoids the facile relativism of anything goes by fiat: we simply are constrained, and that’s that. … [But to] say that our construction is constrained by something beyond reach of knowledge or reference is whistling in the dark. (Devitt 1997, 230)
The worry here is that it is not enough just to say “our constructing is constrained”; one must explain what does the constraining and how it does so. Global Constructivists have fared very poorly with regard to this criticism. They (for example, Goodman, Putnam, Schwartz) certainly intend their views to be so constrained. What is less clear, however, is whether they are able to accomplish this aim. They provide no satisfactory account of how, given that we have no epistemic access to them, the unconstructed entities they endorse are able to constrain our constructing. This is a serious mark against Global Constructivism. Local Constructivists fare better in this regard. They place a high premium on our constructing being constrained by the (subject-independent) world and each Local Constructivist is able to explain what constrains constructing on her view and how it does so. Baker, for example, argues that all constructed objects stand in a constitution chain which eventuates in an unconstructed aggregate. These aggregates constrain which artifacts can be in their constitution chains, namely (i) an artifact with function f can only be constituted by an aggregate which contains enough items of suitable structure to enable the proper function of the artifact to be performed, and (ii) an artifact with function f can only be constituted by an aggregate which is such that the items in the aggregate are available for assembly in a way suitable for enabling the proper function of the artifact to be performed (Baker 2007, 53). For another example, consider Einheuser’s explanation of what constrains her Local Constructivism about modal objects: Every (constructed) modal object coincides with some (unconstructed) nonmodal stuff. A modal object of sort s (for example, a rock) can only exist at region r if the nonmodal stuff that occupies region r has the right nonmodal properties (Einheuser 2011). This ensures that, for example, we cannot construct a rock at a region that contains only air molecules.
A seventh substantive criticism of Constructivism is the claim that Constructivism is crazy. Consider,
We should not close our eyes to the fact that Constructivism is prima facie absurd, a truly bizarre doctrine. … How could dinosaurs and stars be dependent on the activities of our minds? It would be crazy to claim that there were no dinosaurs or stars before there were people to think about them. [The claim that] there would not have been dinosaurs or stars if there had not been people (or similar thinkers) seems essential to Constructivism: unless it were so, dinosaurs and stars could not be dependent on us and our minds. [So Constructivism is crazy.] (Devitt 2010, 105 and Devitt 1997, 238)
The idea that we in any way determine whether there are stars and what they are like seems so preposterous, if not incomprehensible, that any thesis that leads to this conclusion must be suspect. … And a forceful, “But people don’t make stars” is often thought to be the simplest way to bring proponents of such metaphysical foolishness back to their senses. For isn’t it obvious that … there were stars long before sentient beings crawled about and longer still before the concept star was thought of or explicitly formulated? (Schwartz 1986, 429 and 427)
The “but Constructivism is crazy” locution is not a specific argument but is rather an expression of the utterer’s belief that Constructivism has gone wrong in some serious way. Arguments lie behind the “Constructivism is crazy” utterance and the arguments, unlike the emotive outburst, can be defused. Behind Devitt’s “it’s crazy” utterance is the worry that Constructivism simply gets the existence conditions for natural objects wrong. It is just obvious that dinosaurs and stars existed before any people did and it follows from this that they must be unconstructed objects. There are two ways to respond to this objection: (1) argue that even if humans construct dinosaurs and stars it can still be the case that dinosaurs and stars existed prior to the existence of humans. (For this approach, see Remhof, “If there had been no people there would still have been stars and dinosaurs; there would still have been things that would be constructed by humans were they around” (Remhof 2014, 3); Searle, “From the fact that a description can only be made relative to a set of linguistic categories, it does not follow that the objects described can only exist relative to a set of categories. … Once we have fixed the meaning of terms in our vocabulary by arbitrary definitions, it is no longer a matter of any kind of relativism or arbitrariness whether representation-independent features of the world that satisfy or fail to satisfy the definitions exist independently of those or any other definitions” (Searle 166); and Schwartz, “In the process of fashioning classificatory schemes and theoretical frameworks, we organize our world with a past, as well as a future, and provide for there being objects or states of affairs that predate us. Although these facts may be about distant earlier times, they are themselves retrospective facts, not readymade or built into the eternal order” (Schwartz 1986, 436).) (2) bite the bullet. Agree that—if Constructivism is true—dinosaurs and stars did not exist before there were any people. Defuse the counter-intuitiveness of this claim by, for example, arguing that, although dinosaurs per se did not exist, entities that were very dinosaur-like did exist. (For this approach, see Goswick (2018b):
The [Constructivist] attempts to mitigate this cost by pointing out that which ordinary object claims are false is systematic and explicable. In particular, we’ll get the existence and persistence conditions of ordinary objects wrong when we confuse the existence/persistence of an s-apt n-entity for the existence/persistence of an ordinary object of sort s. We think dinosaurs existed because we mistake the existence of dinosaur-apt n-entities for the existence of dinosaurs (Goswick 2018b, 58).
Behind Schwartz’s “Constructivism is crazy” utterance is the same worry Devitt has: namely, that Constructivism simply gets the existence conditions for natural objects wrong. It can be defused in the same way Devitt’s utterance was.
The final substantive criticism of Constructivism to be considered is the claim that Constructivism conflicts with obvious empirical facts.
It is sometimes said, for example, that were it not for the fact that we associated the word “star” with certain criteria of identity, there would be no stars. It seems to me that people who say such things are guilty of [violating well-established empirical facts]. Are we to swallow the claim that there were no stars around before humans arrived on the scene? Even the dimmest student of astronomy will tell you that this is non-sense. (Cortens 45)
This worry has largely been responded to in responding to the previous criticism. However, Cortens makes one point beyond that which Devitt and Schwartz make: namely, that it is not just our intuitions that tell us stars existed before humans, but also our best science. Any naturalist who endorses Constructivism about stars will be skeptical that our best science really tells us this. Even the brightest student of astronomy is unlikely to make the distinctions metaphysicians make, for example, between a star and the atoms that compose it. Does the astronomy student really study whether there are stars or only atoms-arranged-starwise? If not, how can she be in a position to tell us whether there were stars before there were humans or whether there were only atoms-arranged-starwise? The distinction between stars and atoms-arranged-starwise is not an empirical one. In general, the issues Constructivists and Realists differ over are not ones that can be resolved empirically. Given this, it is implausible that Constructivism conflicts with obvious empirical facts. It would conflict with an obvious empirical fact (or, at least, with what our best science takes to be the history of our solar system) if, for example, Constructivists denied that there was anything star-like before there were humans. But Constructivists do not do this; rather, they replace the Realists’ pre-human stars with entities which are empirically indistinguishable from stars but which lack some of the metaphysical features (for example, being essentially F) they think an entity must have to be a star.
5. Evaluating Constructivism within Analytic Metaphysics
Having explicated what Constructivism within analytic metaphysics is and what the central criticisms of it are, let’s examine what, all things considered, should be made of Constructivism within analytic metaphysics.
Global Constructivism is no longer a live option within analytic metaphysics. Our understanding of Realism, and our ability to clearly state various versions of it, has expanded dramatically since the 1980s. Realists have found answers to the epistemic and semantic concerns which originally motivated Global Constructivism, so the view is no longer well motivated. (See, for example, Devitt (1997) and Devitt (2010).) Moreover, there are compelling objections to Global Constructivism regarding, in particular, how we can construct entities if we have no epistemic access to any unconstructed entities to construct them from, and what can constrain our constructing, namely, given that we have epistemic access only to the constructed, it appears nothing unconstructed can constrain our constructing.
Local Constructivism fares better for reasons both sociological and philosophical. Sociologically, Local Constructivism has not been around for long and, rather than being one view, it is a whole series of loosely connected views, so it has not yet drawn the sort of detailed criticism that squashed Global Constructivism. Additionally, being a Local Constructivist about x is compatible with being a Realist about y, z, a, b, … (all non-x entities). As such, it is not a global competitor to Realism and has not drawn the Realists’ ire in the way Global Constructivism did. Philosophically, Local Constructivism is also on firmer ground than was Global Constructivism. By endorsing unconstructed entities which we have epistemic access to and which constrain our constructing, Local Constructivists are able to side-step many of the central criticisms which plague Global Constructivism. Local Constructivism looks well poised to provide an intuitive middle ground between a naturalistic Realism (which often unacceptably alters either the existence or the nature of the ordinary objects we take ourselves to know and love) and an overly subjective anti-Realism (which fails to recognize the role the objective world plays in determining our experiences and the insights we can gain from science).
6. Timeline of Constructivism in Analytic Metaphysics
1781: Kant’s Critique of Pure Reason distinguishes between noumena and phenomena, thereby laying the groundwork for future work on constructivism
1907: James’ Pragmatism: A New Name for Some Old Ways of Thinking defends Global Constructivism
1978-1993: Goodman and Putnam publish a series of books and papers defending Global Constructivism
1986 and 2000: Schwartz defends Global Constructivism
1990: Heller defends an eliminativist view of vague objects; along the way, he shows how to be a constructivist about vague objects
1990s-2000s: Baker, Thomasson, Searle, and Devitt endorse Local Constructivism about artifacts
Post 1988: Sidelle, Einheuser, and Goswick argue that objects having “deep” modal properties are constructed
2008: Kriegel argues that composite objects are constructed
2011: Varzi argues that objects with conventional boundaries are constructed
7. References and Further Reading
a. Constructivism: General
Alward, Peter. (2014) “Butter Knives and Screwdrivers: An Intentionalist Defense of Radical Constructivism,” The Journal of Aesthetics and Art Criticism, 72(3): 247-260.
Boyd, R. (1992) “Constructivism, Realism, and Philosophical Method” in Inference, Explanation, and Other Frustrations: Essays in the Philosophy of Science (ed. Earman). Los Angeles: University of California Press: 131-198.
Bridges and Palmgren. (2018) “Constructive Mathematics” in The Stanford Encyclopedia of Philosophy.
Chakravartty, Anjan. (2017) “Scientific Realism” in The Stanford Encyclopedia of Philosophy.
Downes, Stephen. (1998) “Constructivism” in the Routledge Encyclopedia of Philosophy.
Feyerabend, Paul. (2010) Against Method. USA: Verso Publishing.
Foucault, Michel. (1970) The Order of Things. USA: Random House.
Hacking, Ian. (1986) “Making Up People,” in Reconstructing Individualism: Autonomy, Individuality, and the Self in Western Thought (eds. Heller, Sosna, Wellbery). Stanford: Stanford University Press, 222-236.
Hacking, Ian. (1992) “World Making by Kind Making: Child-Abuse for Example,” in How Classification Works: Nelson Goodman among the Social Sciences (eds. Douglas and Hull). Edinburgh: Edinburgh University Press, 180-238.
Hacking, Ian. (1999) The Social Construction of What? Cambridge: Harvard University Press.
Haslanger, Sally. (1995) “Ontology and Social Construction,” Philosophical Topics, 23(2): 95-125.
Haslanger, Sally. (2003) “Social Construction: The ‘Debunking’ Project,” Socializing Metaphysics: The Nature of Social Reality (ed. Schmitt). Lanham: Rowman & Littlefield Publishers, 301-326.
Haslanger, Sally. (2012) Resisting Reality: Social Construction and Social Critique, New York: Oxford University Press.
Jezzi, Nathaniel. (2019) “Constructivism in Metaethics,” Internet Encyclopedia of Philosophy. https://iep.utm.edu/con-ethi/
Kuhn, Thomas. (1996) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Mallon, Ron. (2019) “Naturalistic Approaches to Social Construction” in The Stanford Encyclopedia of Philosophy.
Rawls, John. (1980) “Kantian Constructivism in Moral Theory,” Journal of Philosophy, 77: 515-572.
Remhof, J. (2017) “Defending Nietzsche’s Constructivism about Objects,” European Journal of Philosophy, 25(4): 1132-1158.
Street, Sharon. (2008) “Constructivism about Reasons,” Oxford Studies in Metaethics, 3: 207-245.
Street, Sharon. (2010) “What Is Constructivism in Ethics and Metaethics?” Philosophy Compass, 5(5): 363-384.
Werner, Konrad. (2015) “Towards a PL-Metaphysics of Perception: In Search of the Metaphysical Roots of Constructivism,” Constructivist Foundations, 11(1): 148-157.
b. Constructivism: Analytic Metaphysics
Baker, Lynne Rudder. (2004) “The Ontology of Artifacts,” Philosophical Explorations, 7: 99-111.
Baker, Lynne Rudder. (2007) The Metaphysics of Everyday Life: An Essay in Practical Realism. USA: Cambridge University Press.
Bennett, Karen. (2017) Making Things Up. Oxford: Oxford University Press.
Dummett, Michael. (1993) Frege: Philosophy of Language. Cambridge: Harvard University Press.
Einheuser, Iris. (2011) “Towards a Conceptualist Solution to the Grounding Problem,” Nous, 45(2): 300-314.
Evnine, Simon. (2016) Making Objects and Events: A Hylomorphic Theory of Artifacts, Actions, and Organisms. Oxford: Oxford University Press.
Goodman, Nelson. (1983) “Notes on the Well-Made World,” Erkenntnis, 19: 99-108.
Goodman, Nelson. (1978) Ways of Worldmaking. USA: Hackett Publishing Company.
Goodman, Nelson. (1993) “On Some Worldly Worries,” Synthese, 95(1): 9-12.
Goswick, Dana. (2015) “Why Being Necessary Really Isn’t the Same As Being Not Possibly Not,” Acta Analytica, 30(3): 267-274.
Goswick, Dana. (2018a) “A New Route to Avoiding Primitive Modal Facts,” Brute Facts (eds. Vintiadis and Mekios). Oxford: OUP, 97-112.
Goswick, Dana. (2018b) “The Hard Question for Hylomorphism,” Metaphysics, 1(1): 52-62.
Goswick, Dana. (2018c) “Ordinary Objects Are Nonmodal Objects,” Analysis and Metaphysics, 17: 22-37.
Goswick, Dana. (2019) “A Devitt-Proof Constructivism,” Analysis and Metaphysics, 18: 17-24.
Hale, Bob and Wright, Crispin. (2017) “Putnam’s Model-Theoretic Argument Against Metaphysical Realism” in A Companion to the Philosophy of Language (eds. Hale, Wright, and Miller). USA: Wiley-Blackwell, 703-733.
Heller, Mark. (1990) The Ontology of Physical Objects. Cambridge: CUP.
Irmak, Nurbay. (2019) “An Ontology of Words,” Erkenntnis, 84: 1139-1158.
James, William. (1907) Pragmatism: A New Name for Some Old Ways of Thinking. New York: Longmans Green Publishing (especially lectures 6 and 7).
James, William. (1909) The Meaning of Truth: A Sequel to Pragmatism. New York: Longmans Green Publishing.
Kant, Immanuel. (1965) The Critique of Pure Reason. London: St. Martin’s Press.
Kitcher, Philip. (2001) “The World As We Make It” in Science, Truth and Democracy. Oxford: Oxford University Press, ch. 4.
Korman, Daniel. (2019) “The Metaphysics of Establishments,” The Australasian Journal of Philosophy, DOI: 10.1080/00048402.2019.1622140.
Kriegel, Uriah. (2008) “Composition as a Secondary Quality,” Pacific Philosophical Quarterly, 89: 359-383.
Ladyman, James and Ross, Don. (2007) Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
Levinson, Jerrold. (1980) “What a Musical Work Is,” The Journal of Philosophy, 77(1): 5-28.
McCormick, Peter. (1996) Starmaking: Realism, Anti-Realism, and Irrealism. Cambridge: MIT Press.
Putnam, Hilary. (1979) “Reflections on Goodman’s Ways of Worldmaking,” Journal of Philosophy, 76: 603-618.
Putnam, Hilary. (1981) Reason, Truth, and History. Cambridge: Cambridge University Press.
Putnam, Hilary. (1982) “Why There Isn’t a Ready-Made World,” Synthese, 51: 141-168.
Putnam, Hilary. (1987) The Many Faces of Realism. LaSalle: Open Court Publishing.
Quine, W.V.O. (1960) Word and Object. Cambridge: MIT Press.
Remhof, J. (2014) “Object Constructivism and Unconstructed Objects,” Southwest Philosophy Review, 30(1): 177-186.
Rorty, Richard. (1972) “The World Well Lost,” The Journal of Philosophy, 69(19): 649-665.
Schwartz, Robert. (1986) “I’m Going to Make You a Star,” Midwest Studies in Philosophy, 11: 427-438.
Schwartz, Robert. (2000) “Starting from Scratch: Making Worlds,” Erkenntnis, 52: 151-159.
Searle, John. (1995) The Construction of Social Reality. USA: Free Press.
Sidelle, Alan. (1989) Necessity, Essence, and Individuation. Ithaca: Cornell University Press.
Thomasson, Amie. (1999) Fiction and Metaphysics. Cambridge: Cambridge University Press.
Thomasson, Amie. (2003) “Realism and Human Kinds,” Philosophy and Phenomenological Research, 67(3): 580-609.
The basic idea underlying a precautionary principle (PP) is often summarized as “better safe than sorry.” Even if it is uncertain whether an activity will lead to harm, for example, to the environment or to human health, measures should be taken to prevent harm. This demand is partly motivated by the consequences of regulatory practices of the past: often, chances of harm were disregarded because there was no scientific proof of a causal connection between an activity or substance and the harm in question, for example, between asbestos and lung disease. By the time such a connection was finally established, it was often too late to prevent severe damage.
However, it is highly controversial how the vague intuition behind “better safe than sorry” should be understood as a principle. As a consequence, we find a multitude of interpretations, ranging from decision rules and epistemic principles to procedural frameworks. To acknowledge this diversity, it makes sense to speak of precautionary principles (PPs) in the plural. PPs are not without critics: for example, it has been argued that they are paralyzing, unscientific, or promote a culture of irrational fear.
This article systematizes the different interpretations of PPs according to their functions, gives an overview of the main lines of argument in favor of PPs, and outlines the most frequent and important objections made to them.
1. The Idea of Precaution and Precautionary Principles
We can identify three main motivations behind the postulation of a PP. First, it stems from a deep dissatisfaction with how decisions were made in the past: Often, early warnings have been disregarded, leading to significant damage which could have been avoided by timely precautionary action (Harremoës and others 2001). This motivation for a PP rests on some sort of “inductive evidence” that we should reform (or maybe even replace) our current practices of risk regulation, demanding that uncertainty must not be a reason for inaction (John 2007).
Second, it expresses specific moral concerns, usually pertaining to the environment, human health, and/or future generations. This second motivation is often related to the call for sustainability and sustainable development: the point is not to destroy important resources for short-term gains, but to leave future generations with an intact environment.
Third, PPs are discussed as principles of rational choice under conditions of uncertainty and/or ignorance. Rational decision theory is well suited for situations in which we know the possible outcomes of our actions and can assign probabilities to them (a situation of “risk” in the decision-theoretic sense). The situation is different, however, under decision-theoretic uncertainty (where we know the possible outcomes, but cannot assign any probabilities to them, or at least no meaningful and precise ones) or decision-theoretic ignorance (where we do not know the complete set of possible outcomes). Although there are several suggestions for decision rules under these circumstances, it is far from clear what the most rational way to decide is when we lack important information and the stakes are high. PPs are one proposal to fill this gap.
Although they are often asserted individually, these motivations also complement each other: If, following the first motivation, uncertainty is not allowed to be a reason for inaction, then we need some guidance for how to decide under such circumstances, for example, in the form of a decision principle. And in many cases, it is the second motivation—concerns for the environment or human health—which makes the demand for precautionary action before obtaining scientific certainty especially pressing.
Many official documents cite the demand for precaution. One often-quoted example of a PP is Principle 15 of the Rio Declaration on Environment and Development, a result of the United Nations Conference on Environment and Development (UNCED) in 1992. It refers to a “precautionary approach”:
Rio PP—In order to protect the environment, the precautionary approach shall be widely applied by states according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation. (United Nations Conference on Environment and Development 1992, Principle 15)
Another prominent example is the formulation that resulted from the Wingspread Conference on the Precautionary Principle 1998, where around 35 scientists, lawyers, policy makers and environmentalists from the United States, Canada and Europe met to define a PP:
Wingspread PP—When an activity raises threats of harm to human health or the environment, precautionary measures should be taken even if some cause and effect relationships are not fully established scientifically. In this context the proponent of an activity, rather than the public, should bear the burden of proof. The process of applying the precautionary principle must be open, informed and democratic and must include potentially affected parties. It must also involve an examination of the full range of alternatives, including no action. (Science & Environmental Health Network (SEHN) 1998)
Both formulations are often cited as paradigmatic examples of PPs. Although they both mention uncertain threats and measures to prevent them, they also differ in important points, for example their strength: The Rio PP makes a weaker claim, stating that uncertainty is not a reason for inaction, whereas the Wingspread PP puts more emphasis on the fact that measures should be taken. They both give rise to a variety of questions: What counts as “serious or irreversible damage”? What does “(lack of) scientific certainty” mean? How plausible does a threat have to be in order to warrant precaution? What counts as precautionary measures? Additionally, PPs face many criticisms, like being too vague to be action-guiding, paralyzing the decision-process, or being anti-scientific and promoting a culture of irrational fear.
Thus, inspired by these regulatory principles in official documents, a lively debate has developed around how PPs should be interpreted in order to arrive at a version applicable in practical decision-making. This has resulted in a multitude of PP proposals that are formulated and defended (or criticized) in different theoretical and practical contexts. Most of the existing PP formulations share the elements of uncertainty, harm, and (precautionary) action. Different ways of spelling out these elements result in different PPs (Sandin 1999, Manson 2002). For example, they can vary in how serious a harm has to be in order to trigger precaution, or in how much evidence is needed. Additionally, PP interpretations differ with respect to the function they are intended to fulfill. They are typically classified, according to their function, into some combination of the following categories (Sandin 2007, 2009; Munthe 2011; Steel 2014):
Action-guiding principles tell us which course of action to choose given certain circumstances;
(sets of) epistemic principles tell us what we should reasonably believe under conditions of uncertainty;
procedural principles express requirements for decision-making, and tell us how we should choose a course of action.
These categories can overlap, for example, when action- or decision-guiding principles come with at least some indication of how they should be applied. Some interpretations explicitly aim at integrating the different functions, and warrant their own category:
Integrated PP interpretations: Approaches that integrate action-guiding, epistemic, and procedural elements associated with PPs. Consequently, they tell us which course of action should be chosen, through which procedure, and on what epistemic basis.
This article starts in Section 2 with an overview of different PP interpretations according to this functional categorization. Section 3 describes the main lines of arguments that have been presented in favor of PPs, and Section 4 presents the most frequent and most important objections that PPs face, along with possible rejoinders.
2. Interpretations of Precautionary Principles
a. Action-Guiding Interpretations
Action-guiding PPs are often seen as on a par with decision rules from rational decision theory. On the one hand, authors formalize PPs by using decision rules already established in decision theory, like maximin. On the other hand, they formulate new principles; while not necessarily located within the framework of decision theory, these are intended to work at the same level. Understood as principles of risk management, they are supposed to help determine a course of action given our knowledge and our values.
i. Decision Rules
The terms used for decision-theoretic categories of non-certainty differ. In this article, they are used as follows: Decision-theoretic risk denotes situations in which we know the possible outcomes of actions and can assign probabilities to them. Decision-theoretic uncertainty refers to situations in which we know the possible outcomes, but either no or only partial or imprecise probability information is available (Hansson 2005a, 27). When we don’t even know the full set of possible outcomes, we have a situation of decision-theoretic ignorance. When formulated as decision rules, the “(scientific) uncertainty” component of PPs is often spelled out as decision-theoretic uncertainty.
Maximin
The idea of operationalizing a PP with the maximin decision rule arose early in the debate and is therefore often associated with PPs (for example, Hansson 1997; Sunstein 2005b; Gardiner 2006; Aldred 2013).
In order to apply the maximin rule, we have to know the possible outcomes of our actions and be able to at least rank them on an ordinal scale (meaning that for any two outcomes, we can tell whether one is better than, worse than, or equally as good as the other). The rule then tells us to select the option with the best worst case, in order to “maximize the minimum”. Thus, the maximin rule seems like a promising candidate for a PP: it pays special attention to the prevention of threats, and is applicable under conditions of uncertainty. However, as has repeatedly been pointed out, maximin is not a plausible rule of choice in general. Consider the decision matrix in Table 1.
                 Scenario1   Scenario2
Alternative1         7           6
Alternative2        15           5

Table 1: Simplified decision matrix with two alternative courses of action.
Maximin selects Alternative1. This seems excessively risk-averse: the best case of Alternative2 is much better, and its worst case only slightly worse, as long as we assume (a) that the utilities in this example are cardinal utilities, and (b) that no relevant threshold is passed. If we knew that the probability of Scenario1 is 0.99 and the probability of Scenario2 only 0.01, it would arguably be absurd to apply maximin. Proponents of interpreting a PP with maximin have thus stressed that it needs to be qualified by additional criteria in order to provide a plausible PP interpretation.
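As a concrete illustration, the maximin computation can be written out in a few lines of Python. This is a minimal sketch: the matrix is that of Table 1, and the probability weighting at the end uses the hypothetical 0.99/0.01 figures just mentioned.

```python
# Maximin applied to the decision matrix of Table 1.
table = {
    "Alternative1": [7, 6],   # utilities under Scenario1, Scenario2
    "Alternative2": [15, 5],
}

def maximin(options):
    """Select the option whose worst outcome is best."""
    return max(options, key=lambda o: min(options[o]))

print(maximin(table))  # -> Alternative1 (worst case 6 beats worst case 5)

# With the probabilities 0.99 and 0.01 assumed above, expected utility
# instead strongly favors Alternative2:
probs = [0.99, 0.01]
expected = {o: sum(p * u for p, u in zip(probs, us)) for o, us in table.items()}
print(expected)  # -> {'Alternative1': 6.99, 'Alternative2': 14.9}
```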
The most prominent proponent of such a qualified maximin is Gardiner (2006), who draws on criteria suggested by Rawls to determine conditions under which the application of maximin is plausible:
Knowledge of likelihoods for the possible outcomes of the actions is impossible or at best extremely insecure;
the decision-makers care relatively little for potential gains that might be made above the minimum that can be guaranteed by the maximin approach;
the alternatives that will be rejected by maximin have unacceptable outcomes; and
the outcomes considered are in some adequate sense “realistic”, that is, only credible threats should be considered.
Condition (3) makes it clear that the guaranteed minimum (condition 2) needs to be acceptable to the decision-makers (see also Rawls 2001, 98). What it means that ‘gains above the guaranteed minimum are relatively little cared for’ (condition 2) has been spelled out by Aldred (2013) in terms of incommensurability between outcome values, that is, that some outcomes are so bad that they cannot be outweighed by potential gains. It is thus better to choose an option that promises only little gains but guarantees that the extremely bad outcome can’t materialize.
Gardiner argues that a maximin rule qualified by these criteria fits well with some core cases where we agree that precaution is necessary, and calls it the “Rawlsian Core Precautionary Principle (RCPP)”. He names the purchase of insurance as an everyday example where the RCPP fits well with our intuitive judgments and where precaution already seems justified on its own. According to Gardiner, it also fits well with often-named paradigmatic cases for precaution like climate change: The controversy over whether we should take precautions in the climate case is not a debate about the right interpretation of the RCPP, but about whether the conditions for its application are fulfilled—for example, which outcomes are unacceptable (Gardiner 2006, 56).
Minimax Regret
Another decision rule that is discussed in the context of PPs is the minimax regret rule. Whereas maximin selects the course of action with the best worst case, minimax regret selects the course of action with the lowest maximal regret. The regret of an outcome is calculated by subtracting its utility from the highest utility one could have achieved under the same state by selecting another course of action. This strategy tries to minimize one’s regret, in hindsight, for not having made the superior choice. Like the maximin rule, the minimax regret rule does not presuppose any probability information. However, while for the maximin rule it is enough if outcomes can be ranked on an ordinal scale, the minimax regret rule requires that we are able to assign cardinal utilities to the possible outcomes; otherwise, regret cannot be calculated.
Take the following example from Hansson (1997), in which a lake seems to be dying for reasons that we do not fully understand: “We can choose between adding substantial amounts of iron acetate, and doing nothing. There are three scientific opinions about the effects of adding iron acetate to the lake. According to opinion (1), the lake will be saved if iron acetate is added, otherwise not. According to opinion (2), the lake will self-repair anyhow, and the addition of iron acetate makes no difference. According to opinion (3), the lake will die whether iron acetate is added or not.” The consensus is that the addition of iron acetate will have certain negative effects on land animals that drink water from the lake, but that effect is less serious than the death of the lake. Assigning the value -12 to the death of the lake and -5 to the negative effects of iron acetate in the drinking water, we arrive at the utility matrix in Table 2.
                     (1)    (2)    (3)
Add iron acetate     -5     -5    -17
Do nothing          -12      0    -12

Table 2: Utility matrix for the dying-lake case.
We can then obtain the regret matrix by subtracting the utility of each outcome from the highest utility in its column; the result is Table 3. Minimax regret then selects the option of adding iron acetate to the lake (maximal regret 5, as against 7 for doing nothing).
                     (1)    (2)    (3)
Add iron acetate      0      5      5
Do nothing            7      0      0

Table 3: Regret matrix for the dying-lake case.
Chisholm and Clarke (1993) strongly support the minimax regret rule. They argue that it is better suited for a PP than maximin, since it gives some weight to foregone benefits. They also show that even if it is uncertain whether precautionary measures will be effective, minimax regret still recommends them as long as the expected damage from not implementing them is large enough. They advocate so-called “dual purpose” policies, where precautionary measures have other positive effects even if they do not fulfill their main purpose; one example is measures aimed at abating global climate change that at the same time have direct positive effects on local environmental problems. By contrast, Hansson (1997) argues that to take precautions means to avoid bad outcomes, and especially to avoid worst cases. Consequently, he defends maximin, not minimax regret, as the adequate PP interpretation. Maximin would, as Table 2 shows, select not adding iron acetate to the lake: according to Hansson, this is the precautionary choice, since adding iron acetate could lead to a worse outcome (-17) than not adding it (-12).
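The disagreement between the two rules in the dying-lake case can be reproduced mechanically. Here is a minimal Python sketch using the utilities of Table 2:

```python
# Utility matrix of Table 2: one row per action, one column per opinion.
utilities = {
    "add iron acetate": [-5, -5, -17],
    "do nothing":       [-12, 0, -12],
}

# Regret of an outcome = highest utility in its column minus its utility.
col_max = [max(col) for col in zip(*utilities.values())]
regret = {a: [m - u for m, u in zip(col_max, us)] for a, us in utilities.items()}
print(regret)  # reproduces Table 3: add -> [0, 5, 5], do nothing -> [7, 0, 0]

# Minimax regret: minimize the maximal regret -> "add iron acetate" (5 < 7).
print(min(regret, key=lambda a: max(regret[a])))

# Maximin: maximize the worst utility -> "do nothing" (-12 > -17), Hansson's choice.
print(max(utilities, key=lambda a: min(utilities[a])))
```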
ii. Context-Sensitive Principles
Other interpretations of PPs as action-guiding principles differ from stand-alone if-this-then-that decision rules. They stress that principles have to be interpreted and concretized depending on the specific context (Fisher 2002; Randall 2011).
A Virtue Principle
Sandin (2009) argues that one can reinterpret a PP as an action-guiding principle not by reference to decision theory, but by using cautiousness as a virtue. He formulates an action-guiding virtue principle of precaution (VPP):
VPP—Perform those, and only those, actions that a cautious agent would perform in the circumstances. (Sandin 2009, 98)
Although virtue principles are commonly criticized as not being action-guiding, Sandin argues that understanding a PP in this way actually makes it more action-guiding. “Cautious” is interpreted as a virtue term that refers to a property of an agent, like “courageous” or “honest”. Sandin states that it is often possible to identify what the virtuous agent would do, either because it is obvious or because at least some agreement can be reached. Even the uncertain cases the VPP deals with belong to classes of situations with which we have experience, for example, failed regulations of the past; we can therefore assess what the cautious agent would (not) have done and extrapolate from that to other cases (Sandin 2009, 99). According to Sandin, interpreting a PP as a virtue principle avoids both the objection of extremism and that of paralysis: it is unlikely that the virtuous agent will choose courses of action which will, in the long run, have overall negative effects, or which are self-refuting (like “ban activity a and do not ban activity a!”). However, even if one accepts that it makes sense to interpret “cautious” as a virtue, “the circumstances” under which one should choose the course of action that the cautious agent would choose are not specified in the VPP as Sandin formulates it. This makes it an incomplete proposal.
Reasonableness and Plausibility
Another important example is the PP interpretation of Resnik (2003, 2004), who defends a PP as an alternative to maximin and other strategies for decision-making in situations where we lack the type of empirical evidence needed for risk management based on probabilities obtained from risk assessment. His PP interpretation, which we can call the “reasonable measures precautionary principle” (RMPP), reads as follows:
RMPP—One should take reasonable measures to prevent or mitigate threats that are plausible and serious.
The seriousness of a threat relates to its potential for harm, as well as to whether or not the possible damage is seen as reversible (Resnik 2004, 289). Resnik emphasizes that reasonableness is a highly pragmatic and situation-specific concept. He names some criteria for reasonable responses, which are neither exhaustive nor individually necessary: they should be effective, proportional to the nature of the threat, take a realistic attitude toward the threat, be cost-effective, and be applied consistently (Resnik 2003, 341–42). Lastly, that threats have to be plausible means that there have to be scientific arguments for the plausibility of a hypothesis. These can be based on epistemic and/or pragmatic criteria, including, for example, coherence, explanatory power, analogy, precedence, precision, or simplicity. Resnik stresses that a threat being plausible is not the same as a threat being even minimally probable: we might accept as plausible threats that we take to be all but impossible to come to fruition (Resnik 2003, 341).
This shows that the question of when a threat should count as plausible enough to warrant precautionary measures is very important for the application of an action-guiding PP. Consequently, such PPs are often very sensitive to how a problem is framed. Some authors have taken these aspects—the weighing of evidence and the description of the decision problem—to be the central points of PPs, and have interpreted them as epistemic principles, that is, principles at the level of risk assessment.
b. Epistemic Interpretations
Authors who defend an epistemic PP interpretation accept that PPs are not principles that can guide our actions, but argue that this is neither a problem nor against their spirit. Instead of telling us how to act when facing uncertain threats of harm, they propose that PPs tell us something about how we should perceive these threats, and about what we should take as a basis for our actions, for example, by relaxing the standard for the amount of evidence required to take action.
i. Standards of Evidence
One interpretation of an epistemic PP is to give more weight to evidence suggesting a causal link between an activity and threats of serious and irreversible harm than one gives to evidence suggesting less dangerous, or beneficial, effects. This could mean to assign a higher probability for an effect to occur than one would in other circumstances based on the same evidence. Arguably, the underlying idea of this PP can be traced back to the German philosopher Hans Jonas, who proposed a “heuristic of fear”, that is, to give more weight to pessimistic forecasts than to optimistic ones (Jonas 2003). However, this PP interpretation has been criticized on the basis that it systematically discounts evidence pointing in one direction, but not in the other. This could lead to distorted beliefs about the world in the long run, being detrimental to our epistemic and scientific progress and eventually doing more harm than good (Harris and Holm 2002).
However, other authors point out that we might have to distinguish between “regulatory science” and “normal science”. Different epistemic standards are appropriate for the two contexts since they have different aims: In normal science, we are searching for truth; in regulatory science, we are primarily interested in reducing risk and avoiding harm (John 2010). Accordingly, Peterson (2007a) refers in his epistemic PP interpretation only to decision makers—not scientists—who find themselves in situations involving risk or uncertainty. He argues that in such cases, decision-makers should strive to acquire beliefs that are likely to protect human health, and that it is less important whether they are also likely to be true. One principle that has been promoted in order to capture this idea is the preference for false positives, that is, for type I errors over type II errors.
ii. Type I and Type II Errors
Is it worse to assert a relationship between two classes of events that does not in fact exist (a false positive), or to fail to assert such a relationship when it does exist (a false negative)? For example, would you prefer antivirus software that classifies a harmless program as a virus (a false positive), or one that misses a malicious program (a false negative)? Statistical hypothesis testing tests the so-called null hypothesis, the default view that there is no relationship between two classes of events, or groups. Rejecting a true null hypothesis is called a type I error, whereas failing to reject a false null hypothesis is a type II error. Which type of possible error should we try to minimize, if we cannot minimize both at once?
In (normal) science, it is considered more important not to admit false assertions into the body of knowledge, which would distort it in the long term. Thus, the default assumption—the null hypothesis—is that there is no connection between two classes of events, and typically statistical procedures are used that minimize type I errors (false positives), even if this means that an existing connection might be missed, at least at first or for a long time (John 2010). Falsely believing that an existing deterministic or probabilistic connection between two classes of events does not exist might slow down progress in normal science, which aims at truth. In regulatory contexts, however, it might be disastrous to believe falsely that a substance is safe when it is not. Consequently, a prominent interpretation of an epistemic PP takes it to entail a preference for type I errors over type II errors in regulatory contexts (see for example Lemons, Shrader-Frechette, and Cranor 1997; Peterson 2007a; John 2010).
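This trade-off can be illustrated with a small simulation. In the following sketch all numbers are invented for illustration: a true harmful effect of 0.3 standard deviations, samples of 30, and the standard one-sided critical values 1.645 (alpha = 0.05) and 0.842 (alpha = 0.20). Relaxing the evidential standard catches more real harms at the price of more false alarms.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 30, 10_000

def rejection_rate(true_effect, z_crit):
    """Fraction of simulated studies in which a one-sided z-test rejects the
    null hypothesis 'no harmful effect' (H0: mean 0, known unit variance)."""
    samples = rng.normal(true_effect, 1.0, size=(trials, n))
    z = samples.mean(axis=1) * np.sqrt(n)
    return (z > z_crit).mean()

# Strict standard (alpha = 0.05) vs. relaxed, precautionary standard (alpha = 0.20).
for alpha, z_crit in [(0.05, 1.645), (0.20, 0.842)]:
    false_positives = rejection_rate(0.0, z_crit)   # type I: harm asserted, none exists
    missed_harms = 1 - rejection_rate(0.3, z_crit)  # type II: real harm not detected
    print(f"alpha={alpha}: type I ~{false_positives:.2f}, type II ~{missed_harms:.2f}")
```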
Merely favoring one type of error over another might not be enough. It has been argued that the underlying methodology of either rejecting or accepting hypotheses does not sufficiently allow for identifying and tracking uncertainties. If a PP is understood as a principle that relaxes the standard for the amount of evidence required to take action, then a new epistemology might be needed: one that allows us to integrate the uncertainty about the causal connection between, for example, a drug and a harm into the decision (Osimani 2013).
iii. Precautionary Defaults
The use of precautionary regulatory defaults is one proposal for how to deal with having to make regulatory decisions in the face of insufficient information (Sandin and Hansson 2002; Sandin, Bengtsson, and others 2004). In regulatory contexts, there are often situations in which a decision has to be made on how to treat a potentially harmful substance that also has some (potential) benefits. Unlike in normal science, it is not possible to wait and collect further evidence before a verdict is reached: the substance has to be treated one way or another while further evidence is awaited. Thus, it has been suggested that we should use regulatory defaults, that is, assumptions that are used in the absence of adequate information and that should be replaced if such information is obtained. They should be precautionary defaults, building in special margins of safety in order to make sure that the environment and human health get sufficient protection. One example is the use of uncertainty factors in toxicology. Such uncertainty factors play a role in estimating reference doses acceptable for humans: a level of exposure found acceptable in animal experiments is divided by a number, usually 100 (Steel 2011, 356). This takes into account that there are significant uncertainties, for example, in extrapolating results from animals to humans. Such defaults are a way to handle uncertain threats; nevertheless, they should not be confused with actual judgments about what properties a particular substance has (Sandin, Bengtsson, and others 2004, 5). Consequently, an epistemic PP does not have to be understood as a belief-guiding principle, but as saying something about which methods for risk assessment are legitimate, for example, for quantifying uncertainties (Steel 2011). On this view, precautionary defaults like uncertainty factors in toxicology are methodological implications of a PP that allow it to be applied in a scientifically sound way while protecting human health and the environment.
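In toy form, such a default can be written as a single division. The function and its example dose are hypothetical; only the factor of 100 is the conventional one mentioned above.

```python
def reference_dose(animal_no_effect_level, uncertainty_factor=100):
    """Precautionary default: divide the exposure level found acceptable in
    animal experiments by a safety factor, pending better information."""
    return animal_no_effect_level / uncertainty_factor

# A dose of 50 (mg per kg body weight per day) deemed acceptable in animal
# studies yields a human reference dose of 0.5 under the default factor of 100.
print(reference_dose(50))  # -> 0.5
```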
Given this, it might be misleading to interpret a PP as a purely epistemic principle, if it is not guiding our beliefs but telling us what assumptions to accept, that is, to act as if certain things were true, as long as we do not have more information. Thus, it has been argued that a PP is better interpreted as a procedural requirement, or as a principle that imposes several such procedural requirements (Sandin 2007, 103–4).
c. Procedural Interpretations
It has been argued that we should shift our attention when interpreting PPs from the question of what action to take to the question of what is the best way to reach decisions.
i. Argumentative, or “Meta”-PPs
Argumentative PPs are procedural principles specifying what kinds of arguments are admissible in decision-making (Sandin, Peterson, and others 2002). They differ from prescriptive, or action-guiding, PPs in that they do not directly prescribe actions to be taken. Take Principle 15 of the Rio Declaration on Environment and Development. On one interpretation, it states that arguments for inaction based solely on the lack of full scientific certainty are not acceptable arguments in the decision-making procedure:
Rio PP—“In order to protect the environment, the precautionary approach shall be widely applied by states according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation.” (United Nations Conference on Environment and Development 1992, Principle 15)
Such an argumentative PP is seen as a meta-rule that places real constraints on what types of decision rules should be used: For example, by entailing that decision-procedures should be used that are applicable under conditions of uncertainty, it recommends against some of the traditional approaches in risk regulation like cost-benefit analysis (Steel 2014). Similarly, it has been proposed that the idea behind PPs is best interpreted as a general norm that demands a fundamental shift in our way of risk regulation, based on an obligation to learn from regulatory mistakes of the past (Whiteside 2006).
ii. Transformative Decision Rules
Similar to argumentative principles, an interpretation of a PP as a transformative decision rule doesn’t tell us which action should be taken, but it puts constraints on which actions can be considered as valid options. Informally, a transformative decision rule is defined as a decision rule that takes one decision problem as input, and yields a new decision problem as output (Sandin 2004, 7). For example, the following formulation of a PP as a transformative decision rule (TPP) has been proposed by Peterson (2003):
TPP—If there is a non-zero probability that the value of the outcome of an alternative act is very low, that is, below some constant c, then this act should be removed from the decision-maker’s list of options.
Thus, the TPP excludes courses of action that could lead, for example, to catastrophic outcomes from the options available to the decision maker. However, it does not tell us which of the remaining options should be chosen.
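Since a transformative decision rule maps one decision problem to another, the TPP is naturally written as a function. The following is a minimal sketch under our own representation of options as lists of (probability, value) pairs:

```python
def tpp(options, c):
    """Transform the decision problem: drop every act that has a non-zero
    probability of an outcome with a value below the threshold c."""
    return {act: outcomes for act, outcomes in options.items()
            if not any(p > 0 and u < c for p, u in outcomes)}

# Hypothetical example.
options = {
    "risky plant": [(0.99, 50), (0.01, -1000)],  # small chance of catastrophe
    "safe plant":  [(1.0, 30)],
}
print(tpp(options, c=-500))  # -> only "safe plant" remains as an option
```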
iii. Reversing the Burden of Proof
The requirement of a reversal of the burden of proof is one of the most prominent specific procedural requirements named in connection with PPs. For example, the influential statement from the Wingspread Conference on the Precautionary Principle (1998) declares that “the proponent of an activity, rather than the public, should bear the burden of proof.”
One common misconception is that the proponent of a potentially dangerous activity would have to prove with absolute certainty that the activity is safe. This gave rise to the objection that PPs are too demanding and would bring all progress to a halt (Harris and Holm 2002). However, the idea is rather that we have to change our approach to regulatory policy: proponents of an activity have to prove to a certain threshold that it is safe in order to employ it, instead of opponents having to prove to a certain threshold that it is harmful in order to ban it.
Thus, whether or not the situation is one in which the burden of proof is reversed depends on the status quo. Instead of speaking of shifting the burden of proof, it seems more sensible to ask what has to be proven, and who has to provide what kind of evidence for it. The important point that then remains to be clarified is what standards of proof are accepted.
An alternative proposal to shifting the burden of proof is that both sides (regulators or other opponents, and proponents of an activity) should share it (Arcuri 2007): if opponents want to regulate an activity, they should at least provide some evidence that the activity might lead to serious or irreversible harm, even if they lack the evidence to prove this with certainty. Proponents, on the other hand, should provide certain information about the activity in order to get it approved. Who has the burden of proof can play an important role in the production of information: if proponents have to show (to a certain standard) that their activity is safe, this generates an incentive to gather information about the activity, whereas in the other case—“safe until proven otherwise”—they might deliberately refrain from doing so (Arcuri 2007, 15).
iv. Procedures for Determining Precautionary Measures
Interpreted in a procedural way, a PP puts constraints on how a problem should be described or how a decision should be made. It does not dictate a specific decision or action. This is in line with one interpretation of what it means to be a principle as opposed to a rule. While rules specify precise consequences that follow automatically when certain conditions are met, principles are understood as guidelines whose interpretation will depend on specific contexts (Fisher 2002; Arcuri 2007).
Developing a procedural precautionary framework that integrates different procedural requirements is a way to enable the context-dependent specification and implementation of such a PP. One example is Tickner’s (2001) “precautionary assessment” framework, which consists of six steps that are supposed to guide decision-making as a heuristic device. The first five steps—(1) Problem Scoping, (2) Participant Analysis, (3) Burden/Responsibility Allocation Analysis, (4) Environment and Health Impact Analysis, and (5) Alternatives Assessment—serve to describe the problem, identify stakeholders, and assess possible consequences as well as available alternatives. In the final step, (6) Precautionary Action Analysis, the appropriate precautionary measure(s) are determined based on the results from the other steps. These decisions are not permanent, but should be part of a continuous process of increasing understanding and reducing overall impacts.
A big advantage of such procedural implementations of PPs is that the components are clarified on a case-by-case basis. This avoids an oversimplification of the decision process and takes the complexity of decisions under uncertainty into account. However, such implementations are criticized for losing the “principle” part of PPs: Sandin (2007), for example, argues that procedural requirements form a heterogeneous category, and that a procedural PP would soon dissolve beyond recognition because it is intermingled with other (rational, legal, moral, and so forth) principles and rules. In answer to this, some authors try to preserve the “principle” in PPs while also taking procedural as well as epistemic elements into account.
d. Integrated Interpretations
We can find two main strategies for formulating a PP that is still identifiable as an action-guiding principle while integrating procedural as well as epistemic considerations: either (1) developing particular principles that are specific to a certain context and accompanied by a procedural framework for this context; or (2) describing the structure and the main elements of a PP and naming criteria for adjusting those elements on a case-by-case basis.
i. Particular Principles for Specific Contexts
It has been argued that the general talk of “the” PP should be abandoned in favor of formulating distinct precautionary principles (Hartzell-Nichols 2013). This strategy aims to arrive at action-guiding and coherent principles by formulating particular PPs that apply to a narrow range of threats and express a specific obligation. One example is the “Catastrophic Harm PP (CHPP)” of Hartzell-Nichols (2012, 2017), which is restricted to catastrophic threats. It consists of eight conditions that specify when precautionary measures have to be taken, spelling out (a) what counts as a catastrophe, (b) the knowledge requirements for taking precaution, and (c) criteria for appropriate precautionary measures. The CHPP is accompanied by a “Catastrophic Precautionary Decision-Making Framework” which guides the assessment of threats in order to decide whether they meet the CHPP’s criteria, and guides decision-makers in determining what precautionary measures should be taken against a particular threat of catastrophe. This framework lists key considerations and steps that should be performed when applying the CHPP, for example, drawing on all available sources of information, assessing likelihoods of potential harmful outcomes under different scenarios, identifying all available courses of precautionary action and their effectiveness, and identifying specific actors who should be held responsible for taking the prescribed precautionary measures.
ii. An Adjustable Principle with Procedural Instructions
Identifying main elements of a PP and accompanying them with rules for adjusting them on a case-by-case basis is another strategy to preserve the idea of a precautionary principle while avoiding both inconsistency and vagueness. It has been shown that, as diverse as PP formulations are, they typically share the elements of uncertainty, harm, and (precautionary) action (Sandin 1999, Manson 2002). By explicating these concepts and, most importantly, by defining criteria for how they should be adjusted with respect to each other, some authors obtain a substantial PP that can be adjusted on a case-by-case basis without becoming arbitrary.
One example is the PP that Randall (2011) develops in the context of an in-depth analysis of traditional, or as he calls it, ordinary risk management (ORM). Randall identifies the following “general conceptual form of PP”:
If there is evidence stronger than E that an activity raises a threat more serious than T, we should invoke a remedy more potent than R.
Threat, T, is explicated as chance of harm, meaning that threats are assessed and compared according to their magnitude and likelihood. Our knowledge of outcomes and likelihoods is explicated with the concept of evidence, E, referring to uncertainty in the sense of our incomplete knowledge about the world. The precautionary response is conceptualized as remedy, R, which covers a wide range of responses, from averting the threat and remediating its damage to mitigating harm and adapting to changed conditions after other remedies have been exhausted. Remedies should fulfill a double function: (1) providing protection from a plausible threat, while at the same time (2) generating additional evidence about the nature of the threat and the effectiveness of various remedial actions. The main relation between the three elements is that the higher the likelihood that the remedy process will generate more evidence, the lower the threat standard and the evidence standard that should be required before invoking the remedy, even if we have concerns about its effectiveness (Randall 2011, 167).
Having clarified the concepts used in this evidence-threat-remedy (ETR) framework, Randall specifies them in order to formulate a PP that accounts for the weaknesses of ORM:
Credible scientific evidence of plausible threat of disproportionate and (mostly but not always) asymmetric harm calls for avoidance and remediation measures beyond those recommended by ordinary risk management. (Randall 2011, 186)
He then goes on to combine this PP and ORM into an integrated risk management framework. Randall stresses that a PP cannot determine the decision process on its own: as a moral principle, it has to be weighed against other moral, political, economic, and legal considerations. Thus, he also calls for the development of a procedural framework to ensure that the PP’s substantial normative commitments will be implemented on the ground (Randall 2011, 207).
Steel (2013, 2014) develops a comprehensive PP interpretation which is intended to be “a procedural requirement, a decision rule, and an epistemic rule” (Steel 2014, 10). Referring to the Rio Declaration, Steel argues that such a formulation of a PP states that our decision process should be structured differently, namely that decision rules should be used that can be applied in an informative way under uncertainty. However, he does not take this procedural element to be the whole PP, but interprets it as a “meta”-rule which guides the application and specification of the precautionary “tripod” of threat, uncertainty, and precautionary action. More specifically, Steel’s proposed PP consists of three core elements:
The Meta Precautionary Principle (MPP): Uncertainty must not be a reason for inaction in the face of serious threats.
The Precautionary Tripod: The elements that have to be specified in order to obtain an action-guiding version of the precautionary principle, namely: If there is a threat that meets the harm condition under a given knowledge condition, then a recommended precaution should be taken.
Proportionality: Demands that the elements of the Precautionary Tripod be adjusted proportionally to each other, understood as (a) Consistency: the recommended precaution must not be recommended against by the same PP version; and (b) Efficiency: among those precautionary measures that can be consistently recommended by a PP version, the least costly one should be chosen.
An application of this PP requires selecting what Steel calls a “relevant version of PP,” that is, a specific instance of the Precautionary Tripod that meets the constraints of both the MPP and Proportionality. To obtain such a version, Steel (2014, 30) proposes the following strategy: (1) select a desired safety target and define the harm condition as a failure to meet this target; (2) select the least stringent knowledge condition that results in a consistently applicable version of PP, given the harm condition. To comply with the MPP, uncertainty must neither render the PP version inapplicable nor lead to continual delay in taking measures to prevent harm.
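In toy form, Proportionality amounts to a filter (Consistency) followed by a minimization (Efficiency). The following sketch is ours, with hypothetical measures and costs, and it presupposes that every listed measure already meets the chosen safety target:

```python
def proportional_precaution(measures, recommended_against, cost):
    """Consistency: drop measures that the same PP version recommends against;
    Efficiency: among the remaining measures, pick the least costly one."""
    consistent = [m for m in measures if not recommended_against(m)]
    return min(consistent, key=cost) if consistent else None

# Hypothetical case: a total ban is itself a serious threat (say, to food
# supply), so the same PP version recommends against it.
measures = ["total ban", "phase-out", "labeling plus monitoring"]
costs = {"total ban": 100, "phase-out": 40, "labeling plus monitoring": 10}
print(proportional_precaution(measures,
                              recommended_against=lambda m: m == "total ban",
                              cost=costs.get))
# -> "labeling plus monitoring"
```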
Thus, Steel’s PP proposal guides decision-makers both in formulating the appropriate PP version as well as in its application. The process of formulating the particular version already deals with many questions like how evidence should be assessed, who has to prove what, to what kind of threats we should react, and what appropriate precautionary measures would be. Arguably, this PP can thereby be action-guiding, since it helps to select specific measures, without being a rigid prescriptive rule that is not suited for decisions under uncertainty.
Additionally, proposals like those of Randall and Steel have the advantage that they are not rigidly tied to a specific category of decision-theoretic non-certainty, that is, to decision-theoretic risk, uncertainty, or ignorance. They can be adjusted with respect to varying degrees of knowledge and available evidence, taking into account that we typically have some imprecise or vague sense of how likely various outcomes are, but not enough of a sense to assign meaningful precise probabilities to them. While such situations do not amount to decision-theoretic risk, they nonetheless include more information than what is often taken to be available under decision-theoretic uncertainty. Arguably, this corresponds better to the notion of “scientific uncertainty” than equating the latter with decision-theoretic uncertainty does (see Steel 2014, Chapter 4).
3. Justifications for Precautionary Principles
This section surveys different normative backgrounds that have been used to defend a PP. It starts by addressing arguments that can be located in the framework of practical rationality, before moving to substantial moral justifications for precautions.
a. Practical Rationality
When PPs are proposed as principles of practical rationality, they are typically seen as principles of risk regulation. This includes, but is not limited to, rational choice theory. When we examine the justifications for PPs in this context, we have to do so against the background of established risk-regulation practices. We can identify a rather standardized approach to the assessment and management of risks, which Randall (2011, 43) calls “ordinary risk management” (ORM).
i. Ordinary Risk Management
Although there are different understandings of ORM, we can identify a rather robust “core” of two main parts. First, a scientific risk assessment is conducted, in which potential outcomes are identified and their extent and likelihood estimated (compare Randall 2011, 43–46). Typically, risk assessment is understood as a quantitative endeavor that expresses numerical results (Zander 2010, 17). Second, on the basis of the data obtained from the risk assessment, the risk management phase takes place. Here, alternative regulatory courses of action in response to the scientifically estimated risks are discussed, and a choice is made between them. While the risk assessment phase should be as objective and value-free as possible, the decisions in the risk management phase should be, although informed by science, based on the values and interests of the parties involved. In ORM, cost-benefit analysis (CBA) is a powerful and widely used tool for making these decisions in the risk management phase. To conduct a CBA, the results from the risk assessment, that is, the outcomes that are possible under each course of action, are evaluated according to the willingness to pay (WTP) or willingness to accept compensation (WTA) of individuals in order to estimate the benefits and costs of the different courses of action. This means that non-economic values, like human lives or environmental preservation, are monetized in order to be comparable on a common ratio scale. Since we rarely if ever face cases of certainty, where each course of action has exactly one outcome that will materialize if we choose it, the utilities so reached are then probability-weighted and added up in order to arrive at the expected utility of the different courses of action. On this basis, it is possible to calculate which regulatory actions have the highest expected net benefits (Randall 2011, 47), that is, to apply the principle of maximizing expected utility (MEU) and to choose the option with the highest expected utility. CBA is seen as a tool that enables decision-makers to rationally compare costs and benefits, helping them to come to an informed decision (Zander 2010, 4).
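The final step of such a CBA, choosing by MEU over monetized outcomes, is mechanical. A minimal sketch with invented numbers:

```python
def expected_utility(prospect):
    """Probability-weighted sum of monetized outcome values."""
    return sum(p * u for p, u in prospect)

# Hypothetical regulatory choice; outcomes as (probability, net benefit) pairs.
actions = {
    "approve substance": [(0.95, 100), (0.05, -800)],  # small chance of damage
    "restrict use":      [(1.0, 40)],
}
values = {a: expected_utility(pr) for a, pr in actions.items()}
print(values, "->", max(values, key=values.get))
# approve: 0.95*100 + 0.05*(-800) = 55; restrict: 40 -> MEU picks "approve substance"
```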
In the context of ORM, we can distinguish two main lines of argumentation for PPs: On the one hand, authors argue that PPs are rational by trying to show that they gain support from ORM. On the other hand, authors argue that ORM itself is problematic in some aspects, and propose PPs as a supplement or alternative to it. In both cases, we find justifications for PPs as decision rules for risk management as well as principles that pertain to the risk assessment stage and are concerned with problem-framing (this includes epistemic and value-related questions).
ii. PPs in the Framework of Ordinary Risk Management
To begin, here are some ways in which people propose to locate and defend PPs within ORM.
Expected Utility Theory
Some authors claim that as long as we can assign probabilities to the various outcomes, that is, as long as we are in a situation of decision-theoretic risk, precaution is already “built into” ORM (Chisholm and Clarke 1993; Gardiner 2006; Sunstein 2007). The argument is roughly that no additional PP is necessary because expected utility theory, in combination with the assumption of decreasing marginal utility, allows for risk aversion by placing greater weight on the disutility of large damages. Not choosing options with possibly catastrophic outcomes, even if those outcomes have only a small probability, would thus be recommended by the principle of maximizing expected utility (MEU) as a consequence of their large disutility.
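A quick numerical sketch makes this built-in risk aversion visible. The wealth levels and the logarithmic utility function below are our own illustrative choices:

```python
import math

# Two prospects over final wealth: a sure 100, or a gamble that is better in
# expectation but carries a small chance of near-total loss.
sure = [(1.0, 100)]
gamble = [(0.99, 110), (0.01, 0.001)]

def expected_value(prospect):
    return sum(p * w for p, w in prospect)

def expected_log_utility(prospect):
    # Logarithmic utility: a standard model of diminishing marginal utility.
    return sum(p * math.log(w) for p, w in prospect)

print(expected_value(sure), expected_value(gamble))
# 100.0 vs ~108.9: a risk-neutral (linear) valuation favors the gamble.
print(expected_log_utility(sure), expected_log_utility(gamble))
# ~4.605 vs ~4.584: with diminishing marginal utility, MEU favors the sure option.
```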
This argumentation does not go unchallenged, as subsection 3.a.iii shows. Additionally, MEU itself is not uncontroversial (see Buchak 2013). Still, even if we accept it, we cannot use MEU under conditions of decision-theoretic uncertainty, since it relies on probability information. Consequently, authors have proposed PPs for decisions under uncertainty in order to fill this “gap” in the ORM framework. They argue that under decision-theoretic uncertainty it is rational to be risk-averse, and try to demonstrate this with arguments based on rational choice theory. However, it is not always clear whether the decision rule under discussion is meant to justify an independently formulated PP, or whether the decision rule is itself proposed as a PP.
Maximin and Minimax Regret
Both the maximin rule—selecting the course of action with the best worst case—and the minimax regret rule—selecting the course of action whose maximal regret across the possible scenarios is smallest—have been proposed and discussed as possible formalizations of a PP within the ORM framework. It has been argued that maximin captures the underlying intuition of PPs (namely, that the worst should be avoided) and that it yields rational decisions in relevant cases (Hansson 1997). Although the rationality of maximin is contested (Harsanyi 1975; Bognar 2011), it is argued that we can qualify it with criteria that single out the cases in which it can—and should—rationally be applied (Gardiner 2006). This is done by showing that a maximin rule so qualified fits paradigm cases of precaution and commonsense decisions that we make, and arguing that it is plausible to adopt it for further cases as well.
Chisholm and Clarke (1993) argue that the minimax regret rule leads to the prevention of uncertain harm in line with the basic idea of a PP, while also giving some weight to forgone benefits. Against minimax regret and in favor of maximin, Hansson (1997, 297) argues that, firstly, minimax regret presupposes more information, since we need to be able to assign numerical utilities to outcomes. Secondly, he uses a specific example to show that minimax regret and maximin can lead to conflicting recommendations. According to Hansson, the recommendation made by maximin expresses a higher degree of precaution.
Quasi-Option Value
Irreversible harm is mentioned in many PP formulations, for example in the Rio Declaration. One proposal for explaining why irreversibility justifies precaution refers to the concept of “(quasi-)option value” (Chisholm and Clarke 1993; Sunstein 2005a, 2009), first introduced by Arrow and Fisher (1974). They show that when regulators are confronted with decision problems in which they are (a) uncertain about the outcomes of the options, but (b) there are chances of resolving or reducing these uncertainties in the future, and (c) one or more of the options might entail irreversible outcomes, then they should attach an extra value, an option value, to the reversible options. This takes into account the value of the options that choosing an alternative with an irreversible outcome would foreclose. To illustrate this, think of the logging of (a part of) a rain forest: it is a very complex ecosystem, which we could use in many ways, but once it is clear-cut, it is almost impossible to restore to its original state. By choosing the option to cut it down, all options to use the rain forest in any other way would practically be lost forever. As Chisholm and Clarke (1993, 115) point out, irreversibility might sometimes be associated with not taking action now: not mitigating greenhouse gas (GHG) emissions means that more and more GHG accumulates in the atmosphere, where it stays for a century or more. They argue that introducing the concept of quasi-option value supports the application of a PP even if decision makers are not risk-averse.
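The Arrow-Fisher point can be seen in a stylized two-period calculation (all numbers invented): cutting the forest yields 10 per period and is irreversible; preserving yields 2 per period; and with probability 0.5 we learn after the first period that the intact forest is worth 30 per period.

```python
p_valuable = 0.5
cut, preserve, discovered = 10, 2, 30

# Irreversible choice: cut now and forgo any future discovery.
cut_now = cut + cut  # 20

# Flexible choice: preserve in period 1, then decide with what has been learned.
preserve_then_decide = preserve + (
    p_valuable * max(discovered, cut)        # keep the forest if it proves valuable
    + (1 - p_valuable) * max(preserve, cut)  # otherwise cut it after all
)  # 2 + 0.5*30 + 0.5*10 = 22

# Naive comparison that ignores learning: preserve forever vs. cut now.
preserve_forever = preserve + p_valuable * discovered + (1 - p_valuable) * preserve
# 2 + 15 + 1 = 18, so naive expected value favors cutting (20 > 18) ...

print(cut_now, preserve_then_decide, preserve_forever)  # 20 22 18
# ... while valuing flexibility correctly favors preserving; the gap 22 - 18 = 4
# is (one way of putting a number on) the quasi-option value of the reversible choice.
```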
iii. Reforming Ordinary Risk Management
Having reviewed attempts to justify a PP within the ORM framework without challenging the framework itself, let us now examine justifications for PPs that are partially based on criticisms of ORM.
Deficits of ORM
As a first point, it is argued that ORM as a regulatory practice tends toward oversimplifications that neglect uncertainty and imprecision, leading to irrational and harmful decisions. This is seen as a systematic deficit of ORM itself, not only of its users (see Randall 2011, 77), and not only as a problem under decision-theoretic uncertainty, that is, in situations where no reliable probabilities are available, but already under decision-theoretic risk. First, decision makers tend to ignore low probabilities as irrelevant, focusing on the “more realistic,” higher ones. This means that low but significant probabilities of catastrophe are ignored, for example, so-called “fat tails” in climate scenarios (Randall 2011, 77). Second, decision makers are often “myopic,” placing higher weight on current costs than on future benefits and avoiding high costs today. This often leads to even higher costs in the future. Third, disutilities might be calculated too optimistically, neglecting so-called “secondary effects” or “social amplifications,” for example, the psychological and social effects of catastrophes (see Sunstein 2007, 7). Lastly, because cost-benefit analysis (CBA) yields such clear-cut results, there is a tendency to apply it even when the conditions for its application are not fulfilled. We tend to assume more than we know, and to decide according to the MEU criterion although no reliable probability information and/or no precise utility information is available. This so-called “tuxedo fallacy” is considered dangerous because it creates an “illusion of control” (Hansson 2008, 426–27).
Since PPs are seen as principles that address exactly such problems—drawing our attention to unlikely but catastrophic possibilities, demanding action despite uncertainty, requiring us to consider the worst possible outcomes, and warning us not to assume more than we know—they gain indirect support from these arguments. ORM in its current form lures us into applying it incorrectly and into neglecting rational precautionary action. At least some sort of overarching PP that reminds us of correct practices seems necessary.
As a second point, it is argued that the regulatory practice of ORM not only has a “built-in” tendency to misapply its tools, but also has fundamental flaws of its own which should be corrected by a PP. Randall (2011, 46–70) criticizes risk assessment in ORM on the grounds that it is typically built on simple models of the threatened system, for example, the climate system. These models neglect systemic risks like the possibility of feedback effects or sudden regime shifts. Because it depends on the law of large numbers, ORM is also not a decision framework suitable for dealing with potential catastrophes, since they are singular events (Randall 2011, 52). Similarly, Chisholm and Clarke (1993, 112) argue that expected utility theory is only useful as long as “probabilities and possible outcomes are within the normal range of human experience.” Examples of such probabilities and outcomes within the normal range of human experience are insurance products like car and fire insurance: We have statistics about the probabilities of accidents and fires, and can calculate reasonable insurance premiums based on the law of large numbers. Furthermore, we have experience with how to handle such events, and have institutions in place like fire departments. None of this is true for singular events like anthropogenic climate change. Consequently, it is argued that we cannot simply leave ORM relatively unaltered and support it with a PP for decisions under uncertainty, perhaps together with a more general, overarching PP as a normative guideline. Instead, the demand is that we also reform the existing ORM framework itself in order to include precautionary elements.
Historical Arguments for Revising ORM
In the past, failures to take precautionary measures often resulted in substantial, widespread, and long-term harm to the environment and human health (Harremoës and others 2001; Gee and others 2013). This insight has been used to defend adopting a precautionary principle as a corrective to existing practices: For John (2007, 222), these past failures can be used as “inductive evidence” in an argument for reforming our regulatory policies. Whiteside (2006, 146) defends a PP as a product of social learning from past mistakes. According to Whiteside, these past mistakes reveal (a) that our knowledge about the influences of our actions on complex ecological systems is insufficient, and (b) that the way decisions were reached contributed importantly to their failure, leading to insufficient protection of the environment and human health. For Whiteside, the PP therefore generates a normative obligation to restructure our decision procedures (Whiteside 2006, 114). The most elaborate historical argument is made by Steel (2014, Chapter 5). Steel’s argument rests on the following premise:
If a systematic pattern of serious errors of a specific type has occurred, then a corrective for that type of error should be sought. (Steel 2014, 91)
By critically examining not only cases of failed precautions and harmful outcomes, but also alleged counter-examples of “excessive” precaution, Steel shows that such a pattern of serious errors in fact exists. Cases such as the ones described in “Late Lessons from Early Warnings” (Harremoës and others 2001) demonstrate that continuous delays in response to emerging threats have frequently led to serious and persistent harms. Steel (2014, 74–77) goes on to examine cases that have been cited as examples of excessive precaution. He finds that, in fact, often no regulation whatsoever was implemented in the first place. And in cases where regulations were put in place, they were mostly very restricted, had only minimal negative effects, and were relatively easily reversible. For example, one of the “excessive precautions” consisted in putting a warning label on products containing saccharin in the US. According to Steel (2014, 82), the historical argument thus supports a PP as a corrective against a systematic bias that is entrenched in our practices. This bias emerges because informational and political asymmetries make continual delays more likely than precautionary measures whenever short-term economic gains for an influential party are traded off against harms that are uncertain or distant in terms of space or time (or all three).
Epistemic Implications
The justifications presented so far all concern PPs aiming at the management of risks, that is, action-guiding interpretations. But we can also find discussions of a PP for the assessment of threats, so-called “epistemic” PPs. It is not enough to just supplement existing practices with a PP; clearly, risk assessment has to be changed, too, in order for a PP to be applicable. This means that uncertainties have to be taken seriously and communicated clearly, that we need to employ more adequate models which take into account the existence of systemic risks (Randall 2011, 77–78), that we need criteria to identify plausible (as opposed to “mere”) possibilities, and so on. However, this is more a question of the implications of adopting a PP than an expression of a genuine PP itself. Thus, these kinds of arguments either state presuppositions of a PP (we first need to identify uncertain harms in order to do something about them) or implications of a PP (it is not admissible to conduct a risk assessment in a way that makes it impossible to apply a PP).
Procedural Precaution
Authors who favor a procedural interpretation of PPs stress that PPs are concerned especially with decisions under conditions of uncertainty. They point out that while ORM, with its focus on cost-effectiveness and maximizing benefits, might be appropriate for conditions of decision-theoretic risk, the situation is fundamentally different if we have to make decisions under decision-theoretic uncertainty or even decision-theoretic ignorance. For example, Arcuri (2007, 20) points out that since PPs are principles particularly for decisions under decision-theoretic uncertainty, they cannot be prescriptive rules which tell us what the best course of action is—because the situation is essentially characterized by the fact that we are uncertain about the possible outcomes to which our actions can lead. Tickner (2001, 14) claims that this should lead to redirecting the questions that are asked in environmental decision-making: The focus should be moved from the hazards associated with a narrow range of options to solutions and opportunities. Thus, the assessment of alternatives is a central point of implementing PPs in procedural frameworks:
In the end, acceptance of a risk must be a function not only of hazard and exposure but also of uncertainty, magnitude of potential impacts and the availability of alternatives or preventive options. (Tickner 2001, 122)
Although (economic) efficiency should not be completely dismissed and still has its place in decision-making, proponents of a procedural PP maintain that we should shift our aim in risk regulation from maximizing benefits to minimizing threats, especially in the environmental domain where harms are often irreversible (compare Whiteside 2006, 75). They also advocate democratic participation, pointing out that a decision-making process under scientific uncertainty cannot be a purely scientific one (Whiteside 2006, 30–31; Arcuri 2007, 27). They thus see procedural interpretations of PPs as justified with respect to the goal of ensuring that decisions are made in a responsible and defensible way, which is especially important when there are substantial uncertainties about their outcomes.
Challenging the Underlying Value Assumptions
In addition to scientific uncertainty, Resnik (2003, 334) distinguishes another kind of uncertainty, which he calls “axiological uncertainty.” Both kinds make it difficult to implement ORM in making decisions. While scientific uncertainty arises from our lack of empirical evidence, axiological uncertainty concerns our value assumptions. This kind of uncertainty can take different forms: We can be unsure about how to measure utilities—in dollars lost/saved, lives lost/saved, species lost/saved, or something else? Further, we can be uncertain how to aggregate costs and benefits, and how to compare, for example, economic values with ecological ones. Values cannot always be measured on a common ordinal scale, much less on a common cardinal scale (as ORM requires, at least in versions that include the use of cost-benefit analysis). Thus, it is irrational to treat them as if they fulfilled this requirement (Thalos 2012, 176–77; Aldred 2013). This challenges the value assumptions underlying ORM, and is seen as a problem that should be fixed by a PP.
Additionally, authors like Hansson (2005b, 10) criticize the fact that costs and benefits are aggregated without regard to who bears them, and that person-related aspects, such as autonomy or whether a risk is willingly taken or imposed by others, are unjustly neglected.
To sum up, we can say that when the underlying value assumptions of ORM are challenged, the criticism pertains either to how values are estimated and assigned, or to the utilitarian decision criterion of maximizing overall expected utility. In both cases, we are arguably leaving the framework of rational choice and ORM, and moving toward genuine moral justifications for PPs.
b. Moral Justifications for Precaution
Some authors stress that, regardless of whether a PP is thought to supplement ordinary risk management (ORM) or whether it is a more substantive claim, a PP is essentially a moral principle, and has to be justified on explicitly moral grounds. (Note that depending on the moral position one holds, many of the considerations in 3.a can also be seen as discussions of PPs from a moral standpoint; most prominently utilitarianism, since ORM uses the rule of maximizing expected utility.) They argue that taking precautionary measures under uncertainty is morally demanded, because otherwise we risk damages that are in some way morally unacceptable.
i. Environmental Ethics
PPs are often associated with environmental ethics and the concept of sustainable development (O’Riordan and Jordan 1995; Kaiser 1997; Westra 1997; McKinney and Hill 2000; Steele 2006; Paterson 2007). Some authors take environmental preservation to be at the core of PPs. PP formulations such as the Rio or the Wingspread PP emerged in a debate about the necessity of preventing environmental degradation, which explains why many PPs highlight environmental concerns. It seems plausible that a PP can be an important part of a broader approach to environmental preservation and sustainability (Ahteensuu 2008, 47). But it seems difficult to justify a PP with recourse to sustainability, since the concept itself is vague and contested. Indeed, when PPs have been discussed in the context of sustainability, they are often proposed as ways to operationalize the vague concept into a principle for policymaking, along with other principles like the “polluter pays” principle (Dommen 1993; O’Riordan and Jordan 1995). Thus, while PPs are partly motivated by the insight that our way of life is not sustainable, and that we should change how we approach environmental issues, it is difficult to justify them solely on such grounds. However, the hope is that a clarification of the normative (moral) underpinnings of PPs will help to justify a PP for sustainable development. In the following, we will see that it might make sense to take special precautions with respect to ecological issues, not only because they are often complex and might entail unresolvable uncertainties (Randall 2011, 64–70), but also because harm to the environment can affect many other moral concerns, for example, human rights and both international and intergenerational justice. These moral issues might provide justifications for PPs on their own, without explicit reference to sustainability.
ii. Harm-Based Justifications
PPs that apply to governmental regulatory decisions have been defended as an extension of the harm principle. There are different versions of the harm principle, but roughly, it states that the government is justified in restricting citizens’ individual liberty only to avoid harm to others.
The application of the harm principle normally presupposes that certain conditions are fulfilled, for example, that the harms in question must be (1) involuntarily incurred, (2) sufficiently severe, and (3) probable, and that (4) the prescribed measures must be proportional to the harms (compare Jensen 2002; Petrenko and McArthur 2011). If these conditions are fulfilled, the prevention principle can be applied, prescribing proportional measures to prevent the harm in question from materializing. However, PPs apply to cases where we are unsure about the extent and/or the probability of a possible harm. Consequently, PPs are seen as a “clarifying amendment” (Jensen 2002, 44) which extends the normative foundation of the harm principle from prevention to precaution (Petrenko and McArthur 2011, 354): The impossibility of assigning probabilities does not negate the obligation to act, as long as possible harms are severe enough and scientifically plausible. Even for the prevention principle, it holds that the more severe a threat is, the less probable it has to be in order to warrant preventive measures. Thus, it has been argued that the probability of high-magnitude harms becomes almost irrelevant, as long as they are scientifically plausible (Petrenko and McArthur 2011, 354–55). Additionally, some harm is seen as so serious that it warrants special precaution, for example, if it is irreversible or cannot be (fully) compensated (Jensen 2002, 49–50). In such situations, the government is justified in restricting liberties by, for example, prohibiting a technology, even if there remains uncertainty about whether the technology would actually have harmful effects.
A related idea is that governments have an institutional obligation not to harm the population, which overrides the weaker obligation to do good—meaning that it is worse if certain regulatory decisions of the government lead to harm than if they lead to foregone benefits (John 2007).
The question of what exactly makes a threat severe enough to justify the implementation of precautionary measures has also been discussed with reference to justice- and rights-based considerations.
iii. Justice-Based Justifications
McKinnon (2009, 2012) presents two independent arguments for precautions, both of which are justice-based. These arguments are developed with respect to the possibility of a climate change catastrophe (CCC), and concern two alternative courses of action and their worst cases. “Unnecessary Expenditure” is the case of taking precautions which turn out to have been unnecessary, thereby wasting money which could have been spent on other, better purposes. “Methane Nightmare” is the case of not taking precautions, leading to a CCC whose catastrophic consequences make survival on earth very difficult, if not impossible. McKinnon argues that CCCs are uncertain in the sense that they are scientifically plausible even though we cannot assign probabilities to them (McKinnon 2009, 189).
Playing it Safe
McKinnon’s first argument for why uncertain yet plausible harm with the characteristics of CCCs justifies precautionary measures is called the “playing safe” argument. It is based on two Rawlsian commitments about justice (McKinnon 2012, 56): (1) that treating people as equals means (among other things) ensuring a distribution of (dis)advantage among them that makes the worst-off group as well off as possible, and (2) that justice is intergenerational in scope, governing relations across generations as well as within them.
McKinnon (2009, 191–92) argues that the distributive injustice would be so much greater if “Methane Nightmare” materialized than if “Unnecessary Expenditure” did that we have to choose to take precautionary measures, even though we do not know how probable “Methane Nightmare” is. That is to say, such a situation warrants the application of the maximin principle, because distributive justice in the sense of making the worst-off as well off as possible has lexical priority over maximizing the overall benefits for all. It would be inadmissible to choose an option that has a far better best case but whose worst case would lead to distributive injustice, over another option which might have a less good best case, but whose worst case does not entail such distributive injustices.
Unbearable Strains of Commitment
As McKinnon notes, the “playing safe” justification only holds if one accepts a very specific understanding of distributive (in)justice. However, she claims to have an even more fundamental argument for precautionary measures in this context, which is also based on Rawlsian arguments concerning intergenerational justice, but does not rely on a specific conception of distributive justice. It is called the “unbearable strains of commitment” argument and is based on a combination of the “just savings” principle for intergenerational justice with the “impartiality” principle. It states that we should not choose courses of action that impose on future generations conditions which we ourselves could not agree to and which would undermine the bare possibility of justice itself (McKinnon 2012, 61). This justifies taking precautions against CCCs, since the worst case of that option is “Unnecessary Expenditure,” which, in contrast to “Methane Nightmare,” would not lead to justice-jeopardizing consequences.
iv. Rights-Based Justifications
Strict precautionary measures concerning climate change have been demanded on the basis of the possible rights violations that such climate change might entail. For example, Caney (2009) claims that although other benefits and costs might be discounted, human rights are so fundamental that they must not be discounted. He argues that the possible harms involved in climate change justify precautions: Unmitigated climate change entails possible outcomes which would lead to serious or catastrophic rights violations, while a policy of strict mitigation would not involve a loss of human rights—at least not if it is carried out by the affluent members of the world. Additionally, “business as usual” by the affluent would mean gambling with the conditions of those who already lack fundamental rights protection, because the negative effects of climate change would come to bear especially in poor countries. Moreover, the benefits of running the risk of catastrophic climate change would accrue almost entirely to the risk-takers, not the risk-bearers (Caney 2009, 177–79). If we extrapolate from this concrete application, the basic justification for precaution seems to be: If a rights violation is plausibly possible, and there are ways to avoid this possibility by choosing another course of action which does not involve the plausible possibility of rights violations, then we have to choose the second option. It does not matter how likely the rights violations are; as long as they are plausible, we have to treat them as if they were certain to materialize.
Thus, in this interpretation, precaution means making sure that no rights violations happen, even if we (because of uncertainty) “run the risk” of doing more than what would have been necessary—as long as we do not have to jeopardize our own rights in order to do so.
v. Ethics of Risk and Risk Impositions
Some authors see the PP as an expression of a problem with what they call standard ethics (Hayenhjelm and Wolff 2012, e28). According to them, standard ethical theories, with their focus on evaluations of actions and their outcomes under conditions of certainty, fail to keep up with the challenges that technological development poses. PPs are then placed in the broader context of developing and defending an ethics of risk, that is, a moral theory about the permissibility of risk impositions. Surprisingly, so far there are few explicit connections between the discussion of the ethics of risk impositions (see for example Hansson 2013, Lenman 2008, Suikkanen 2019) and the discussion of PPs.
One exception is Munthe (2011), who argues that before we can formulate an acceptable and intelligible PP, we first need at least the basic structure of an ethical theory that deals directly with issues of creating and avoiding risks of harm. In Chapter 5 of his book, Munthe (2011) sets out to develop such a theory, which focuses on responsibility as a property of decisions: Decisions and risk impositions may be morally appraised in their own right. When one does not know what the outcome of a decision will be, it is important to make responsible decisions, that is, decisions that can still be defended as responsible given the information one had at the time the decision was made, even if the outcome turns out to be bad. However, even though Munthe’s discussion starts out from the PP, he ultimately concludes that we do not need a PP, but a policy that expresses a proper degree of precaution: “What is needed is plausible theoretical considerations that may guide decision makers also employing their own judgement in specific cases. We do not need a precautionary principle, we need a policy that expresses a proper degree of precaution.” Thus, the idea seems to be that while a fully developed ethics of risk will justify demands commonly associated with PPs, it will ultimately replace the need for a PP.
4. Main Objections and Possible Rejoinders
This section presents the most frequent and most important objections and challenges PPs face. They can be roughly divided into three groups. The first argues that there are fundamental conceptual problems with PPs which make them unable to guide our decisions. The second claims that PPs, in any reasonable interpretation, are superfluous and can be reduced to existing practices done right. The third rejects PPs as irrational, saying that they are based on unfounded fears and that they contradict science, leading to undesirable consequences. While some objections are aimed at specific PP proposals, others are intended as arguments against PPs in general. However, even the latter typically hold only for specific interpretations. This section briefly presents the main points of these criticisms, and then discusses how they might be answered.
a. PPs Cannot Guide Our Decisions
There are two main reasons why PPs are seen as unable to guide us in our decision-making: They are rejected either as incoherent, or as being vacuous and devoid of normative content.
Objection: PPs are incoherent
One frequent criticism, most prominently advanced by Sunstein (2005b), is that a “strong PP” leads to contradictory recommendations and therefore paralyzes our decision-making. He understands the “strong PP” as a very demanding principle which states that “regulation is required whenever there is a possible risk to health, safety, or the environment, even if the supporting evidence remains speculative and the economic costs of regulation are high” (Sunstein 2005b, 24). The problem is that every action poses such a possible risk, and thus both regulation and non-regulation would be prohibited by the “strong PP,” resulting in paralysis (Sunstein 2005b, 31). Hence, the “strong PP” is rejected as an incoherent decision rule, because it leads to contradictory recommendations.
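The structure of the paralysis charge can be made explicit with a deliberately crude sketch; this is a schematic rendering for illustration, not Sunstein’s own formalization, and the listed risks are invented:

```python
# Crude rendering of the "strong PP" (for illustration only, not Sunstein's
# own formalization): forbid any option that carries some possible risk.
def strong_pp_permits(option, possible_risks):
    return len(possible_risks[option]) == 0

possible_risks = {
    "regulate":   ["economic harm caused by the regulation"],
    "do_nothing": ["harm from the unregulated activity"],
}

permitted = [o for o in possible_risks if strong_pp_permits(o, possible_risks)]
print(permitted)  # [] -- every option is forbidden, so the rule gives no guidance
```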
Peterson (2006) makes another argument that rejects PPs as incoherent. He claims that he can prove formally as well as informally that every serious PP formulation is logically inconsistent with reasonable conditions of rational choice, and should therefore be given up as a decision-rule (Peterson 2006, 597).
Rejoinder
Both criticisms have been rejected as being based on a skewed PP interpretation. In the case of Sunstein’s argument, he is attacking a straw man. His critique of the “strong PP” as paralyzing relies on two assumptions which are not made explicit, namely (a) that a PP is invoked by any and all risks, and (b) that the risks of action and inaction are typically equally balanced (Randall 2011, 20). However, this is an atypical PP interpretation. Most formulations make explicit reference to severe dangers, meaning that not just any possible harm, no matter how small, will invoke a PP. And, as the case studies in Harremoës and others (2001) illustrate, the possible harms from action and inaction—or, more precisely, regulation or no regulation—are typically not equally balanced (see also Steel 2014, Chapter 9). Still, Sunstein’s critique calls attention to the important point of risk-risk trade-offs, which every sound interpretation and application of a PP has to take into account: Taking precautions against a possible harm should not lead to an overall higher level of threat (Randall 2011, 84–85). Nevertheless, there seems to be no reason why a PP should not be able to take this into account, and the argument thus fails as a general rejection of PPs.
Similarly, it can be contested whether Peterson’s (2006) PP formalization is a plausible PP candidate: He presupposes that we can completely enumerate the list of possible outcomes, that we have rational preferences that allow for a complete ordering of the outcomes, and that we can estimate at least the relative likelihood of the outcomes. As Randall (2011, 86) points out, this is an ideal setup for ordinary risk management (ORM), and the three conditions of rational choice that Peterson cites, and with which he shows his PP to be inconsistent, have their place in the ORM framework. Thus, one can object that it is not very surprising if a PP, which aims especially at situations in which the ideal conditions are not met, does not do very well under those ideal conditions.
Objection: PPs are vacuous
On the other hand, it is argued that if a PP is attenuated in order not to be paralyzing, it becomes such a weak claim that it is essentially vacuous. Sunstein (2005b, 18) claims that weaker formulations of PPs are, although not incoherent, trivial: They merely state that lack of absolute scientific proof is no reason for inaction, which, according to Sunstein, has no normative force because everyone is already complying with it. Similarly, McKinnon (2009) takes a weak PP formulation to state that precautionary measures are permissible, which she also rejects as a hollow claim, stating that everyone could comply with it without ever taking any precautionary action.
Additionally, PPs are rejected as vacuous because of the multitude of formulations and interpretations. Turner and Hartzell (2004), examining different formulations of PPs, come to the conclusion that they are all beset with unclarity and ambiguities. They argue that there is no common core of the different interpretations, and that the plausibility of a PP actually rests on its vagueness. This makes it unsuitable as a guide for decision-making. Similarly, Peterson (2007b, 306) states that such a “weak” PP has no normative content and no implications for what ought to be done. He claims that in order to have normative content, a PP would need to give us a precise instruction for what to do for each input of information (Peterson 2007b, 306). By formulating a minimal normative PP interpretation and showing that it is incoherent, he argues that there cannot be a PP with normative content.
Rejoinder
First, let us address the criticism that PPs are vacuous because they express a claim that is too weak to have any impact on decision-making. Against this, Steel (2013, 2014) has argued that even if these supposedly “weak” or “argumentative” principles do not directly recommend a specific decision, they nonetheless have an impact on the decision-making process if taken seriously. He interprets them as a meta-principle that puts constraints on what decision rules should be used, namely, none that would lead to inaction in the face of uncertainty. Since, for example, cost-benefit analysis needs numerical probabilities to be applicable, the Meta PP will recommend against it in situations where no such probability information is available. This is a substantial constraint, meaning that the Meta PP is not vacuous. One can also reasonably doubt that Sunstein is right that everyone follows such an allegedly “weak” principle anyway. There are many historical cases where there was some positive evidence that an activity caused harm, but the fact that the activity-harm link had not been irrefutably proven was used to argue against regulatory action (Harremoës and others 2001; Gee and others 2013). Thus, in cases where no proof, or at least no reliable probability information, concerning the possibility of harm is available, uncertainty is often used as a reason not to take precautionary action. Additionally, this criticism clearly does not concern all forms of PPs, and only amounts to a full-fledged rejection of PPs if combined with the claim that so-called “stronger” PPs, that is, those which are not trivial, will always be incoherent. And both Sunstein (2005b) and McKinnon (2009, 2012) do propose other PPs which express a stronger claim, albeit with a restricted scope (for example, pertaining only to catastrophic harm, or to damage which entails specific kinds of injustice). This form of the “vacuous” objection can thus be seen not as an attack on the general idea of PPs, but rather as the demand that the normative obligation they express should be made clear in order to avoid downplaying it.
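Steel’s meta-principle can be glossed as a filter on decision rules; the following sketch is an informal rendering of that idea, not Steel’s own formalization, and the rule descriptions are invented for illustration:

```python
# Informal rendering of a meta-principle in Steel's spirit (not his own
# formalization): exclude decision rules whose informational requirements
# cannot be met, since applying them would stall action under uncertainty.
decision_rules = {
    "cost_benefit_analysis": {"needs_probabilities": True},
    "maximin":               {"needs_probabilities": False},
}

def admissible(rule, have_probabilities):
    return have_probabilities or not decision_rules[rule]["needs_probabilities"]

# Under decision-theoretic uncertainty, no reliable probabilities are available:
print([r for r in decision_rules if admissible(r, have_probabilities=False)])
# ['maximin'] -- ruling out CBA here is a substantive, non-vacuous constraint
```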
Let us now consider the other form of the objection, namely the claim that PPs are essentially vague and that there cannot be a precise formulation of a PP that is both action-guiding and plausible. It is true that, so far, there does not seem to exist a “one size fits all” PP that yields clear instructions for every input and that captures all the ideas commonly associated with PPs. However, even if this were a correct interpretation of what a “principle” is (which many authors deny, compare for example Randall 2011, 97), it is not the only one. Peterson (2007b) presumes that only a strict “if this, then that” rule can have normative force and consequently be action-guiding. In contrast, other authors stress the difference between a principle and a rule (Fisher 2002; Arcuri 2007; Randall 2011). According to them, while rules specify precise consequences that follow automatically when certain conditions are met, principles express normative obligations that need to be specified according to different contexts, and that need to be implemented and operationalized in rules, laws, policies, and so on (Randall 2011, 97). Authors who reject PPs as incoherent (see the objection above) might sometimes make the same mistake, confusing a general principle that needs to be specified on a case-by-case basis with a stand-alone decision rule that should fit any and all cases.
As for PPs being essentially vague: This criticism seems to presuppose that in order to formulate a clarified PP, we have to capture and unify everything that is associated with it. However, explicating a concept in a way that clarifies it and captures as many of the associated ideas as possible does not require preserving all of them. The same is true for explicating a principle such as a PP. Additionally, this article shows that many different precise interpretations of PPs are possible, and not all of them exclude one another.
b. PPs are Redundant
Some authors reject PPs by arguing that they are just a narrow and complicated way of expressing what is already incorporated into established, more comprehensive approaches. For example, Bognar (2011) compares Gardiner’s (2006) “Rawlsian Core PP” interpretation (RCPP) with what he calls a “utilitarian principle,” which consists of a combination of the principle of indifference and that of maximizing expected utility. He concludes that this “utilitarian principle” leads to the same results as the RCPP in the cases where the RCPP applies, but, unlike the RCPP, it is not restricted to such a narrow range of cases. His conclusion is that we can dispose of PPs, at least in maximin formulations (Bognar 2011, 345).
In the same vein, Peterson (2006, 600) asserts that if formulated in a consistent way, a PP would not be different from the “old” rules for risk-averse decision-making, while other authors have shown that we can use existing ordinary risk management (ORM) tools to implement a PP (Farrow 2004; Gollier, Moldovanu, and Ellingsen 2001). This would allegedly make PPs redundant (Randall 2011, 25; 87).
Rejoinder
Particularly against the criticism of Bognar (2011), one can counter that his “utilitarian principle” falls victim to the so-called “tuxedo fallacy” (Hansson 2008). Using the principle of indifference, that is, treating all outcomes as equally probable when one does not have enough information to assign reliable probabilities, creates an “illusion of control.” The principle neither pays special attention to catastrophic harms, nor does it take the special challenges of decision-theoretic uncertainty adequately into account.
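The worry can be made vivid with invented numbers; the following sketch contrasts the “utilitarian principle” (indifference plus MEU) with maximin on a hypothetical payoff table:

```python
# Invented payoff table contrasting indifference-plus-MEU with maximin.
outcomes = {
    "business_as_usual": [60, -20],  # second entry: the possible harm
    "precaution":        [20, 10],
}

def indifference_meu(table):
    # Treat all outcomes of an option as equally probable; maximize the mean.
    return max(table, key=lambda o: sum(table[o]) / len(table[o]))

def maximin(table):
    return max(table, key=lambda o: min(table[o]))

print(indifference_meu(outcomes))  # business_as_usual (mean 20 versus 15)
print(maximin(outcomes))           # precaution (worst case 10 versus -20)
# If the possible harm were catastrophic (say -1000 instead of -20), the
# indifference-based verdict would flip to precaution, even though nothing
# about our evidence concerning the harm's probability has changed.
```

The point of the illustration is that the indifference-based verdict hangs entirely on probability weights that were simply stipulated, which is exactly the “illusion of control” that the tuxedo fallacy objection targets.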
More generally, one can make the following point: Even though there might be plausible ways to translate a PP into the ORM framework and implement it using ORM tools, there is more to a PP than that. Even if we use ORM methods to implement precaution, in the end this might still be based on a normative obligation to enact precautionary measures. This obligation has to be spelled out, because ORM can allow for precaution, but does not demand it in itself (and, as a regulatory practice, tends to neglect it).
c. PPs are Irrational
The last line of criticism accuses PPs of being based on unfounded fears, expressing cognitive biases, and therefore leading to decisions with undesirable and overall harmful consequences.
Objection: Unfounded Panic
One criticism that is especially frequent in discussions aimed at a broader audience is that PPs lead to unrestrained regulation because they can be invoked by uncertain harm. Therefore, the argument goes, PPs carry the danger of unnecessary expenditures to reduce insignificant risks and of forgone benefits from regulating or prohibiting potentially beneficial activities, and they are prone to being exploited, for example, by interest groups or for protectionism in international trade (Peterson 2006). A PP would stifle innovation, resulting in an overall less safe society: Many (risk-reducing) beneficial innovations of the past were only possible because risks had been taken (Zander 2010, 9), and technical innovation takes place in a process of trial and error, which would be seriously disturbed by a PP (Graham 2004, 5).
These critics see this as a consequence of PPs because PPs do not require scientific certainty in order to take action, which the critics interpret as making merely speculative harm a reason for strict regulation. Thus, science would be marginalized or even rejected as a basis for decision-making, giving way to the cognitive biases of ordinary people.
Objection: Cognitive biases
Sunstein claims that PPs are based on cognitive biases of ordinary people, who tend to systematically misassess risks (Sunstein 2005b, Chapter 4). By reducing the importance of scientific risk assessment and marginalizing the role of experts, the criticism goes, decisions resulting from the application of a PP will be influenced by these biases and result in negative consequences.
Rejoinder
As has been pointed out by Randall (2011, 89), these criticisms seem to be misguided. Lower standards of evidence do not mean no standards at all. It is surely an important challenge for the implementation of a PP to find a way to define plausible possibilities, but this by no means requires less science. Instead, as Sandin, Bengtsson, and others (2004) point out, more, and different, scientific approaches are needed. Uncertainties need to be communicated more clearly, and tools need to be developed that take uncertainties better into account. For decisions where we lack scientific information but great harms are possible, ways need to be found for how public concerns can be taken into consideration (Arcuri 2007, 35). This, however, seems to be a question of implementation rather than of the formulation or the justification of a PP.
5. References and Further Reading
Ahteensuu, Marko. 2008. “In Dubio Pro Natura? A Philosophical Analysis of the Precautionary Principle in Environmental and Health Risk Governance.” PhD thesis, Turku, Finland: University of Turku.
Arcuri, Alessandra. 2007. “The Case for a Procedural Version of the Precautionary Principle Erring on the Side of Environmental Preservation.” SSRN Scholarly Paper ID 967779. Rochester, NY: Social Science Research Network.
Arrow, Kenneth J., and Anthony C. Fisher. 1974. “Environmental Preservation, Uncertainty, and Irreversibility.” The Quarterly Journal of Economics 88 (2): 312–19.
Buchak, Lara. 2013. Risk and Rationality. Oxford University Press.
Bognar, Greg. 2011. “Can the Maximin Principle Serve as a Basis for Climate Change Policy?” Edited by Sherwood J. B. Sugden. Monist 94 (3): 329–48. https://doi.org/10.5840/monist201194317.
Chisholm, Anthony Hewlings, and Harry R. Clarke. 1993. “Natural Resource Management and the Precautionary Principle.” In Fair Principles for Sustainable Development: Essays on Environmental Policy and Developing Countries, edited by Edward Dommen, 109–22.
Dommen, Edward (Ed.). 1993. Fair Principles for Sustainable Development: Essays on Environmental Policy and Developing Countries. Edward Elgar.
Farrow, Scott. 2004. “Using Risk Assessment, Benefit-Cost Analysis, and Real Options to Implement a Precautionary Principle.” Risk Analysis 24 (3): 727–35.
Fisher, Elizabeth. 2002. “Precaution, Precaution Everywhere: Developing a Common Understanding of the Precautionary Principle in the European Community.” Maastricht Journal of European and Comparative Law 9: 7.
Gardiner, Stephen M. 2006. “A Core Precautionary Principle.” Journal of Political Philosophy 14 (1): 33–60.
Gee, David, Philippe Grandjean, Steffen Foss Hansen, Sybille van den Hove, Malcolm MacGarvin, Jock Martin, Gitte Nielsen, David Quist and David Stanners. 2013. Late lessons from early warnings: Science, precaution, innovation. European Environment Agency.
Gollier, Christian, Benny Moldovanu, and Tore Ellingsen. 2001. “Should We Beware of the Precautionary Principle?” Economic Policy, 303–27.
Graham, John D. 2004. The Perils of the Precautionary Principle: Lessons from the American and European Experience. Vol. 818. Heritage Foundation.
Hansson, Sven Ove. 1997. “The Limits of Precaution.” Foundations of Science 2 (2): 293–306.
Hansson, Sven Ove. 2005a. Decision Theory: A Brief Introduction, Uppsala University class notes.
Hansson, Sven Ove. 2005b. “Seven Myths of Risk.” Risk Management 7 (2): 7–17.
Hansson, Sven Ove. 2013. The Ethics of Risk: Ethical Analysis in an Uncertain World. Palgrave Macmillan.
Harremoës, Poul, David Gee, Malcolm MacGarvin, Andy Stirling, Jane Keys, Brian Wynne, and Sofia Guedes Vaz. 2001. Late Lessons from Early Warnings: The Precautionary Principle 1896-2000. Office for Official Publications of the European Communities.
Harris, John, and Søren Holm. 2002. “Extending Human Lifespan and the Precautionary Paradox.” Journal of Medicine and Philosophy 27 (3): 355–68.
Harsanyi, John C. 1975. “Can the Maximin Principle Serve as a Basis for Morality? A Critique of John Rawls’s Theory.” Edited by John Rawls. The American Political Science Review 69 (2): 594–606. https://doi.org/10.2307/1959090.
Hartzell-Nichols, Lauren. 2013. “From ‘the’ Precautionary Principle to Precautionary Principles.” Ethics, Policy and Environment 16 (3): 308–20.
Hartzell-Nichols, Lauren. 2017. A Climate of Risk: Precautionary Principles, Catastrophes, and Climate Change. Taylor & Francis.
Hayenhjelm, Madeleine, and Jonathan Wolff. 2012. “The Moral Problem of Risk Impositions: A Survey of the Literature.” European Journal of Philosophy 20 (S1): E26–E51.
Jensen, Karsten K. 2002. “The Moral Foundation of the Precautionary Principle.” Journal of Agricultural and Environmental Ethics 15 (1): 39–55. https://doi.org/10.1023/A:1013818230213.
John, Stephen. 2007. “How to Take Deontological Concerns Seriously in Risk–Cost–Benefit Analysis: A Re-Interpretation of the Precautionary Principle.” Journal of Medical Ethics 33 (4): 221–24.
John, Stephen. 2010. “In Defence of Bad Science and Irrational Policies: An Alternative Account of the Precautionary Principle.” Ethical Theory and Moral Practice 13 (1): 3–18.
Jonas, Hans. 2003. Das Prinzip Verantwortung: Versuch Einer Ethik Für Die Technologische Zivilisation. 5th ed. Frankfurt am Main: Suhrkamp Verlag.
Kaiser, Matthias. 1997. “Fish-Farming and the Precautionary Principle: Context and Values in Environmental Science for Policy.” Foundations of Science 2 (2): 307–41.
Lemons, John, Kristin Shrader-Frechette, and Carl Cranor. 1997. “The Precautionary Principle: Scientific Uncertainty and Type I and Type II Errors.” Foundations of Science 2 (2): 207–36.
Manson, Neil A. 2002. “Formulating the Precautionary Principle.” Environmental Ethics 24 (3): 263–74.
McKinney, William J., and H. Hammer Hill. 2000. “Of Sustainability and Precaution: The Logical, Epistemological, and Moral Problems of the Precautionary Principle and Their Implications for Sustainable Development.” Ethics and the Environment 5 (1): 77–87.
McKinnon, Catriona. 2009. “Runaway Climate Change: A Justice-Based Case for Precautions.” Journal of Social Philosophy 40 (2): 187–203.
McKinnon, Catriona. 2012. Climate Change and Future Justice: Precaution, Compensation and Triage. Routledge.
Munthe, Christian. 2011. The Price of Precaution and the Ethics of Risk. Vol. 6. The International Library of Ethics, Law and Technology. Springer.
O’Riordan, Timothy, and Andrew Jordan. 1995. “The Precautionary Principle in Contemporary Environmental Politics.” Environmental Values 4 (3): 191–212.
Osimani, Barbara. 2013. “An Epistemic Analysis of the Precautionary Principle.” Dilemata: International Journal of Applied Ethics, 149–67.
Paterson, John. 2007. “Sustainable Development, Sustainable Decisions and the Precautionary Principle.” Natural Hazards 42 (3): 515–28. https://doi.org/10.1007/s11069-006-9071-4.
Peterson, Martin. 2006. “The Precautionary Principle Is Incoherent.” Risk Analysis 26 (3): 595–601.
Peterson, Martin. 2007a. “Should the Precautionary Principle Guide Our Actions or Our Beliefs?” Journal of Medical Ethics 33 (1): 5–10. https://doi.org/10.1136/jme.2005.015495.
Peterson, Martin. 2007b. “The Precautionary Principle Should Not Be Used as a Basis for Decision‐making.” EMBO Reports 8 (4): 305–8. https://doi.org/10.1038/sj.embor.7400947.
Petrenko, Anton, and Dan McArthur. 2011. “High-Stakes Gambling with Unknown Outcomes: Justifying the Precautionary Principle.” Journal of Social Philosophy 42 (4): 346–62.
Randall, Alan. 2011. Risk and Precaution. Cambridge University Press.
Rawls, John. 2001. Justice as fairness: A restatement. Belknap, Harvard University Press.
Resnik, David B. 2003. “Is the Precautionary Principle Unscientific?” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 34 (2): 329–44.
Resnik, David B. 2004. “The Precautionary Principle and Medical Decision Making.” Journal of Medicine and Philosophy 29 (3): 281–99.
Sandin, Per. 1999. “Dimensions of the Precautionary Principle.” Human and Ecological Risk Assessment: An International Journal 5 (5): 889–907.
Sandin, Per. 2004. “Better Safe Than Sorry: Applying Philosophical Methods to the Debate on Risk and the Precautionary Principle.” PhD thesis, Stockholm.
Sandin, Per. 2007. “Common-Sense Precaution and Varieties of the Precautionary Principle.” In Risk: Philosophical Perspectives, edited by Tim Lewens, 99–112. London; New York.
Sandin, Per. 2009. “A New Virtue-Based Understanding of the Precautionary Principle.” Ethics of Protocells: Moral and Social Implications of Creating Life in the Laboratory, 88–104.
Sandin, Per, Bengt-Erik Bengtsson, Ake Bergman, Ingvar Brandt, Lennart Dencker, Per Eriksson, Lars Förlin, and others. 2004. “Precautionary Defaults—A New Strategy for Chemical Risk Management.” Human and Ecological Risk Assessment 10 (1): 1–18.
Sandin, Per, and Sven Ove Hansson. 2002. “The Default Value Approach to the Precautionary Principle.” Human and Ecological Risk Assessment: An International Journal 8 (3): 463–71. https://doi.org/10.1080/10807030290879772.
Sandin, Per, Martin Peterson, Sven Ove Hansson, Christina Rudén, and André Juthe. 2002. “Five Charges Against the Precautionary Principle.” Journal of Risk Research 5 (4): 287–99.
Science & Environmental Health Network (SEHN). 1998. Wingspread Statement on the Precautionary Principle.
Steel, Daniel. 2011. “Extrapolation, Uncertainty Factors, and the Precautionary Principle.” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 42 (3): 356–64.
Steel, Daniel. 2013. “The Precautionary Principle and the Dilemma Objection.” Ethics, Policy & Environment 16 (3): 321–40.
Steel, Daniel. 2014. Philosophy and the Precautionary Principle. Cambridge University Press.
Steele, Katie. 2006. “The Precautionary Principle: A New Approach to Public Decision-Making?” Law, Probability and Risk 5 (1): 19–31. https://doi.org/10.1093/lpr/mgl010.
Suikkanen, Jussi. 2019. “Ex Ante and Ex Post Contractualism: A Synthesis.” The Journal of Ethics 23 (1): 77–98. https://doi.org/10/ggjn22.
Sunstein, Cass R. 2005a. “Irreversible and Catastrophic.” Cornell Law Review 91: 841–97.
Sunstein, Cass R. 2007. “The Catastrophic Harm Precautionary Principle.” Issues in Legal Scholarship 6 (3).
Sunstein, Cass R. 2009. Worst-Case Scenarios. Harvard University Press.
Sunstein, Cass R. 2005b. Laws of Fear: Beyond the Precautionary Principle. Cambridge University Press.
Thalos, Mariam. 2012. “Precaution Has Its Reasons.” In Topics in Contemporary Philosophy 9: The Environment, Philosophy, Science and Ethics, edited by W. Kabasenche, M. O’Rourke, and M. Slater, 171–84. Cambridge, MA: MIT Press.
Tickner, Joel A. 2001. “Precautionary Assessment: A Framework for Integrating Science, Uncertainty, and Preventive Public Policy.” In The Role of Precaution in Chemicals Policy, edited by Elisabeth Freytag, Thomas Jakl, G. Loibl, and M. Wittmann, 113–27. Diplomatische Akademie Wien.
Turner, Derek, and Lauren Hartzell. 2004. “The Lack of Clarity in the Precautionary Principle.” Environmental Values 13 (4): 449–60.
United Nations Conference on Environment and Development. 1992. Rio Declaration on Environment and Development.
Westra, Laura. 1997. “Post-Normal Science, the Precautionary Principle and the Ethics of Integrity.” Foundations of Science 2 (2): 237–62.
Whiteside, Kerry H. 2006. Precautionary Politics: Principle and Practice in Confronting Environmental Risk. Cambridge, MA: MIT Press.
Zander, Joakim. 2010. The Application of the Precautionary Principle in Practice: Comparative Dimensions. Cambridge: Cambridge University Press.
Research for this article was part of the project “Reflective Equilibrium – Reconception and Application” (Swiss National Science Foundation grant no. 150251).
The term “conspiracy theory” refers to a theory or explanation that features a conspiracy among a group of agents as a central ingredient. Popular examples are the theory that the first moon landing was a hoax staged by NASA, or the theory that the 9/11 attacks on the World Trade Center were not (exclusively) conducted by al-Qaeda, but that the US government conspired to let these attacks succeed. Conspiracy theories have long been an element of popular culture; and cultural theorists, sociologists and psychologists have had things to say about conspiracy theories and the people who believe in them. This article focuses on the philosophy of conspiracy theories, that is, on what philosophers have had to say about conspiracy theories. Conspiracy theories meet philosophy when it comes to questions concerning epistemology, science, society and ethics.
After giving a brief history of philosophical thinking about conspiracy theories in section 1, this article considers in more detail the definition of the term “conspiracy theory” in section 2. As it turns out, the definition of the term has received a lot of attention in philosophy, mainly because the common usage of the term has negative connotations (as in, “It’s just a conspiracy theory!”), raising the question whether our definition should reflect these. As there is a great variety of conspiracy theories on offer, section 3 considers ways of classifying conspiracy theories into distinct types. Such a classification may be useful when it comes to identifying possible problems with a conspiracy theory.
The main part of this article, section 4, is devoted to the question when one should believe in a conspiracy theory. In general, the philosophical literature has been more positive about conspiracy theories than other fields, being careful not to dismiss such theories too easily. Hence, it becomes important to come up with criteria that one may use to evaluate a given conspiracy theory. Section 4 provides such a list of criteria, distilled from the philosophical literature.
Turning from questions about belief to questions about society, ethics and politics, section 5 addresses the societal effects of conspiracy theories that philosophers have identified, also asking to what extent these are positive or negative. Given these effects, the last question this article addresses, in section 6, is what, if anything, we should do about conspiracy theories. Answering this question does not, of course, depend on philosophical thinking alone. For this reason, section 7 briefly mentions some relevant work outside of philosophy.
1. History of Philosophizing about Conspiracy Theories
Philosophical thinking about conspiracies can be traced back at least as far as Niccolo Machiavelli. Machiavelli discussed conspiracies in his most well-known work, The Prince (for example in chapter 19), but more extensively in his Discourses on the First Ten Books of Titus Livius, where he devotes the whole sixth chapter of the third book to a discussion of conspiracies. Machiavelli’s aim in his discussion of conspiracies is to help the ruler guard against conspiracies directed against him. At the same time, he warns subjects not to engage in conspiracies, partly because he believes these rarely achieve what they desire.
Where Machiavelli discussed conspiracies as a political reality, Karl Raimund Popper is the philosopher who put conspiracy theories on the philosophical agenda. The philosophical discussion of conspiracy theories begins with Popper’s dismissal of what he calls “the conspiracy theory of society” (Popper, 1966 and 1972). Popper sees the conspiracy theory of society as a mistaken approach to the explanation of social phenomena: It attempts to explain a social phenomenon by discovering people who have planned and conspired to bring the phenomenon about. While Popper thinks that conspiracies do occur, he thinks that few conspiracies are ultimately successful, since few things turn out exactly as intended. It is precisely the unforeseen consequences of intentional human action that social science should explain, according to Popper.
Popper’s comments on the conspiracy theory of society comprised only a few pages, and they did not trigger critical discussion until many years later. It was only in 1995 that Charles Pigden critically examined Popper’s views (Pigden, 1995). Besides Pigden’s critique of Popper, it was Brian Keeley (1999) and his attempt at defining what he called “unwarranted conspiracy theories” that started the philosophical literature on conspiracy theories. The question raised by Keeley’s paper is essentially the demarcation problem for conspiracy theories: Just as Popper’s demarcation problem was to separate science from pseudoscience, the problem Keeley raised was to separate, within the realm of conspiracy theories, warranted from unwarranted ones. However, Keeley concluded that the problem is a difficult one, admitting that the five criteria he proposed were not adequate for specifying when we are (un)warranted in believing a conspiracy theory. This article returns to this problem in section 4.
After Popper’s work in the late 1960s and early 1970s, and Pigden’s and Keeley’s in the 1990s, philosophical work on conspiracy theories took off in the first decade of the 21st century. Particularly important in this development is the collection of essays by Coady (2006a), which made the philosophical debate about conspiracy theories visible to a wider audience, as well as within philosophy. Since this collection of essays, philosophical thinking has been continuously evolving, as evidenced by special issues of Episteme (volume 4, issue 2, 2007), Critical Review (volume 28, issue 1, 2016), and Argumenta (volume 3, issue 2, 2018).
Looking at the history of philosophizing about conspiracy theories, a useful distinction that has been applied to philosophers writing about conspiracy theories is the distinction between generalists and particularists (Buenting and Taylor, 2010). Following in the footsteps of Popper, generalists believe that conspiracy theories in general have an epistemic problem. For them, there is something about a theory being a conspiracy theory that should lower its credibility. It is this kind of generalism which underlies the popular dismissal, “It’s just a conspiracy theory.” Particularists like Pigden, on the other hand, argue that there is nothing problematic about conspiracy theories per se, but that each conspiracy theory needs to be evaluated on its own (de)merits.
2. Problems of Definition
The definition of the term “conspiracy theory” given at the beginning of this article is neutral in the sense that it does not imply that a conspiracy theory is wrong or unlikely to be true. In popular discourse, however, an epistemic deficit is often implied. Tracking this popular use, the Wikipedia entry on the topic (consulted 26 July 2019) defined a conspiracy theory as “an explanation of an event or situation that invokes a conspiracy by sinister and powerful actors, often political in motivation, when other explanations are more probable.”
We can order possible definitions of the term “conspiracy theory” in terms of logical strength. The definition given at the beginning of this article is minimal in this sense; it says that a conspiracy theory is a theory that involves a conspiracy. Slightly more elaborate, but still in line with this weak notion of conspiracy theory, Keeley (1999, p.116) sees a conspiracy theory as an explanation of an event by the causal agency of a small group of people acting in secret. What Keeley has added to the minimal definition is that the group of conspirators is small. Other additions that have been considered are that the group is powerful and/or that it has nefarious intentions. While these additions create a stronger notion of conspiracy theory, they all remain epistemically neutral; that is, they do not state that the explanation is unlikely or otherwise problematic. On the other end of the logical spectrum, definitions like the Wikipedia definition cited above are not only logically stronger than the minimal definition—the conspirators are powerful and sinister—but are also epistemically laden: A conspiracy theory is unlikely.
Within this spectrum of possibilities, philosophers have generally opted for a rather minimal definition that is epistemically neutral. As explicated by Dentith (2016, p.577), the central ingredients of a conspiracy are (a) a group of conspirators, (b) secrecy and (c) a shared goal. Similarly separating out the different ingredients of a conspiracy theory, Mandik (2007, p.206) states that conspiracy theories postulate “(1) explanations of (2) historical events in terms of (3) intentional states of multiple agents (the conspirators) who, among other things, (4) intended the historical events in question to occur and (5) keep their intentions and actions secret.” He sees these five conditions as necessary conditions for being a conspiracy theory, but he remains agnostic as to whether they are jointly sufficient.
A second approach to defining conspiracy theories has been proposed by Coady (2006b, p.2). He sees conspiracy theories as explanations that are opposed to the official explanation of an event at a given time. Coady points out that explanations that are conspiracy theories in this sense are usually also conspiracy theories in the sense discussed earlier, but not vice versa, since official theories can also refer to conspiracies, for example the official account of 9/11. Often, according to Coady, an explanation will be a conspiracy theory in both senses.
Which definition to adopt—strong or weak, epistemically neutral or not—is ultimately a question of what purpose the definition is to serve. No matter what definition one chooses, such a choice will have consequences. As an example, Watergate will not count as a conspiracy theory under the Wikipedia definition, but it will under the minimal definition given at the beginning of this article. Furthermore, this minimal definition of conspiracy theories will have as a consequence that an explanation of a surprise party will be considered a conspiracy theory. Hence, to be put to use, the minimal definition may need to be supplemented by an extra condition like nefariousness.
Finally, besides using the term “conspiracy theory,” some authors also use the term “conspiracism.” This latter term has been used in different ways in the literature. Pipes (1997) has used the term to indicate a particular paranoid style of thinking. Muirhead and Rosenblum (2019) have used it to describe an evolving phenomenon of political culture, distinguishing classic conspiracism from new conspiracism. While classic conspiracism involves the development of conspiracy theories as alternative explanations of phenomena, new conspiracism has shed the interest in explanation and theory building. Instead, it is satisfied with bare assertion or insinuation of a conspiracy and aims at political delegitimation and destabilization.
3. Types of Conspiracy Theories
Conspiracy theories come in great variety, and typologies can help to order this variety and to further guide research to a particular type of conspiracy theory that is particularly interesting or problematic. Räikkä (2009a, p.186 and 2009b, p.458-9) distinguishes political from non-political conspiracy theories. Räikkä mentions conspiracy theories about the death of Jim Morrison or Elvis Presley as examples of non-political conspiracy theories. He furthermore divides political conspiracy theories into local, global and total conspiracy theories depending on the scale of the event to be explained.
Huneman and Vorms (2018, p.251) provide further useful categories for distinguishing different types of conspiracy theories. They distinguish scientific from non-scientific conspiracy theories, depending on whether the theory deals with the domain of science, like the AIDS conspiracy theory; ideological from neutral conspiracy theories, depending on whether a strong ideology, like anti-Semitism, drives the theory; official from anti-institutional conspiracy theories, as exemplified by official versus unofficial conspiracy theories about 9/11; and alternative explanations from denials, that is, theories providing a different explanation for an event versus theories denying that the event took place.
A further way to distinguish conspiracy theories is by looking at what kind of theoretical object we are dealing with. In general, a conspiracy theory is an explanation of some event or phenomenon, but one can examine what kind of explanation it is. Some conspiracy theories may be full-blown theories, whereas others may not be theories in the scientific or philosophical sense. Clarke (2002 and 2007) thinks that some conspiracy theories are actually only proto-theories, not worked out sufficiently to count as theories, while others may be degenerating research programs in the sense defined by Imre Lakatos. There is more on the relationship between conspiracy theories and Lakatosian research programs in section 4, but here it is important to realize that while all conspiracy theories are explanations of some sort, certain conspiracy theories may be theories, others may be proto-theories or research programs.
4. Criteria for Believing in a Conspiracy Theory
A number of criteria have been offered, sometimes implicitly, in the philosophical literature to evaluate whether we should believe in a particular conspiracy theory, and these are surveyed below. Partly, such criteria will be familiar from scientific theory choice, but given that we are dealing with a specific type of theory, more can be said and more has been said. Due to the number of criteria, it is useful to group them into categories. There are different ways of grouping these criteria. The one adopted here tries to stay close to the labels and classifications common in the philosophy of science.
Although not explicitly stated, the dominant view in the philosophical literature from which the criteria below are taken is a realist view: Our (conspiracy) theories and beliefs should aim at the truth. Alternatively, one may propose an instrumentalist criterion which advocates a (conspiracy) theory or belief for its usefulness, for example in making predictions. Finally, while instrumentalism still has epistemic aims, we can also identify a more radical pragmatist view which focuses more generally on the consequences, for example political and social consequences, of holding a particular (conspiracy) theory or belief.
As mentioned, most of the criteria from the philosophical literature fit into the realist view. Within this view, we can distinguish three groups of criteria. First, we have criteria coming from the philosophy of science. These criteria have to do with the scientific methodology of theory choice, and here the question is how these play out when applied to conspiracy theories. Second, we have criteria dealing with motives. These can be the motives of the agents proposing a conspiracy theory, the motives of institutions relevant to the propagation of a conspiracy theory, or, finally, the motives of the agents the conspiracy theory is about. Third, there are a number of other criteria neither dealing with motives nor with scientific methodology. The picture arising from this way of organizing the criteria is presented in figure 1. The figure is not intended as a decision tree. Rather, it is more like an organized toolbox from which multiple tools may be chosen, depending, for example, on one’s philosophical commitments and one’s existing beliefs.
Figure 1
a. Criteria concerning Scientific Methodology
i. Internal Faults (C1)
Basham (2001, p.275) advocates skepticism of a conspiracy theory if it suffers from what he calls “internal faults,” among which he lists “problems with self-consistency, explanatory gaps, appeals to unlikely or obviously weak motives and other unrealistic psychological states, poor technological claims, and the theory’s own incongruencies with observed facts it grants (including failed predictions).” Räikkä (2009a, p.196f) also refers to a similar list of criteria. Basham thinks that this criterion, while seemingly straightforward, will already exclude many conspiracy theories. A historical example he mentions is the theory that identifies the antichrist of the biblical Book of Revelation as Adolf Hitler. According to Basham, the fact that Hitler is dead and the kingdom of God nowhere near shows that this theory has internal faults, presumably a big explanatory gap or a failed prediction.
Note that the list of things mentioned by Basham as internal faults is rather diverse, and one can debate whether all of these faults should really be considered internal to the theory. More narrowly, one could restrict internal faults to problems with self-consistency. Most of the other elements mentioned by Basham return below as separate criteria. For instance, an appeal to “unlikely or obviously weak motives” is discussed as C5.
ii. Progress: Is the Conspiracy Theory Part of a Progressive Research Program? (C2)
Clarke (2002; 2007) sees conspiracy theories as degenerating research programs in the sense developed by Lakatos (1970). In Clarke’s description of a degenerating research program, “successful novel predictions and retrodictions are not made. Instead, auxiliary hypotheses and initial conditions are successively modified in light of new evidence, to protect the original theory from apparent disconfirmation” (Clarke 2002, p.136). By contrast, a progressive research program would make successful novel predictions and retrodictions. Clarke cites the Watergate conspiracy theory as an example of a progressive research program: It led the journalists to make successful predictions and retrodictions about the behavior of those involved in the conspiracy. On the other hand, Clarke uses the conspiracy theory about Elvis Presley’s fake funeral as an example of a degenerating research program (p.136-7), since it did not come up with novel predictions that were confirmed, for example, concerning the unusual behavior of Elvis’s relatives. Going further, Clarke (2007) also views other conspiracy theories—the controlled demolition theory of 9/11, for instance—as only proto-theories, something that is not sufficiently worked out to count as a theoretical core of a degenerating or progressive research program. Proto-theories are similar to what Muirhead and Rosenblum (2019) call new conspiracism.
Pigden (2006, footnote 17 and p.29) criticizes Clarke for not providing any evidence that conspiracy theories are in fact degenerating research programs and points to the many conspiracy theories accepted by historians as counterevidence. In any case, we might consider evaluating a given conspiracy theory by trying to see to what extent it is, or is part of, a progressive or a degenerating research program. Furthermore, as Lakatos’s notion of a research program comes with a hard core—the central characteristic claims not up for modification—and a protective belt—auxiliary hypotheses which can be changed—applying this notion also gives us tools to analyze a conspiracy theory in more detail. Such an analysis might yield, for example, that the problematic aspects of a conspiracy theory all concern its protective belt rather than its hard core.
iii. Inference to the Best Explanation: Evidence, Prior, Relative and Posterior Probability (C3)
Dentith (2016) views conspiracy theories as inferences to the best explanation. To judge such inferences using a Bayesian framework, we need to look at the prior probability of the conspiracy theory, the prior probability of the evidence and its likelihood given the conspiracy theory, thereby allowing us to calculate the posterior probability of the conspiracy theory. Furthermore, we need to look at the relative probability of the conspiracy theory when comparing it to competing hypotheses explaining the same event. Crucial in this calculation is our estimation of the prior probability of the conspiracy theory, which Dentith thinks we usually set too low (p.584) because we tend to underestimate how often conspiracies occur in history.
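Dentith’s Bayesian framing can be made concrete with a small numerical sketch. All the numbers below are purely illustrative placeholders (they come from no source in the literature); the point is only to show how the posterior probability of a conspiracy theory is calculated and how raising its prior, as Dentith recommends, raises its posterior relative to a rival official account.

```python
# A minimal sketch of the Bayesian comparison Dentith describes.
# All numbers are illustrative assumptions, not values from the literature.

def posterior(prior_h: float, likelihood: float, prior_e: float) -> float:
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior_h / prior_e

# Rival hypotheses for the same event: a conspiracy theory (h1) and the
# official account (h2), with the likelihood of the evidence E under each.
p_h1, p_e_given_h1 = 0.05, 0.9   # conspiracy theory
p_h2, p_e_given_h2 = 0.95, 0.4   # official account

# P(E) by the law of total probability over the two rival hypotheses.
p_e = p_e_given_h1 * p_h1 + p_e_given_h2 * p_h2

print(posterior(p_h1, p_e_given_h1, p_e))  # ~0.11
print(posterior(p_h2, p_e_given_h2, p_e))  # ~0.89
# Doubling the prior p_h1 to 0.10 (and lowering p_h2 to 0.90) raises the
# conspiracy theory's posterior to ~0.20: this is Dentith's point about
# the consequences of setting the prior too low.
```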
There is some disagreement between authors about whether conspiracy theories may be selective in their choice of evidence. Hepfer (2015, p.78) warns against the selective acceptance of evidence, which he calls selective coherentism (p.92) and which, for Hepfer, explains, for example, the wealth of different conspiracy theories surrounding the assassination of John F. Kennedy. Dentith (2019, section 2), on the other hand, argues that scientific theories are also selective in their use of evidence, and that conspiracy theories are not different from other theories, such as scientific ones, in the way they use evidence. Dentith compares conspiracy theories about 9/11 to the work that historians usually do. In both cases, says Dentith, we see a selection of only part of the total evidence as salient.
Finally, Keeley (2003, p.106) considers whether lack of evidence for a conspiracy should count against a theory positing such a conspiracy. On the one hand, he points out that it is in general true that we should not confuse absence of evidence for a conspiracy with evidence of absence of a conspiracy. After all, since we are dealing with a conspiracy, we should expect that evidence will be hard to come by. This is also why falsifiability is in general not advocated as a criterion for evaluating conspiracy theories (see, e.g., Keeley 1999, p.121 and Basham 2003, p.93): In the case of conspiracy theories, something approaching unfalsifiability is a consequence of the theory. Nonetheless, Keeley (2003, p.106) thinks that if diligent efforts to find evidence for a conspiracy fail where similar efforts in other similar cases have succeeded, we are justified in lowering the credibility of the conspiracy theory.
iv. Errant Data (C4)
While the previous criterion already discussed how conspiracy theories relate to data, there is a particular kind of data that receives special attention both from conspiracy theorists and in the philosophical literature about conspiracy theories. Many conspiracy theories claim that they can explain “errant data” (Keeley, 1999, p.117), data which either contradicts the official theory or which the official theory leaves unexplained. According to Keeley (1999), conspiracy theories place great emphasis on errant data, an emphasis that also exists in periods of scientific innovation. However, Keeley argues that conspiracy theorists are wrong to treat errant data as a problem for a theory by itself, since not all the available data will in fact be true. Clarke (2002, p.139f) and Dentith (2019, section 3) are skeptical of Keeley’s argument: Clarke points out that the data labelled as “errant” will depend on the theory one adheres to, and Dentith thinks that conspiracy theories are no different from other theories in relation to such data.
Dentith (2014, 129ff), following Coady (2006c), points out that any theory, official or unofficial, will have errant data. While advocates of a conspiracy theory will point to data problematic for the official theory which the conspiracy theory can explain, there will usually also be data problematic to the conspiracy theory which the official theory can explain. As an example of data errant with regard to the official theory, Dentith mentions that the official theory about the assassination of John F. Kennedy does not explain why some witnesses heard more gunshots than the three gunshots Oswald is supposed to have fired. As an example of data errant with regard to a conspiracy theory, Dentith points out that some of the conspiracy theories about 9/11 cannot explain why there is a video of Osama Bin Laden claiming responsibility for the attacks. When it comes to evaluating a specific conspiracy theory, the conclusion is that we should be looking at the errant data of both the conspiracy theory and alternative theories.
b. Criteria Concerning Motives
i. Cui Bono: Who Benefits from the Conspiracy? (C5)
Hepfer (2015, p.98ff) uses the assassination of John F. Kennedy in 1963 to illustrate how motives enter into our evaluation of conspiracy theories. While there seems to be widespread agreement that the assassin was in fact Lee Harvey Oswald, conspiracy theories doubt the official theory that he was acting on his own. There are a number of possible conspirators with plausible motives that may have been behind Oswald: The military-industrial complex, the American mafia, the Russian secret service, the United States Secret Service and Fidel Castro. Which of these conspiracy theories we should accept also depends on how plausible we find the ascribed motives given our other beliefs about the world.
According to Hepfer (2015, p.98 and section 2.3), a conspiracy theory should be (a) clear about the motives or goals of the conspirators and (b) rational in the means-ends sense of rationality; that is, if successful, the conspiracy should further the goals the conspirators are claimed to have. If the goals of the conspirators are not explicitly part of the theory, we should be able to infer these goals, and they should be reasonable. Problematic conspiracy theories are those where the motives or goals of the conspirators are unclear, the goals ascribed to the conspirators conflict with our other knowledge about the goals of these agents, or a successful conspiracy would not further the goals the theory itself ascribes to the conspirators.
ii. Individual Trust (C6)
Trust plays a role in two different ways when it comes to conspiracy theories. First, Räikkä (2009b, section 4) raises the question of whether we can trust the motives of the author(s) or proponents of a conspiracy theory. Some conspiracy theorists may not themselves believe the theory they propose, and instead may have other motives for proposing the theory; for example, to manipulate the political debate or make money. Other conspiracy theorists may genuinely believe the conspiracy theory they propose, but the fact that the alleged conspirators are the political enemy of the theory’s proponent may cast doubt on the likelihood of the theory. The general question here is whether the author or proponent of a conspiracy theory has a motive to lie or mislead. Here, Räikkä uses as an example the conspiracy theory about global warming (p.462). If a person working for the fossil-fuel industry claims that there is a global conspiracy propagating the idea of global warming, the financial motive is clear. Conversely, people who reject a particular theory as “just” a conspiracy theory may also have a motive to mislead. As an example, Pigden discusses the case of Tony Blair, who labeled the idea that the Iraq war was fought for oil a mere conspiracy theory.
A second way in which trust enters into the analysis of conspiracy theories is in terms of epistemic authority. Many conspiracy theories refer to various authorities for the justification of certain claims. For instance, a 9/11 conspiracy theory may refer to a structural engineer who made a certain claim regarding the collapse of the World Trade Center. The question arises as to what extent we should trust claims of alleged epistemic authorities, that is, people who have relevant expertise in a particular domain. Levy (2007) takes a radically socialized view of knowledge: Since knowledge can only be produced by a complex network of inquiry in which the relevant epistemic authorities are embedded, a conspiracy theory conflicting with the official story coming out of this network is “prima facie unwarranted” (p.182, italics in the original). According to Levy, the best epistemic strategy is simply to “adjust one’s degree of belief in an explanation of an event or process to the degree to which the epistemic authorities accept that explanation” (p.190). Dentith (2018) criticizes Levy’s trust in epistemic authority. First, Dentith argues that because a conspiracy theory will usually involve claims connecting various disciplines, there is no obvious group of experts when it comes to evaluating it. Furthermore, Dentith points out that the fact that a theory has authority in the sense of being official does not necessarily mean that it has epistemic authority, a point Levy also makes. Related to our first point about trust, Dentith also points out that epistemic authorities might have a motive to mislead, for example, when funding sources have influenced research. Finally, our trust in epistemic authority will also depend on the trust we place in the institutions accrediting expertise, and hence questions of individual trustworthiness relate to questions of institutional trustworthiness.
iii. Institutional Trust (C7)
As mentioned when discussing individual trust, when we want to assess the credibility of experts, part of that credibility judgment will depend on the extent to which we trust the institution accrediting the expertise, assuming there is such an institution to which the expert is linked. The question of institutional trust is relevant more generally when it comes to conspiracy theories, and this issue has been discussed at length in the philosophical literature on conspiracy theories.
The starting point of the discussion of institutional trust is Keeley (1999, p.121ff), who argues that the problem with conspiracy theories is that these theories cast doubt on precisely those institutions which are the guarantors of reliable data. If a conspiracy theory contradicts an official theory based on scientific expertise, this produces skepticism not only with regard to the institution of science, but may also produce skepticism with regard to other public institutions: for example the press, which accepts the official story instead of uncovering the conspiracy, or the parliament and the government, which produce or propagate the official story in the first place. Thus, the claim is that believing in a conspiracy theory implies a quite widespread distrust of our public institutions. If this implication is true, it can be used in two ways: Either to discredit the conspiracy theory, which is the route Keeley advocates, or to discredit our public institutions. In any case, our trust in our public institutions will influence the extent to which we hold a particular conspiracy theory to be likely. For this reason, both Keeley (1999, p.121ff) and Coady (2006a, p.10) think that conspiracy theories are more trustworthy in non-democratic societies.
Basham (2001, p.270ff) argues that it would be a mistake to simply assume our public institutions to be trustworthy and dismiss conspiracy theories. His position is one he calls “studied agnosticism” (p.275): In general, we are not in a position to decide for or against a conspiracy theory, except—and this is where the “studied” comes in—where a conspiracy theory can be dismissed due to internal faults (see C1). In fact, we are caught in a vicious circle: “We cannot help but assume an answer to the essential issue of how conspirational our society is in order to derive a well justified position on it” (p.274). Put differently, while an open society provides fewer grounds for believing in conspiracy theories, we cannot really know how open our society actually is (Basham 2003, p.99). In any case, an individual who tries to assess a particular conspiracy theory should thus also consider to what extent they trust or distrust our public institutions.
Clarke (2002, p.139ff) questions Keeley’s link between belief in conspiracy theories and general distrust in our public institutions. He claims that conspiracy theories actually do not require general institutional skepticism. Instead, in order to believe in a conspiracy theory, it will usually suffice to confine one’s skepticism to particular people and issues. Räikkä (2009a) also criticizes Keeley’s supposed link between conspiracy theories and institutional distrust, claiming that most conspiracy theories do not entail such pervasive institutional distrust, but that if such pervasive distrust were entailed by a conspiracy theory, it would lower the conspiracy theory’s credibility. A global conspiracy theory like the Flat Earth theory, since it involves multiple institutions from various societal domains, tends to involve more pervasive institutional distrust than a local conspiracy theory like the Watergate conspiracy. According to Clarke, even the latter does not have to engender institutional distrust with regard to the United States government as an institution, since distrust could remain limited to specific agents within the government.
c. Other Realist Criteria
i. Fundamental Attribution Error (C8)
Starting with Clarke (2002; see also his response to criticism in 2006), philosophers have discussed whether conspiracy theories commit the fundamental attribution error (FAE). In psychology, the fundamental attribution error refers to the human tendency to overestimate dispositional factors and underestimate situational factors in explaining the behavior of others. Clarke (p.143ff) claims that conspiracy theories commit this error: They tend to be dispositional explanations whereas official theories often are more situational explanations. As an example, Clarke considers the funeral of Elvis Presley. The official account is situational since it explains the funeral in terms of his death due to heart problems. On the other hand, the conspiracy theory which claims Elvis is still alive and staged his funeral is dispositional since it sees Elvis and his possible co-conspirators as having the intention to deceive the public.
Dentith (2016, p.580) questions whether conspiracy theories are generally more dispositional than other theories. Moreover, as in the case of 9/11, the official theory may itself be dispositional. Pigden (2006, footnotes 27 and 30, and p.29) is critical of the psychological literature about the FAE, claiming that “if we often act differently because of different dispositions, then the fundamental attribution error is not an error” (footnote 30). Pigden is also critical of Clarke’s application of the FAE to conspiracy theories: Given that conspiracies are common, what Pigden calls “situationism” is either false or it does not imply that conspiracies are unlikely. Hence, Pigden concludes, the FAE has no relevant implications for our thinking about conspiracy theories. Coady (2003) is also critical of the existence of the FAE. Furthermore, he claims that belief in the FAE is paradoxical in that it commits the FAE: Believing that people think dispositionally rather than situationally is itself dispositional thinking.
ii. Ontology: Existence Claims the Conspiracy Theory Makes (C9)
Some conspiracy theories claim the existence or non-existence of certain entities. Among the examples Hepfer (2015, p.45) cites is a theory by Heribert Illig that claims that the years between 614 and 911 never actually happened. Another example would be a theory claiming the existence of a perpetual motion machine that is kept secret. Both existence claims go against the scientific consensus of what exists and what does not. Hepfer (2015, p.42) claims that the more unusual a conspiracy theory’s existence claims are, the more we should doubt its truth. This is because of the ontological baggage (p.49) that comes with such existence claims: Accepting these claims will force us to revise a major part of our hitherto accepted knowledge, and the more substantial the revision needed, the more we should be suspicious of such a theory.
iii. Übermensch: Does the Conspiracy Theory Ascribe Superhuman Qualities to Conspirators? (C10)
Hepfer (2015, p.104) and Räikkä (2009a, p.197) note that some conspiracy theories ascribe superhuman qualities to the conspirators that border on divine attributes like omnipotence and omniscience. Examples here might be the idea that Freemasons, Jews or George Soros control the world economy or the world’s governments. Sometimes the superhuman qualities ascribed to conspirators are moral and negative, that is, conspirators are demonized (Hepfer, 2015, p.131f). The antichrist has not only been seen in Adolf Hitler but also in the pope. In general, the more extraordinary the qualities ascribed to the conspirators, the more they should lower the credibility of the conspiracy theory.
iv. Scale: The Size and Duration of the Conspiracy (C11)
The general claim here is that the more agents that are supposed to be involved in a conspiracy—its size—and the longer the conspiracy is supposed to be in existence—its duration—the less likely the conspiracy theory. Hepfer (2015, p.97) makes this point, and similarly Keeley (1999, p.122) says that the more institutions are supposed to be involved in a conspiracy, the less believable the theory should become. To some extent, this point is simply a matter of logic: The claim that A and B are involved in a conspiracy cannot be more likely than the claim that A is involved in a conspiracy. Similarly, the claim that a conspiracy has been going on for at least 20 years cannot be more likely than the claim that it has been going on for at least 10 years. In this sense, conspiracy theories involving many agents over a long period of time will tend to be less likely than conspiracy theories involving fewer agents over a shorter period of time. Furthermore, Grimes (2016) has conducted simulations showing that large conspiracies with 1000 agents or more are unlikely to succeed due to problems with maintaining secrecy.
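The logic of the scale criterion can be illustrated with a toy calculation. The sketch below is not Grimes’ actual model; it simply assumes, for illustration, that each conspirator independently risks exposing the secret with a small fixed probability per year (the value 0.001 is an arbitrary assumption), so that the chance of continued secrecy shrinks with both the number of conspirators and the duration of the conspiracy.

```python
# A toy illustration of the scale argument (not Grimes' model).
# Assumption: each conspirator independently exposes the secret
# with probability p in any given year.

def secrecy_probability(n: int, years: int, p: float = 0.001) -> float:
    """Probability that a conspiracy of n agents stays secret for the given years."""
    return (1 - p) ** (n * years)

for n, years in [(10, 10), (100, 10), (1000, 10), (1000, 40)]:
    print(f"{n:>4} agents, {years:>2} years: {secrecy_probability(n, years):.2e}")
# With p = 0.001, secrecy is near-certain for 10 agents over 10 years
# (~0.90) but astronomically unlikely for 1000 agents over 40 years.
```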
Basham (2001, p.272; 2003, p.93) takes an opposing view by referring to social hierarchies and mechanisms of control, saying that “the more fully developed and high placed a conspiracy is, the more experienced and able are its practitioners at controlling information and either co-opting, discrediting, or eliminating those who go astray or otherwise encounter the truth” (Basham 2001, p.272). Dentith (2019, section 7) also counters the scale argument by pointing out that any time an institution is involved in a conspiracy, only very few people within that institution are actually involved in the conspiracy. This reduces the total number of conspirators and calls into question the relevance of the results by Grimes, of which Dentith is very critical.
d. Non-Realist Criteria
i. Instrumentalism: Conspiracy Theories as “as if” Theories (C12)
Grewal (2016) has shown how the philosophical opposition between scientific realism and various kinds of anti-realism also shows up in how we evaluate conspiracy theories. While most authors implicitly seem to interpret the claims of conspiracy theories along the lines of realism, Grewal has suggested that adherents of conspiracy theories may interpret or at least use these theories instrumentally. Viewed this way, conspiracy theories are “as if” theories which allow their adherents to make sense of a world that is causally opaque in a way that may often yield quite adequate predictions. “An assumption that the government operated as if it were controlled by a parallel and secret government may fit the historical data…while also providing better predictions than would, say, an exercise motivated by an analysis of constitutional authority or the statutory limitations to executive power” (p.36). As a more concrete example, Grewal mentions that “the most parsimonious way to understand financial decision making in the Eurozone might be to treat it as if it were run by and for the benefit of the Continent’s richest private banks” (p.37). Hence, our evaluation of a given conspiracy theory will also depend on basic philosophical commitments like what we expect our theories to do for us.
ii. Pragmatism (C13)
The previous arguments have mostly been epistemic or epistemological arguments, arguments that bear on whether a conspiracy theory is likely to be true or at least epistemically useful. However, similar to Blaise Pascal’s pragmatic argument for belief in God (Pascal, 1995), some arguments concerning conspiracy theories that have nothing to do with their epistemic value can be reinterpreted pragmatically as arguments about belief: Pragmatically, our belief or disbelief should depend on the consequences the (dis)belief has for us personally or for society more generally.
Basham (2001) claims that epistemic rejection of conspiracy theories will often not work, and we have to be agnostic about their truth. Still, we should reject them for pragmatic reasons because “[t]here is nothing you can do,” given the impossibility of finding out the truth, and “[t]he futile pursuit of malevolent conspiracy theory sours and distracts us from what is good and valuable in life” (p.277). Similarly, Räikkä (2009a) says that “a person who strives for happiness in her personal life should not ponder on vicious conspiracies too much” (p.199). Then again, contrary to Basham’s claim, what you can do with regard to conspiracy theories will depend on your role. As a journalist, you may decide to investigate certain claims, and Räikkä (2009a, p.199f) thinks that “it is important that in every country there are some people who are interested in investigative journalism and political conspiracy theorizing.”
Like journalists, politicians play a special role when it comes to conspiracy theories. Muirhead and Rosenblum (2016) argue that politicians should oppose conspiracy theories when they (1) are fueled by hatred, (2) present political opposition as treason and illegitimate, or (3) undermine epistemic or expert authority generally. Similarly, Räikkä (2018, p.213) argues that we must interfere with conspiracy theories when they include libels or hate speech. The presumed negative consequences of such conspiracy theories would be pragmatic reasons for disbelief.
Räikkä (2009b) lists both positive and negative effects of conspiracy theorizing, and we may apply these to concrete conspiracy theories to see which ones to believe in. The two positive effects he mentions are (a) that “the information gathering activities of conspiracy theorists and investigative journalists force governments and government agencies to watch out for their decisions and practices” (p.460) and (b) that conspiracy theories help to maintain openness in society. As negative effects, he mentions that a conspiracy theory “tends to undermine trust in democratic political institutions and its implications may be morally questionable, as it has close connections to populist discourse, as well as anti-Semitism and racism” (p.461). When a conspiracy theory blames certain people, Räikkä points out that there are moral costs for the people blamed. Furthermore, he thinks that the moral costs will depend on whether the people blamed are private individuals or public figures (p.463f).
5. Social and Political Effects of Conspiracy Theories
Räikkä (2009b, section 3) and Moore (2016, p.5) survey some of the social and political effects of conspiracy theories and conspiracy theorizing. One may look at the positive and negative effects of conspiracy theorizing in general, but it is also useful to consider the effects of a specific conspiracy theory, by looking at which effects mentioned below are likely to obtain for the conspiracy theory in question. Such an evaluation is related to the pragmatist evaluation criterion C13 just discussed, so some of the points mentioned there are revisited in what follows. Also, the effects of a conspiracy theory may be related to the type of conspiracy theory we are dealing with; see section 3 of this article.
On the positive side, conspiracy theories may be tools to uncover actual conspiracies, with the Watergate scandal as the standard example. When these conspiracies take place in our public institutions, conspiracy theories can thereby also help us to keep these institutions in check and to uncover institutional problems. Conspiracy theories can help us to remain critical of those holding power in politics, science and the media. One of the ways they can achieve this is by forcing these institutions to be more transparent. Since conspiracy theories claim the secret activity of certain agents, transparent decision making, open lines of communication and the public availability of documents are possible responses to conspiracy theories which can improve a democratic society, independent of whether they suffice to convince those believing conspiracy theories. We may call this the paradoxical effect of conspiracy theories: Conspiracy theories can help create or maintain the open society whose existence they deny.
Turning from positive to possible negative effects of conspiracy theories, a central point that already came up when discussing criterion C7 is institutional trust. Conspiracy theories can contribute to eroding trust in the institutions of politics, science and the media. The anti-vaccination conspiracy theory, which claims that politicians and the pharmaceutical industry are hiding the ineffectiveness or even harmfulness of vaccines, is an example of a conspiracy theory which can undermine public trust in science. Huneman and Vorms (2018) discuss how at times it can be difficult to draw the line between rational criticism of science and unwarranted skepticism. One fear is that eroding trust in institutions leads us via unwarranted skepticism to an all-out relativism or nihilism, a post-truth world where it suffices that a claim is repeated by a lot of people to make it acceptable (Muirhead and Rosenblum, 2019). Conspiracy theories have also been linked to increasing polarization, populism and racism (see Moore, 2016). Finally, as alluded to in section 1, Popper also disliked conspiracy theories because they create mistaken ideas about the root causes of social events. By seeing social events as being caused by powerful people acting in secret, rather than as effects of structural social conditions, conspiracy theories arguably undermine effective political action and social change.
Bjerg and Presskorn-Thygesen (2017) have claimed that conspiracy theories cause a state of exception in the sense introduced by Giorgio Agamben. Just as terrorism undermines democracy in such a way that it licenses a state of political exception justifying undemocratic measures, a conspiracy theory undermines rational discourse in such a way that it licenses a state of epistemic exception justifying irrational measures. Those measures consist in placing conspiracy theories outside of official public discourse, labeling them as irrational, as “just” conspiracy theories, and as not worthy of serious critical consideration and scrutiny. Seen in this way, conspiracy theories appear as a form of epistemic terrorism, through their erosion of trust in our knowledge-producing institutions.
6. What to Do about Conspiracy Theories?
Besides deciding to believe or not to believe in a conspiracy theory (section 4), there are other actions one may consider with regard to conspiracy theories. Philosophical discussion has mainly focused on what actions governments and politicians can or should take.
The seminal article concerning the question of government action is by Sunstein and Vermeule (2009). Besides describing different psychological and social mechanisms underlying belief in conspiracy theories, they consider a number of policy and legal responses a government might take when it comes to false and harmful conspiracy theories: banning conspiracy theories, taxing the dissemination of conspiracy theories, counterspeech, and cognitive infiltration of groups producing conspiracy theories. While dismissing the first two options, Sunstein and Vermeule consider counterspeech and cognitive infiltration in more detail. First, the government may itself speak out against a conspiracy theory by providing its own account. However, Sunstein and Vermeule think that such official counterspeech will have only limited success, in particular when it comes to conspiracy theories involving the government. Alternatively, the government may try to enlist private parties to infiltrate online fora and discussion groups associated with conspiracy theories in order to introduce cognitive diversity, breaking up one-sided discussion and introducing non-conspiratorial views.
The proposals by Sunstein and Vermeule have met with strong opposition, most explicitly by Coady (2018). He points out that Sunstein and Vermeule too easily assume good intentions on the part of the government. Furthermore, these policy proposals, coming from academics who have also been involved in governmental policy making, will only confirm the fears of conspiracy theorists that the government is involved in conspiratorial activities. If the cognitive infiltration proposed by Sunstein and Vermeule were discovered, conspiracy theorists would be led to believe in conspiracy theories even more. Put differently, we run the risk of a pragmatic inconsistency: The government would try to deceive, via covert cognitive infiltration, a certain part of the population to make it believe that it does not deceive, that it is not involved in conspiracies.
As mentioned when discussing evaluation criterion C13 in section 4, Muirhead and Rosenblum (2016) consider three kinds of conspiracy theories that should give politicians cause for official opposition. These are conspiracy theories that fuel hatred, equate political opposition with treason, or express a general distrust of expertise. In these cases, politicians are called to speak truth to conspiracy, even though this might create a divide between them and their electorate. Muirhead and Rosenblum (2019) also consider what to do against new conspiracism (see the end of section 2). They note that such conspiracism is rampant in our society despite ever more transparency. As a countermeasure, they not only advocate speaking truth to conspiracy, but also what they call “democratic enactment,” by which they mean “a strenuous adherence to the regular processes and forms of public decision-making” (p.175).
Both Sunstein and Vermeule, as well as Muirhead and Rosenblum, agree that what we should do about conspiracy theories will depend on the theory we are dealing with. They do not advocate action against all theories about groups acting in secret to achieve some aim. However, when a theory is of a particularly problematic kind—false and harmful, fueling hatred, and so forth—political action may be needed.
7. Related Disciplines
Philosophy is not the only discipline dealing with conspiracy theories, and in particular when it comes to discussing what to do about conspiracy theories, research from other fields is important. We have already seen some ways in which philosophical thinking about conspiracy theories touches on other disciplines, in particular in the previous section’s discussion of political science and law. As for other related fields, psychologists have done extensive research on conspiratorial thinking and the psychological characteristics of people who believe in conspiracy theories. Historians have presented histories of conspiracy theories in the United States, the Arab world and elsewhere. Sociologists have studied how conspiracy theories can target racial minorities, as well as the structure and group dynamics of specific conspiratorial milieus. Uscinski (2018) covers many of the relevant disciplines which this article does not cover and also includes an interdisciplinary history of conspiracy theory research.
8. References and Further Reading
To get an overview of the philosophical thinking about conspiracy theories, the best works to start with are Dentith (2014), Coady (2006a) and Uscinski (2018).
Basham, L. (2001). “Living with the Conspiracy”, The Philosophical Forum, vol. 32, no. 3, p.265-280.
Basham, L. (2003). “Malevolent Global Conspiracy”, Journal of Social Philosophy, vol. 34, no. 1, p.91-103.
Bjerg, O. and T. Presskorn-Thygesen (2017). “Conspiracy Theory: Truth Claim or Language Game?”, Theory, Culture and Society, vol. 34, no. 1, p.137-159.
Buenting, J. and J. Taylor (2010). “Conspiracy Theories and Fortuitous Data”, Philosophy of the Social Sciences, vol. 40, no. 4, p.567-578.
Clarke, St. (2002). “Conspiracy Theories and Conspiracy Theorizing”, Philosophy of the Social Sciences, vol. 32, no. 2, p.131-150.
Clarke, St. (2006). “Appealing to the Fundamental Attribution Error: Was it All a Big Mistake?”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.129-132.
Clarke, St. (2007). “Conspiracy Theories and the Internet: Controlled Demolition and Arrested Development”, Episteme, vol. 4, no. 2, p.167-180.
Coady, D. (2003). “Conspiracy Theories and Official Stories”, International Journal of Applied Philosophy, vol. 17, no. 2, p.197-209.
Coady, D., ed. (2006a). Conspiracy Theories: The Philosophical Debate. Ashgate.
Coady, D. (2006b). “An Introduction to the Philosophical Debate about Conspiracy Theories”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.1-11.
Coady, D. (2006c). “Conspiracy Theories and Official Stories”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.115-128.
Coady, D. (2018). “Cass Sunstein and Adrian Vermeule on Conspiracy Theories”, Argumenta, vol. 3, no.2, p.291-302.
Dentith, M. (2014). The Philosophy of Conspiracy Theories. Palgrave Macmillan.
Dentith, M. (2016). “When Inferring to a Conspiracy might be the Best Explanation”, Social Epistemology, vol. 30, nos. 5-6, p.572-591.
Dentith, M. (2018). “Expertise and Conspiracy Theories”, Social Epistemology, vol. 32, no. 3, p.196-208.
Dentith, M. (2019). “Conspiracy theories on the basis of the evidence”, Synthese, vol. 196, no. 6, p.2243-2261.
Grewal, D. (2016). “Conspiracy Theories in a Networked World”, Critical Review, vol. 28, no. 1, p.24-43.
Grimes, D. (2016). “On the Viability of Conspiratorial Beliefs”, PLoS ONE, vol. 11, no. 1.
Hepfer, K. (2015). Verschwörungstheorien: Eine philosophische Kritik der Unvernunft. Transcript Verlag.
Huneman, Ph. and M. Vorms (2018). “Is a Unified Account of Conspiracy Theories Possible?”, Argumenta, vol. 3, no. 2, p.247-270.
Keeley, B. (1999). “Of Conspiracy Theories”, The Journal of Philosophy, vol. 96, no. 3, p.109-126.
Keeley, B. (2003). “Nobody Expects the Spanish Inquisition! More Thoughts on Conspiracy Theory”, Journal of Social Philosophy, vol. 34, no. 1, p.104-110.
Lakatos, I. (1970). “Falsification and the Methodology of Scientific Research Programmes”, in I. Lakatos and A. Musgrave, editors, Criticism and the Growth of Knowledge. Cambridge University Press, p.91-196.
Levy, N. (2007). “Radically Socialized Knowledge and Conspiracy Theories”, Episteme, vol. 4 no. 2, p.181-192.
Moore, A. (2016). “Conspiracy and Conspiracy Theories in Democratic Politics”, Critical Review, vol. 28, no. 1, p.1-23.
Muirhead, R. and N. Rosenblum (2016). “Speaking Truth to Conspiracy: Partisanship and Trust”, Critical Review, vol. 28, no. 1, p.63-88.
Muirhead, R. and N. Rosenblum (2019). A Lot of People are Saying: The New Conspiracism and the Assault on Democracy. Princeton University Press.
Pascal, B. (1995). Pensées and Other Writings, H. Levi (trans.). Oxford University Press.
Pigden, Ch. (1995). “Popper Revisited, or What Is Wrong With Conspiracy Theories?” Philosophy of the Social Sciences, vol. 25, no. 1, p.3-34.
Pigden, Ch. (2006). “Complots of Mischief”, in David Coady (ed.), Conspiracy Theories: The Philosophical Debate. Ashgate, p.139-166.
Pipes, D. (1997). Conspiracy: How the Paranoid Style Flourishes and Where It Comes From. Free Press.
Popper, K.R. (1966). The Open Society and Its Enemies, vol. 2: The High Tide of Prophecy, 5th edition, Routledge and Kegan Paul.
Popper, K.R. (1972). Conjectures and Refutations. 4th edition, Routledge and Kegan Paul.
Räikkä, J. (2009a). “On Political Conspiracy Theories”, Journal of Political Philosophy, vol. 17, no. 2, p.185-201.
Räikkä, J. (2009b). “The Ethics of Conspiracy Theorizing”, Journal of Value Inquiry, vol. 43, p.457-468.
Räikkä, J. (2018). “Conspiracies and Conspiracy Theories: An Introduction”, Argumenta, vol. 3, no. 2, p.205-216.
Sunstein, C. and A. Vermeule (2009). “Conspiracy Theories: Causes and Cures”, Journal of Political Philosophy, vol. 17, no. 2, p.202-227.
Uscinski, J.E., editor (2018). Conspiracy Theories and the People Who Believe Them. Oxford University Press.
Author Information
Marc Pauly
Email: m.pauly@rug.nl
University of Groningen
The Netherlands
René Descartes: Ethics
This article describes the main topics of Descartes’ ethics through discussion of key primary texts and corresponding interpretations in the secondary literature. Although Descartes never wrote a treatise dedicated solely to ethics, commentators have uncovered an array of texts that demonstrate a rich analysis of virtue, the good, happiness, moral judgment, the passions, and the systematic relationship between ethics and the rest of philosophy. The following ethical claims are often attributed to Descartes: the supreme good consists in virtue, which is a firm and constant resolution to use the will well; virtue presupposes knowledge of metaphysics and natural philosophy; happiness is the supreme contentment of mind which results from exercising virtue; the virtue of generosity is the key to all the virtues and a general remedy for regulating the passions; and virtue can be secured even though our first-order moral judgments never amount to knowledge.
Descartes’ ethics was a neglected aspect of his philosophical system until the late 20th century. Since then, standard interpretations of Descartes’ ethics have emerged, debates have ensued, and commentators have carved out key interpretive questions that anyone must answer in trying to understand Descartes’ ethics. For example: what kind of normative ethics does Descartes espouse? Are the passions representational or merely motivational states? At what point in the progress of knowledge can the moral agent acquire and exercise virtue? Is Descartes’ ethics as systematic as he sometimes seems to envision?
When one considers the heyday of early modern ethics, the following philosophers come to mind: Hobbes, Hutcheson, Hume, Butler, and, of course, Kant. Descartes certainly does not. Indeed, many philosophers and students of philosophy are unaware that Descartes wrote about ethics. Standard interpretations of Descartes’ philosophy place weight on the Discourse on the Method, Rules for the Direction of the Mind, Meditations on First Philosophy (with the corresponding Objections and Replies), and the Principles of Philosophy. Consequently, Descartes’ philosophical contributions to the early modern period are typically understood as falling under metaphysics, epistemology, philosophy of mind, and natural philosophy. When commentators do consider Descartes’ ethical writings, these writings are often regarded as an afterthought to his mature philosophical system. Indeed, Descartes’ contemporaries often did not think much of Descartes’ ethics. For example, Leibniz writes: “Descartes has not much advanced the practice of morality” (Letter to Molanus, AG: 241).
This view is understandable. Descartes certainly does not have a treatise devoted solely to ethics. This lack, in and of itself, creates an interpretive challenge for the commentator. Where does one even find Descartes’ ethics? On close inspection of Descartes’ corpus, however, one finds him tackling a variety of ethical themes—such as virtue, happiness, moral judgment, the regulation of the passions, and the good—throughout his treatises and correspondence. The following texts are of central importance in unpacking Descartes’ ethics: the Discourse on Method, the French Preface to the Principles, the Dedicatory Letter to Princess Elizabeth for the Principles, the Passions of the Soul, and perhaps most importantly, the correspondence with Princess Elizabeth of Bohemia, Queen Christina of Sweden, and the envoy Pierre Chanut (for more details on these important interlocutors—Princess Elizabeth in particular—and how they all interacted with each other in bringing about these letters see Shapiro [2007: 1–21]).
These ethical writings can be divided into an early period and a later—and possibly mature—period: the early period of the Discourse (1637) and the later period spanning (roughly) from the French Preface to the Principles through the Passions of the Soul (1644–1649).
b. The Tree of Philosophy and Systematicity
Why should we take seriously Descartes’ interspersed writings on ethics, especially since he did not take the time to write a systematic treatment of the topic? Indeed, one might think that we should not give much weight to Descartes’ ethical musings, given his expressed aversion to writing about ethics. In a letter to Chanut, Descartes writes:
It is true that normally I refuse to write down my thoughts concerning morality. I have two reasons for this. One is that there is no other subject in which malicious people can so readily find pretexts for vilifying me; and the other is that I believe only sovereigns, or those authorized by them, have the right to concern themselves with regulating the morals of other people. (Letter to Chanut 20 November 1647, AT V: 86–7/CSMK: 326)
However, one should take this text with a grain of salt. For in other texts, Descartes clearly does express a deep interest in ethics. Consider the famous tree of philosophy passage:
The whole of philosophy is like a tree. The roots are metaphysics, the trunk is physics, and the branches emerging from the trunk are all the other sciences, which may be reduced to three principal ones, namely, medicine, mechanics, and morals. By ‘morals’ I understand the highest and most perfect moral system, which presupposes a complete knowledge of the other sciences and is the ultimate level of wisdom.
Now just as it is not the roots or the trunk of a tree from which one gathers the fruit, but only the ends of the branches, so the principal benefit of philosophy depends on those parts of it which can only be learnt last of all. (French Preface to the Principles, AT IXB: 14/CSM I: 186)
This passage is surprising, to say the least. Descartes seems to claim that the proper end of his philosophical program is to establish a perfect moral system, as opposed to (say) overcoming skepticism, proving the existence of God, and establishing a mechanistic science. Moreover, Descartes seems to claim that ethics is systematically grounded in metaphysics, physics, medicine, and mechanics. Ethics is not supposed to float free from the metaphysical and scientific foundations of the system.
The tree of philosophy passage is a guiding text for many commentators in interpreting Descartes’ ethics, primarily because of its vision of philosophical systematicity (Marshall 1998, Morgan 1994, Rodis-Lewis 1987, Rutherford 2004, Shapiro 2008a). Indeed, the nature of the systematicity of Descartes’ ethics has been one of the main interpretive questions for commentators. Two distinct questions of systematicity are of importance here, which the reader should keep in mind as we engage Descartes’ ethical writings.
The first question of systematicity is internal to Descartes’ ethics itself. The early period of Descartes’ ethics, that is, the Discourse, is characterized by Descartes’ provisional morality. Broadly construed, the provisional morality seems to be a temporary moral guide—a stopgap, as it were—so that one can still live in the world of bodies and people while simultaneously engaging in hyperbolic doubt for the sake of attaining true and certain knowledge (scientia). As such, one might expect Descartes to revise the four maxims of the provisional morality once foundational scientia is achieved. Presumably, the perfect moral system that Descartes envisions in the tree of philosophy is not supposed to be a provisional morality. However, some commentators have claimed that the provisional morality is actually Descartes’ final moral view (Cimakasky & Polansky 2012). Others take a developmental view, arguing that Descartes’ later period, although related to the provisional morality, makes novel and distinct advancements (Marshall 1998, Shapiro 2008a).
The second question of systematicity concerns how Descartes’ ethics relates to the rest of his philosophy. To fully understand this question, we must distinguish two senses of ethics (la morale) in the tree of philosophy (Parvizian 2016). First, there is ethics qua theoretical enterprise: a theory of virtue, happiness, the passions, and other areas. Second, there is ethics qua practical enterprise: the exercise of virtue, the attainment of happiness, and the regulation of the passions. Thus, one may distinguish, for example, the question of whether a theory of virtue depends on metaphysics, physics, and the like, from the question of whether exercising virtue depends on knowledge of metaphysics, physics, and the like. Commentators tend to agree that theoretical ethics presupposes the other parts of the tree, although how this works out with respect to each field has not been fully fleshed out. For example: what is the relationship between mechanics and ethics? There is, however, substantive disagreement about whether exercising virtue presupposes knowledge of metaphysics or contributes to knowledge of metaphysics.
c. The Issue of Novelty
Another broad interpretive question concerns how Descartes’ ethics relates to past ethical theories, and whether Descartes’ ethics is truly novel (as he sometimes claims). It is undeniable that Descartes’ ethics is, in certain respects, underdeveloped. Given that Descartes is well versed in the ethical theories of his predecessors, one might be tempted to fill in the details Descartes does not spell out by drawing on other sources (for example, the Stoics).
This is a complicated matter. As we will see in section 3, Descartes claims that he is advancing beyond ancient ethics, particularly with his theory of virtue. This is in line with Descartes’ more general tendency to claim that his philosophical system breaks from the ancient and scholastic philosophical tradition (Discourse I, AT VI: 4–10/CSM I: 112–115). However, in some texts Descartes suggests that he is building upon past ethical theories. For example, Descartes tells Princess Elizabeth:
To entertain you, therefore, I shall simply write about the means which philosophy provides for acquiring that supreme felicity which common souls vainly expect from fortune, but which can be acquired only from ourselves.
One of the most useful of these means, I think, is to examine what the ancients have written on this question, and try to advance beyond them by adding something to their precepts. For in this way we can make the precepts perfectly our own and become disposed to put them into practice. (Letter to Princess Elizabeth 21 July 1645, AT IV: 252/CSMK: 256; emphasis added)
Given such a text, a commentator would certainly be justified in drawing on other sources to illuminate Descartes’ ethical positions (such as the nature of happiness vis-à-vis the Stoics). Thus, although Descartes claims that he is breaking with the past, one still ought to explore the possibility that his ethics builds on, for example, the Aristotelian and Stoic ethics with which he was surely acquainted. Indeed, some commentators have argued that Descartes’ ethics is indebted to Stoicism (Kambouchner 2009, Rodis-Lewis 1957, Rutherford 2004 & 2014).
2. The Provisional Morality
Descartes’ first stab at ethics is in Discourse III. In the Discourse, Descartes lays out a method for conducting reason in order to acquire knowledge. This method requires an engagement with skepticism, which raises the question of how one should live in the world when one has yet to acquire knowledge and must suspend judgment about all dubitable matters. Perhaps to ward off the classic apraxia objection to skepticism, that is, the objection that one cannot engage in practical affairs if one is truly a skeptic (Marshall 2003), Descartes offers a “provisional morality” to help the temporary skeptic and seeker of knowledge still act in the world. Descartes writes:
Now, before starting to rebuild your house, it is not enough simply to pull it down, to make provision for materials and architects (or else train yourself in architecture), and to have carefully drawn up the plans; you must also provide yourself with some other place where you can live comfortably while building is in progress. Likewise, lest I should remain indecisive in my actions while reason obliged me to be so in my judgements, and in order to live as happily as I could during this time, I formed for myself a provisional moral code consisting of just three or four maxims, which I should like to tell you about. (Discourse III, AT VI: 22/CSM I: 122)
Notice that Descartes is ambiguous about whether the provisional morality consists of three or four maxims. There is some interpretive debate about this matter. We will discuss all four candidate maxims. Furthermore, we will bracket the issue of how to understand the provisional nature of this morality (see, for example, LeDoeuff 1989, Marshall 1998 & 2003, Morgan 1994, Shapiro 2008a). However, it should be noted that Descartes does refer to the provisional morality even in his later ethical writings, which suggests that the maxims are not entirely abandoned once skepticism is defeated (see Letter to Princess Elizabeth 4 August 1645, AT IV: 265–6/CSMK: 257–8).
a. The First Maxim
Maxim One can be divided into three claims:
M1a: The moral agent ought to obey the laws and customs of her country.
M1b: The moral agent ought to follow her religion.
M1c: In all other matters not addressed by M1a and M1b, the moral agent ought to follow the most commonly accepted and sensible opinions of her community. (Discourse III, AT VI: 23/CSM I: 122)
Descartes claims that during his skeptical period he found his own “opinions worthless” (Ibid.). In the absence of genuine moral knowledge to guide our practical actions, Descartes claims that the best we can do is conform to the moral guidelines offered in the laws and customs of one’s country, one’s religion, and the moderate and sensible opinions of one’s community. As Vance Morgan notes, M1 is strikingly anti-Cartesian, as it calls the moral agent to an “unreflective social conformism” (1994: 45). But as we will see below, M1 does not seem to be entirely abandoned in Descartes’ later ethical writings.
b. The Second Maxim
Maxim Two states:
M2: The moral agent ought to be firm and decisive in her actions, and to follow even doubtful opinions once they are adopted, with no less constancy than if they were certain.
The motivation for M2 seems to be the avoidance of irresolution, which Descartes later characterizes as an anxiety of the soul in the face of uncertainty that prevents or delays the moral agent from taking up a course of action (Passions III.170, AT XI: 459–60/CSM I: 390–1). Descartes writes that, since “in everyday life we must often act without delay, it is a most certain truth that when it is not in our power to discern the truest opinions, we must follow the most probable” (Discourse III, AT VI: 25/CSM I: 123). To illustrate the usefulness of M2, Descartes discusses a traveler lost in a forest who does not know how to get out of the woods. Descartes’ advice is that the traveler should pick a route, even if the choice is uncertain, and resolutely stick to it:
Keep walking as straight as he can in one direction, never changing it for slight reasons even if mere chance made him choose it in the first place; for in this way, even if he does not go exactly where he wishes, he will at least end up in a place where he is likely to be better off than in the middle of a forest. (Ibid.)
Descartes claims that following M2 prevents the moral agent from undergoing regret and remorse. This is important because regret and remorse prevent the moral agent from attaining happiness. The notion of sticking firmly and constantly to one’s moral judgments, even if they are not certain, is a recurring theme in Descartes’ later ethical writings (it is indeed constitutive of his virtue theory).
c. The Third Maxim
Maxim Three states:
M3: The moral agent ought to master herself rather than fortune, and to change her desires rather than the order of the world.
The justification for M3 is that “nothing lies entirely within our power except our thoughts” (Ibid.). Knowing this truth will lead the moral agent to orient her desires properly, because she will have accepted that “after doing our best in dealing with matters external to us, whatever we fail to achieve is absolutely impossible so far as we are concerned” (Ibid.). To be clear, the claim is that we should consider “all external goods as equally beyond our power” (Discourse III, AT VI: 26/CSM I: 124). Unsurprisingly, Descartes claims that it takes much work to accept M3: “it takes long practice and repeated meditation to become accustomed to seeing everything in this light” (Ibid.). The claim that only our thoughts lie within our power—and that knowing this is a key to regulating the passions—is another recurring theme in Descartes’ ethical writings, particularly in his theory of the passions and generosity (see section 7).
d. The Fourth Maxim
When reading Discourse III, it seems that the provisional morality ends after the discussion of M3. Indeed, in some texts Descartes refers to “three rules of morality” (see, for instance, Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 257). However, Descartes does seem to tack on a final Fourth Maxim:
M4: The moral agent ought to devote her life to cultivating reason and acquiring knowledge of the truth, according to the method outlined in the Discourse.
M4 has a different status than the other three maxims: it is the “sole basis of the foregoing three maxims” (Discourse III, AT VI: 27/CSM I: 124). It seems that M4 is not truly a maxim of morality, however, but a re-articulation of Descartes’ commitment to acquiring genuine knowledge. The moral agent must not get stuck in skepticism, resorting to a life of provisional morality, but rather must continue and persist in her search for knowledge of the truth (with the hope of establishing a well-founded morality—perhaps the “perfect moral system” of the tree of philosophy).
3. Cartesian Virtue
We now turn to Descartes’ later ethical writings (ca. 1644–1649). Arguably, the centerpiece of these writings is a theory of (moral) virtue. Though formulated in different ways, Descartes offers a consistent definition of virtue throughout his later ethical writings, namely, that virtue consists in the firm and constant resolution to use the will well (see Letter to Princess Elizabeth 18 August 1645, AT IV: 277/CSMK: 262; Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 258; Letter to Princess Elizabeth 6 October 1645, AT IV: 305/CSMK: 268; Passions II.148, AT XI: 442/CSM I: 382; Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325). This resolution to use the will well has two main features: (1) the firm and constant resolution to arrive at one’s best moral judgments, and (2) the firm and constant resolution to carry out these best moral judgments to the best of one’s abilities. It is important to note that the scope of the discussion here concerns moral virtue, not epistemic virtue (for an account of epistemic virtue see Davies 2001, Shapiro 2013, Sosa 2012).
a. The Unity of the Virtues
Descartes claims that his definition of virtue is wholly novel, and that he is breaking off from Scholastic and ancient definitions of virtue:
He should have a firm and constant resolution to carry out whatever reason recommends without being diverted by his passions or appetites. Virtue, I believe, consists precisely in sticking firmly to this resolution; though I do not know that anyone has ever so described it. Instead, they have divided it into different species to which they have given various names, because of the various objects to which it applies. (Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 258)
It is unclear what conception of virtue Descartes is criticizing here, but it is not far-fetched that he has in mind Aristotle’s account of virtue (arete) in the Nicomachean Ethics. For, according to Aristotle, there are a number of virtues—such as courage, temperance, and wisdom—each of which is a distinct characterological trait that consists in a mean between an excess and a deficiency and is guided by practical wisdom (phronesis) (Nicomachean Ethics II, 1106b–1107a). For example, the virtue of courage is the mean between rashness and cowardice. Although Descartes is willing to use a similar conceptual apparatus for distinguishing different virtues—for example, he will talk extensively about a “distinct” virtue of generosity—at bottom he thinks that there are no strict metaphysical divisions between the virtues. All of the so-called virtues have one and the same nature—they are reducible to the resolution to use the will well. As he tells Queen Christina:
I do not see that it is possible to dispose it [that is, the will] better than by a firm and constant resolution to carry out to the letter all the things which one judges to be best, and to employ all the powers of one’s mind in finding out what these are. This by itself constitutes all the virtues. (Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325)
Similarly, he writes in the Dedicatory Letter to Princess Elizabeth for the Principles:
The pure and genuine virtues, which proceed solely from knowledge of what is right, all have one and the same nature and are included under the single term ‘wisdom’. For whoever possesses the firm and powerful resolve always to use his reasoning powers correctly, as far as he can, and to carry out whatever he knows best, is truly wise, so far as his nature permits. And simply because of this, he will possess justice, courage, temperance, and all the other virtues; but they will be interlinked in such a way that no one virtue stands out among the others. (AT VIIIA: 2–3/CSM I: 191)
In these passages, Descartes is espousing a unique version of the unity of the virtues thesis. An Aristotelian unity of the virtues entails a reciprocity or inseparability among distinct virtues (Nicomachean Ethics VI, 1144b–1145a). According to Descartes, however, there is a unity of the “virtues” because, strictly speaking, there is only one virtue, namely, the resolution to use the will well (Alanen and Svensson 2007: fn. 8; Naaman-Zauderer 2010: 179–181). When the virtues are unified in this way, they exemplify wisdom.
b. Virtue qua Perfection of the Will
But what exactly is the nature of this resolution to use the will well? And how does one go about exercising this virtue? There are three main issues that need to be addressed in order to unpack Cartesian virtue. The first and foundational issue is Descartes’ rationale for locating virtue in a perfection of the will (section 3b). The second concerns the distinct epistemic requirements for virtue (section 4a). The third concerns Descartes’ characterization of virtue as a resolution of the will (section 5c).
According to Descartes, virtue is our “supreme good” (Letter to Princess Elizabeth 6 October 1645, AT IV: 305/CSMK: 268, Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325; see also Svensson 2019b). One avenue for tackling this claim about the supreme good is to think about what we can be legitimately praised or blamed for (Parvizian 2016). According to Descartes, virtue is certainly something that we can be praised for, and vice is certainly something that we can be blamed for. Now, in order to be legitimately praised or blamed for some property, φ, φ must be fully within our control. If φ is not fully within our control, then we cannot truly be praised or blamed for possessing φ. For example, Descartes cannot be praised or blamed for being French. This is a circumstantial fact about Descartes that is wholly outside of his control. However, Descartes can be praised or blamed for his choice to join the army of Prince Maurice of Nassau, for this is presumably a decision within his control, and it is either virtuous or vicious.
But what does it mean for φ to be within our control? According to Descartes, control needs to be understood vis-à-vis the freedom to dispose of our volitions. The will is the source of our power and control—it is through the will that we affirm and deny perceptions at the cognitive level, and correspondingly act at the bodily level (Fourth Meditation, AT VII: 57/CSM II: 40). We have control over φ insofar as φ is fully under the purview of the will. As such, the reason why our supreme good lies in our will—or more specifically a virtuous use of our will—is that our will is the only thing we truly have control over. At bottom, everything else—our bodies, historical circumstances, and even intellectual capacities—is beyond the scope of our finite power.
This is not to deny that things outside of our control might be perfections or goods. Descartes clearly recognizes that wealth, beauty, intelligence and so forth are perfections, and desirable ones (Passions III.158, AT XI: 449/CSM I: 386). They can certainly contribute, in some sense, to well-being (see section 9). However, they are neither necessary nor sufficient for virtue and happiness. Descartes certainly allows for the possibility of the virtuous moral agent who is tortured “on the rack.” What matters is how we respond to the contingencies of the world, and how we incorporate contingent perfections into our life. Such responses are, of course, dependent on the will. Thus, it is through the will alone that we attain virtue.
As such, the will is also the only legitimate source of our personal value, and thus justified self-esteem. Indeed, Descartes claims that it is through the will alone that we bear any similarity to God. For it is through the will that we can become masters of ourselves, just as God is a master of Himself (Passions III.152, AT XI: 445/CSM I: 384).
4. The Epistemic Requirements of Virtue
Although virtue is located in a perfection of the will, the intellect does have a role in Cartesian virtue. One cannot use the will well in practical affairs unless the will is guided by the right kinds of perceptions—leaving open for now what we mean by ‘right’ (Morgan 1994: 113–128; Shapiro 2008: 456–7; Williston 2003: 308–310). Nonetheless, Descartes clearly claims that the virtuous will must be guided by the intellect:
Virtue unenlightened by the intellect is false: that is to say, the will and resolution to do well can carry us to evil courses, if we think them good; and in such a case the contentment which virtue brings is not solid. (Letter to Princess Elizabeth 4 August 1645, AT IV: 267/CSMK: 258)
More specifically, Descartes claims that we need knowledge of the truth to exercise virtue. However, Descartes recognizes that this knowledge cannot be comprehensive given our limited intellectual capacities:
It is true that we lack the infinite knowledge which would be necessary for a perfect acquaintance with all the goods between which we have to choose in the various situations of our lives. We must, I think, be contented with a modest knowledge of the most necessary truths. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)
This section tackles the issue of how to judge well based on knowledge of the truth, in other words, how to arrive at our best moral or practical judgments. Notice that this seems to mark a departure from the provisional morality of the Discourse, in particular M1, where our moral judgments are not guided by any knowledge, given the background engagement with skepticism.
a. Knowledge of the Truth
According to Descartes, in order to judge (and act) well we need to have knowledge of the truth in both a theoretical and practical sense. That is, we must assent to a certain set of truths at a theoretical level. However, in order to judge well in a moral situation, we need to have these truths ready at hand, that is, we need practical habits of belief.
i. Theoretical Knowledge of the Truth
In a letter to Princess Elizabeth, Descartes identifies six truths that we need in order to judge well in moral situations. Four of these truths are general in that they apply to all of our actions, and two of these truths are particular in that they are applicable to specific moral situations. Let us first examine what these truths are at a theoretical level, before turning to how these truths must be transformed into practical habits of belief.
Broadly put, the four general truths are:
T1: The existence of God
T2: The real distinction between mind and body
T3: The immensity of the universe
T4: The interconnectedness of the parts of the universe
The two particular truths are:
T5: The passions can misguide us.
T6: One can follow customary moral opinions when it is reasonable to do so.
On T1: Descartes claims that we must know that “there is a God on whom all things depend, whose perfections are infinite, whose power is immense and whose decrees are infallible” (Letter to Princess Elizabeth 15 September 1645, AT IV: 291/CSMK: 265). Knowing T1 is necessary for virtue, because it “teaches us to accept calmly all the things which happen to us as expressly sent by God,” and it engenders love for God in the moral agent (Ibid.).
On T2: Descartes says that we must know the nature of the soul, “that it subsists apart from the body, and is much nobler than the body, and that it is capable of enjoying countless satisfactions not to be found in this life” (Letter to Princess Elizabeth 15 September 1645, AT IV: 292/CSMK: 265–6). Knowing T2 is necessary for virtue because it prevents the moral agent from fearing death and helps her prioritize her intellectual pursuits over her bodily pursuits.
On T3: Descartes says that we must have a “vast idea of the extent of the universe” (Ibid.). He says this vast idea of the universe is conveyed in Principles III, and that it would be useful for moral agents to have read at least that part of his physics. Having knowledge of physics is necessary for virtue, because it prevents the moral agent from thinking that the universe was only created for her, thus wishing to “belong to God’s council” (Ibid.). It is important to note that this is one of the few places where Descartes draws out any connection between his physics and ethics, although he claims in a number of places that there are fundamental connections between these two fields (Letter to Chanut 15 June 1646, AT IV: 441/CSMK: 289, Letter to Chanut 26 February 1649, AT V: 290–1/CSMK: 368).
On T4: Descartes says that “though each of us is a person distinct from others, whose interests are accordingly in some way different from those of the rest of the world, we ought still to think that none of us could subsist alone and that each one of us is really one of the many parts of the universe” (Letter to Princess Elizabeth 15 September 1645, AT IV: 293/CSMK: 266). Knowing T4 is necessary for virtue, because it helps engender an other-regarding character—perhaps love and generosity—that is particularly relevant to Cartesian virtue. Indeed, virtue requires that the “interests of the whole, of which each of us is a part, must always be preferred to those of our own particular person” (Ibid.).
On T5: Descartes seems to claim that the passions exaggerate the value of the goods they represent (and thus are misrepresentational), and that the passions correspondingly impel us to the pleasures of the body. Knowing T5 is necessary for virtue, because it helps us suspend our judgments when we are in the throes of the passions, so that we are not “deceived by the false appearances of the goods of this world” (Letter to Princess Elizabeth 15 September 1645, AT IV: 294–5/CSMK: 267).
On T6: Descartes claims that “one must also examine minutely all the customs of one’s place of abode to see how far they should be followed” (Ibid.). T6 is necessary for virtue because “though we cannot have demonstrations of everything, still we must take sides, and in matters of custom embrace the opinions that seem the most probable, so that we may never be irresolute when we need to act” (Ibid.). T6 seems to be a re-articulation of M1 in the provisional morality, specifically M1a above.
ii. Practical Knowledge of the Truth
T1–T6 must be known at a theoretical level. However, Descartes claims that we also need to transform T1–T6 into habits of belief:
Besides knowledge of the truth, practice is also required if one is to be always disposed to judge well. We cannot continually pay attention to the same thing; and so, however clear and evident the reasons may have been that convinced us of some truth in the past, we can later be turned away from believing it by some false appearances unless we have so imprinted it on our mind by long and frequent meditation that it has become a settled disposition within us. In this sense the Scholastics are right when they say that virtues are habits; for in fact our failings are rarely due to lack of theoretical knowledge of what we should do, but to lack of practical knowledge—that is, lack of a firm habit of belief. (Letter to Princess Elizabeth 15 September 1645, AT IV: 295–6/CSMK: 267)
The idea seems to be this: in order to actually judge well in a moral situation, T1–T6 need to be ready at hand. We need to bring them forth before the mind swiftly and efficiently in order to respond properly in a moral situation. In order to do that, we must meditate on T1–T6 until they become habits of belief.
b. Intellect, Will, and Degrees of Virtue
There seems to be an inconsistency between Descartes’ theory of virtue and his account of the epistemic requirements for virtue. Descartes is committed to the following two claims:
(1) Theoretical and practical knowledge of T1–T6 is a necessary condition for virtue.
(2) One can be virtuous even if one does not have theoretical and practical knowledge of T1–T6.
We have seen that Descartes is committed to claim (1). But why is he committed to claim (2)? Consider the following passage from the Dedicatory Letter to Elizabeth:
Now there are two prerequisites for the kind of wisdom [that is, the unity of the virtues] just described, namely the perception of the intellect and the disposition of the will. But whereas what depends on the will is within the capacity of everyone, there are some people who possess far sharper intellectual vision than others. Those who are by nature somewhat backward intellectually should make a firm and faithful resolution to do their utmost to acquire knowledge of what is right, and always to pursue what they judge to be right; this should suffice to enable them, despite their ignorance on many points, to achieve wisdom according to their lights and thus to find great favour with God. (AT VIIIA: 3/CSM I: 191)
Descartes clearly commits himself to (2) in this passage. But in the continuation of this passage he offers a way to reconcile (1) and (2):
Nevertheless, they will be left far behind by those who possess not merely a very firm resolve to act rightly but also the sharpest intelligence combined with the utmost zeal for acquiring knowledge of the truth.
According to Descartes, virtue, at its essence, is a property of the will, not the intellect. Virtue consists in the firm and constant resolution to use the will well, which requires determining one’s best practical judgments and executing them to the best of one’s abilities. However, virtue comes in degrees, in accordance with what these best practical judgments are based on. The more knowledge one has (essentially, the more perfected one’s intellect is), the higher one’s degree of virtue.
In its ideal form, virtue presupposes, at a minimum, theoretical and practical knowledge of T1–T6 (and arguably one’s virtue would be improved by acquiring further relevant knowledge). But Descartes acknowledges that not everyone has the capacity, or is in a position, to acquire knowledge of the truth (for instance, the peasant). Nonetheless, Descartes does not want to exclude such moral agents from acquiring virtue. Virtue is not just for the philosopher. If such moral agents resolve to acquire as much knowledge as they can, and have a firm and constant resolution to use their will well (according to that knowledge), then they will secure virtue (even if they have the wrong metaphysics, epistemology, natural philosophy, or the like). Claims (1) and (2) are rendered consistent, then, once they are properly revised:
(1)* Theoretical and practical knowledge of T1–T6 is a necessary condition for ideal virtue.
(2)* One can be non-ideally virtuous while lacking full theoretical and practical knowledge of T1–T6, so long as one does one’s best to acquire as much relevant knowledge as one can and has the firm and constant resolution to use one’s will well accordingly.
It is clear that Descartes is usually talking about an ideal form of virtue whenever he uses the term ‘virtue.’ When he wants to highlight discussion of non-ideal forms of virtue, he is usually clear about his target (see, for example, Dedicatory Letter to Elizabeth, AT VIIIA: 2/CSM I: 190–1). In what follows, then, the reader should assume that the virtue being discussed is of the ideal variety, that is, that it is based on some perfection of the intellect. As flagged earlier, there is disagreement, of course, about how much knowledge one must have to acquire certain virtues (for example, generosity).
5. Moral Epistemology
Does Descartes have a distinct moral epistemology? In the epistemology of the Meditations, Descartes distinguishes three different kinds of epistemic states: scientia/perfecte scire (perfect knowledge), cognitio (awareness), and persuasio (conviction or opinion). Broadly construed, the distinction between these three epistemic states is as follows. Scientia is an indefeasible judgment (it is true and absolutely certain), whereas cognitio and persuasio are both defeasible judgments. Nonetheless, cognitio is of a higher status than persuasio because there is—to some degree—better justification for cognitio than persuasio. Persuasio is mere opinion or belief, whereas cognitio is an opinion or awareness backed by some legitimate justification. For example, the atheist geometer can have cognitio of the Pythagorean theorem, and can justify that cognitio with a geometrical proof. However, this cognitio fails to achieve the status of scientia because the atheist geometer is unaware of God, and thus does not know the Truth Rule, namely, that her clear and distinct perceptions are true because God is not a deceiver (Second Replies, AT VII: 141/CSM II: 101; Third Meditation, AT VII: 35/CSM II: 24; Fourth Meditation, AT VII: 60–1/CSM II: 41).
a. The Contemplation of Truth vs. The Conduct of Life
There is an important question that must be raised about the epistemic status of our best moral judgments. In what sense is a “best moral judgment” the best? That is, is a best moral judgment the best because it amounts to scientia, or does it fall short—that is, is it the best cognitio or persuasio? In the Meditations, where Descartes is engaged in a sustained hyperbolic doubt, he identifies two jointly necessary and sufficient conditions for knowledge in the strict sense, that is, scientia. On the standard interpretation, a judgment amounts to scientia when it is both true and absolutely certain (Principles I.45, AT VIIIA: 21–22/CSM I: 207). A judgment can meet the conditions of truth and absolute certainty when it is grounded in divinely guaranteed clear and distinct perceptions. Though the details are tricky, it is ultimately clear and distinct perceptions that make scientia indefeasible, because the intellect and its clear and distinct perceptions are epistemically guaranteed, in some sense, by God’s benevolence and non-deceptive nature. According to Descartes, however, the epistemic standards that we must abide by in theoretical matters or “the contemplation of the truth” should not be extended to practical matters or the “conduct of life.” As he writes in the Second Replies,
As far as the conduct of life is concerned, I am very far from thinking that we should assent only to what is clearly perceived. On the contrary, I do not think that we should always wait even for probable truths; from time to time we will have to choose one of many alternatives about which we have no knowledge, and once we have made our choice, so long as no reasons against it can be produced, we must stick to it as firmly as if it had been chosen for transparently clear reasons. (AT VII: 149/CSM II: 106)
This passage tells us that our best practical judgments cannot be the best in virtue of meeting the strict standards for scientia. This is because of the factors that distinguish the contemplation of truth from the conduct of life. First and foremost, unlike the contemplation of truth, where the goal is to arrive at a true and absolutely certain theoretical judgment that amounts to knowledge, the conduct of life is concerned with arriving at a best practical moral judgment for the sake of carrying out a course of action. Given that in morality we are ultimately concerned with action in the conduct of life, we must keep in mind that there is a temporally indexed window of opportunity to act in a moral situation (Letter to Princess Elizabeth 6 October 1645, AT IV: 307/CSMK: 269). If a moral agent tries to obtain clear and distinct perceptions in moral deliberation—something that can take weeks or even months according to Descartes (Second Replies, AT VII: 131/CSM II: 94; Seventh Replies, AT VII: 506/CSM II: 344)—the opportunity to act will pass, and thus the moral agent will have failed to use her will well. In short, seeking clear and distinct perceptions in a moral situation is not advisable.
Second, and perhaps more importantly, it seems that we cannot attain clear and distinct perceptions in the conduct of life. Although our best moral judgments are guided by knowledge of the truth (which is presumably based on clear and distinct perceptions), we also base our best moral judgments, in part, on perceptions of the relevant features of the moral situation. These include information about other mind-body composites, bodies, and the consequences of our action. For example, in the famous trolley problem, the moral agent has to consider her perceptions of the people tied to the track, the train and the rails, and the possible consequences that follow from directing the train one way or another at the fork in the tracks. Such information about the other mind-body composites and bodies in this moral situation is ultimately provided by sensations. And sensations, according to Descartes, provide obscure and confused content to the mind about the nature of bodies (Principles I.45, AT VIIIA: 21–2/CSM I: 207–8, Principles I.66–68, AT VIIIA: 32–33/CSM I: 126–7). As for predicting the consequences of an action, this is done through the imagination, for these consequences do not yet exist. I need to represent to myself, through imagination, the potential consequences my action will produce in the long run. And such fictitious representations can only be obscure and confused. In short, given the imperfect kinds of perceptions that are involved in moral deliberation, our best moral judgments can never be fully grounded in clear and distinct perceptions.
b. Moral Certainty and Moral Skepticism
These perceptual facts help explain why Descartes claims that our best moral judgments can achieve only moral certainty, that is,
[C]ertainty which is sufficient to regulate our behaviour, or which measures up to certainty we have on matters relating to the conduct of life which we never doubt, though we know that it is possible, absolutely speaking, that they may be false. (Principles IV.204, AT VIIIA: 327/CSM I: 289, fn. 1; see also Schachter 2005, Voss 1993)
Given that even our best moral judgments can achieve only moral certainty, Descartes seems to be claiming that we cannot have first-order moral knowledge. That is, when I make a moral judgment of the form “I ought to φ in moral situation x,” that moral judgment will never amount to knowledge in the strict sense. Nonetheless, morally certain moral judgments are certainly not persuasio, as they are backed by some justification. Thus, we should regard them as attaining the status of cognitio—just shy of scientia (but for different reasons than the cognitio of the atheist geometer, presuming that the moral agent has completed the Meditations and knows that her faculties—in normal circumstances—are reliable).
However, it is important to note that Descartes is not claiming that first-order moral knowledge is impossible tout court. That is, Descartes is not a non-cognitivist about moral judgments, claiming that moral judgments are neither true nor false. Cartesian moral judgments are truth-evaluable; that is, they are capable of being true or false. Descartes, then, is a cognitivist about moral judgments. As Descartes says, we must recognize that although our best practical moral judgments are morally certain, they may still, “absolutely speaking,” be false. If Descartes is a moral skeptic of some stripe, he should be understood as making a plausible claim about our limitations as finite minds. A finite mind, given its limited and imperfect perceptions, cannot attain first-order moral knowledge because it cannot ultimately know whether its first-order moral judgments are true or false. However, an infinite mind—God—surely knows whether the first-order moral judgments of finite minds are true or false. First-order moral knowledge is possible—finite minds just cannot attain it.
One final remark. One might resist the standard interpretation that we cannot have first-order moral knowledge, by claiming that Descartes is not a moral skeptic at all, because the standards for knowledge shift from the contemplation of truth to the conduct of life. That is, Descartes might be an epistemic contextualist. Epistemic contextualism is the claim that the meaning of the term ‘knows’ shifts depending on the context, in the same way the meaning of the indexical ‘here’ shifts depending on the context. If Jones utters the sentence ‘Brown is here,’ the meaning of the sentence will shift depending on where Jones is when he utters it (Rysiew 2007). This kind of contextualist view has been suggested in passing by Lex Newman (2016), who argues that Descartes’ epistemic standards shift depending on whether he is doing metaphysics or science (Principles IV.205–6, AT VIIIA: 327–9/CSM I: 289–291). Although Newman does not extend this contextualist interpretation to Descartes’ moral epistemology, it would take only a few steps to do so. Nonetheless, it strains credulity to think that first-order moral judgments could ever meet the standards of scientia in the Meditations.
c. Virtue qua Resolution
We can now clarify why Descartes characterizes virtue in terms of a resolution. The conduct of life presents us with a unique epistemic challenge that does not arise in the contemplation of truth. That is: (1) we have a short window of opportunity to arrive at a moral judgment and then act, and (2) the perceptions that in part serve as the basis for our judgments are ultimately obscure and confused. These two features can give rise to irresolution. Irresolution, according to Descartes, is a kind of anxiety which causes a person to hold back from performing an action, creating a cognitive space for the person to make a choice (Passions III.170, AT XI: 459/CSM I: 390). As such, irresolution can be a good cognitive trait. However, irresolution becomes problematic when one has “too great a desire to do well” (Passions III.170, AT XI: 460/CSM I: 390). If one wants to arrive at the most perfect moral judgment (say, by grounding one’s moral judgments in clear and distinct perceptions), one will ultimately fall into an excessive kind of irresolution which prevents one from judging and acting at all. Given the nature of moral situations and what is at stake within them (essentially, how we ought to treat other people), the conduct of life is ripe for producing this excessive kind of irresolution. After all, we arguably do want perfection in our moral conduct.
This is why Descartes says virtue involves a resolution: we need to establish a firm and constant resolve to arrive at our best moral judgments and to carry them out, even though we realize that these judgments are only morally certain and can be false. So long as we have this firm resolve (which is of course guided by knowledge of the truth), we can be assured that we have done our duty, even if we later determine that what we did was wrong. For we can control only our will—how our action plays out in the real world is beyond our control, and there is no way we can guarantee that we will always produce the right consequences. As Descartes tells Princess Elizabeth:
There is nothing to repent of when we have done what we judged best at the time when we had to decide to act, even though later, thinking it over at our leisure, we judge that we made a mistake. There would be more ground for repentance if we had acted against our conscience, even though we realized afterwards that we had done better than we thought. For we are only responsible for our thoughts, and it does not belong to human nature to be omniscient, or always to judge as well on the spur of the moment as when there is plenty of time to deliberate.
(Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269; see also Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325)
Consistent with Descartes’ grounding of virtue in a perfection of the will, Descartes’ view of moral responsibility is that we are responsible only for what is truly under our control—that is, our thoughts (or more specifically our volitions). Notice that the seeds of this full analysis of virtue qua resolution are present in the provisional morality, namely, in M2.
6. The Passions
Strictly speaking, Descartes’ Passions of the Soul is not an ethical treatise. As Descartes writes, “my intention was to explain the passions only as a natural philosopher, and not as a rhetorician or even as a moral philosopher” (Prefatory Letters, AT XI: 326/CSM I: 327). Nonetheless, the passions have a significant status within Descartes’ ethics. At the end of Passions, Descartes writes: “it is on the passions alone that all the good and evil of this life depends” and “the chief use of wisdom lies in its teaching us to be masters of our passions and to control them with such skill that the evils which they cause are quite bearable, and even become a source of joy” (Passions III.212, AT XI: 488/CSM I: 404). Thus, it is important to discuss Cartesian passions in order to understand Descartes’ ethics. We will consider (1) the function of the passions and (2) whether the passions are merely motivational or representational states.
a. The Definition of the Passions
Descartes identifies a general sense of the term ‘passion,’ which covers all states of the soul that are not, in any way, active. That is, passions are passive and thus are perceptions: “all our perceptions, both those we refer to objects outside us and those we refer to the various states of our body, are indeed passions with respect to our soul, so long as we use the term ‘passion’ in its most general sense” (Passions I.25, AT XI: 347–8/CSM I: 337). Thus, a general use of the term ‘passion’ would include the following kinds of perceptions: smells, sounds, and colors, which we refer to objects outside us, as well as hunger, pain, and thirst, which we refer to our body (Passions I.29, AT XI: 350/CSM I: 339). However, the narrower and stricter sense of ‘passion’ examined in the Passions covers “those perceptions, sensations, or emotions of the soul which we refer particularly to it, and which are caused, maintained and strengthened by some movement of the spirits” (Passions I.27, AT XI: 349/CSM I: 338–9). Descartes identifies six primitive passions, out of which all of the other more complex passions are composed. These are wonder, love, hatred, joy, sadness, and desire. Each primitive and complex passion is distinguished from the others in terms of its physiological and causal basis (roughly, the animal spirits which give rise to it) and its cognitive nature and specific function (Passions II.51–2, AT XI: 371–2/CSM I: 349).
b. The Function of the Passions
Given Descartes’ general resistance to teleology, there is much to be said about how to understand the nature of Cartesian functions in general, and specifically the function of the passions (Brown 2012). Setting aside the issue of reconciling any metaphysical inconsistencies, it is clear that Descartes does think that the passions have some kind of function, and we must be mindful of this in interpreting Descartes.
In Passions II.52, Descartes identifies the general function of the passions:
I observe, moreover, that the objects which stimulate the senses do not excite different passions in us because of differences in the objects, but only because of the various ways in which they may harm or benefit us or in general have importance for us. The function of all the passions consists solely in this, that they dispose our soul to want the things which nature deems useful for us, and to persist in this volition; and the same agitation of the spirits which normally causes the passions also disposes the body to make movements which help us to attain these things. (AT XI: 372/CSM I: 349, cf. Passions I.40, AT XI: 359/CSM I: 343)
Descartes claims that the general function of the passions is to dispose the soul to want the things which nature deems useful for us, and to also dispose the body to move in the appropriate ways so as to attain those things. Put more simply, the passions are designed to preserve the mind-body composite. How exactly that plays out will depend on the kind of passion under consideration. As Descartes writes in Passions I.40, fear disposes the soul to want to flee (a bodily action) and courage disposes the soul to want to fight (a bodily action as well).
It is important to note that the general function assigned to the passions is similar to, but slightly different from, the one assigned to sensations in the Sixth Meditation. In the context of his sensory theodicy, Descartes writes: “the proper purpose of the sensory perceptions given me by nature is simply to inform the mind of what is beneficial or harmful for the composite of which the mind is a part” (Sixth Meditation, AT VII: 83/CSM II: 57). Supposing that Descartes does not have passions in mind in the Sixth Meditation, and given Descartes’ strict distinction between sensations and passions in the Passions, it seems that passions and sensations have different functions. The function of a passion is to dispose the soul to want what is beneficial for it, while the function of a sensation is to inform the soul of what is beneficial or harmful for it. This would suggest that sensations are perhaps representational states (De Rosa 2007, Gottlieb & Parvizian 2018, Hatfield 2013, Simmons 1999), whereas the passions are merely motivational.
But matters are more complicated. A vexing issue for commentators has been how the passions fulfill their function of disposing the soul to want certain things. It is clear that the passions are motivational. But the interpretive issue for commentators has been whether the passions are merely motivational (and thus non-intentional, affective states), or whether they are, to some degree, representational as well. Settling this issue is important, because it helps clarify whether the passions ought to serve as guides to our practical behavior.
c. Whether the Passions are Representational or Motivational
The standard interpretation is that the passions are representational in addition to being motivational (Alanen 2003a & 2003b, Brown 2006, Clarke 2005, Franco 2015). Sometimes commentators describe the passions as being informative, but the best way to cash this out given Descartes’ philosophy of mind is in terms of representation. There are broad reasons for claiming that the passions are representational. If one thinks that the passions are a type of idea, then it seems that they must be representational, for Descartes claims in the Third Meditation that all ideas have intentionality: “there can be no ideas which are not as it were of things” (AT VII: 44/CSM II: 30). Moreover, Descartes seems to make a representationalist claim about the passions in T5: “all our passions represent to us the goods to whose pursuit they impel us as being much greater than they really are” (Letter to Princess Elizabeth 15 September 1645, AT IV: 294–295/CSMK: 267; see also Passions II. 90, AT XI: 395/CSM I: 360). Strictly speaking, the claim here seems to be that the passions have representational content—they represent goods for the mind-body composite—but they are ultimately misrepresentational because they exaggerate the value of those goods. However, it is claimed that the passions can be a guide to our survival and preservation once they are regulated by reason. According to John Marshall, once the passions are regulated they can become accurate representations of goods (1998: 119–125). As such, the passions can be reliable guides to our survival and preservation under the right circumstances.
Alternatively, it has been argued that, despite the textual evidence, Descartes’ considered view is that the passions are merely motivational states (Greenberg 2007, Brassfield 2012). Shoshana Brassfield has argued that the passions are motivational states which serve to strengthen and prolong certain thoughts which are good for the soul to cognitively sustain. When Descartes speaks of the passions representing, we need to re-read him as actually saying one of two things. First, he may be clarifying a representational content (distinct from a passion) that a particular passion strengthens and prolongs. Second, he may be discussing how the passions lead us to exaggerate the value of objects in our judgments, by prolonging and strengthening certain judgments, which thus make us mistakenly affirm that a particular object is more valuable than it actually is.
The upshot of this type of motivational reading of the passions is that the passions are not intrinsic guides to our survival and preservation, and that we should suspend judgment about how to act when we are moved by the passions. It is reason alone that is the guide to what is good and beneficial for the mind-body composite. The passions are, in some sense, beneficial when they are regulated by reason (and thus lead, for example, to proper experiences of joy and in turn happiness), but they are not beneficial when reason is guided by the passions.
7. Generosity
According to Descartes, generosity—a species of wonder—is both a passion and a virtue (Passions III.153, AT XI: 445–6/CSM I: 384, Passions III.161, AT XI: 453–4/CSM I: 387–8). Generosity transitions from a passion to a virtue once the passion becomes a habit of the soul (Passions III.161, AT XI: 453–4/CSM I: 387–8). Having already discussed passions, we will focus on generosity qua virtue. Generosity is the chief virtue in Descartes’ ethics because it is the “key to all the virtues and a general remedy for every disorder of the passions” (Passions III.161, AT XI: 454/CSM I: 388). Descartes defines generosity as that:
Which causes a person’s self-esteem to be as great as it may legitimately be, [and] has only two components. The first consists in his knowing that nothing truly belongs to him but this freedom to dispose his volitions, and that he ought to be praised or blamed for no other reason than his using this freedom well or badly. The second consists in his feeling within himself a firm and constant resolution to use it well—that is, never to lack the will to undertake and carry out whatever he judges to be best. To do that is to pursue virtue in a perfect manner. (Passions III.153, AT XI: 445–6/CSM I: 384)
Generosity has two components. The first, broadly construed, consists in the knowledge that the only thing that truly belongs to us is our free will. The second, broadly construed, consists in feeling the firm and constant resolution to use this free will well.
a. Component One: What Truly Belongs to Us
What is particularly noteworthy about Descartes’ definition of generosity is the first component. Descartes claims that the first component of generosity consists in knowledge of the following proposition: the only thing that truly belongs to me is my free will. This is certainly a strong claim, which goes beyond Descartes’ account of the role of the will in virtue, as discussed in section 3. Recall that we claimed that virtue is grounded in a perfection of the will, because only our volitions are under our control. Descartes is taking this a step further here: he now seems to be claiming that the only thing that truly belongs to us is free will. In claiming that free will “truly belongs” to us, Descartes seems to be making a new metaphysical claim about the status of free will within a finite mind. But how exactly should this claim be interpreted?
The locutions “belongs” and “truly belongs” are typically used by Descartes to make a metaphysical claim about the essence of a substance. For example, Descartes claims that his body does not truly belong to his essence (see Sixth Meditation, AT VII: 78/CSM II: 54). If Descartes is making a claim about our metaphysical essence in the definition of generosity, then this claim seems to be in clear contradiction with the account of our metaphysical essence in the Meditations and Principles. There, Descartes claims that he is essentially a thinking thing, res cogitans (Second Meditation, AT VII: 28/CSM II: 19). Although a body also belongs to him in some sense (Sixth Meditation, AT VII: 80/CSM II: 56; see also Chamberlain 2019), he can still draw a real distinction between his mind and body, which implies that what truly belongs to him is thought. Thought, in the Meditations, has a broad scope: in particular, it includes both the intellect and the will, as well as all of the different types of perceptions and volitions that fall under these two faculties (Principles I.9, AT VIIIA: 7–9/CSM I: 195). However, in the first component of generosity, Descartes seems to be claiming that there is a particular kind of thought that truly belongs to us, namely, our free will and its corresponding volitions. As such, the moral agent is not strictly speaking a res cogitans; rather, she is a willing thing, a res volens (Brown 2006: 25; Parvizian 2016).
Commentators have picked up on this difficulty in Descartes’ definition of generosity. There are two interpretations in the literature. One goes in for a metaphysical reading of ‘truly belongs,’ according to which Descartes is making a metaphysical claim about our true essence (Boehm 2014: 718–19). The other takes an evaluative reading, which approximates the standard account of why virtue is a perfection of the will: Descartes is making a claim about what is under our control—that is, our volitions—and thus what we can be truly praised and blamed for (Parvizian 2016). On this reading, there is a sense in which a human being is truly a res volens, but this does not metaphysically exclude the other properties of a res cogitans from its nature.
b. Acquiring Generosity
How is the chief virtue of generosity acquired? Descartes writes:
If we occupy ourselves frequently in considering the nature of free will and the many advantages which proceed from a firm resolution to make good use of it—while also considering, on the other hand, the many vain and useless cares which trouble ambitious people—we may arouse the passion of generosity in ourselves and then acquire the virtue. (Passions III. 161, AT XI: 453–4/CSM I: 388)
Here, Descartes claims that we need to reflect on two aspects of the will. First, we need to reflect on the very nature of the will. This includes facts such as its freedom, its being infinite in scope, and its different functional capacities. Second, we need to reflect on the advantages and disadvantages that come from using it well and poorly, respectively. This reflection on the advantages and disadvantages, interestingly, seems to require observation of other people’s behavior. As Descartes writes, we need to observe “the many vain and useless cares which trouble ambitious people,” which will help us appreciate the value and efficacy of the will. Some commentators have claimed that this process for acquiring generosity is exemplified in the Second or Fourth Meditation (Boehm 2014, Shapiro 2005), while others have argued that the meditator cannot engage in the process of acquiring generosity until after the Meditations have been completed (Parvizian 2016).
c. Generosity and the Regulation of the Passions
Throughout the Passions, Descartes indicates different ways to remedy the disorders of the passions. Descartes claims, for example, that the exercise of virtue is a remedy against the disorders of the passions, because then “his conscience cannot reproach him,” which allows the moral agent to be happy amidst “the most violent assaults of the passions” (Passions II.148, AT XI: 441–2/CSM I: 381–2). However, Descartes claims that generosity is a “general remedy for every disorder of the passions” (Passions III.161, AT XI: 454/CSM I: 388). Descartes writes:
They [generous people] have mastery over their desires, and over jealousy and envy, because everything they think sufficiently valuable to be worth pursuing is such that its acquisition depends solely on themselves; over hatred of other people, because they have esteem for everyone; over fear, because of the self-assurance which confidence in their own virtue gives them; and finally over anger, because they have little esteem for everything that depends on others, and so they never give their enemies any advantage by acknowledging that they are injured by them. (Passions III.156, AT XI: 447–8/CSM I: 385)
Generosity is a general remedy for the disorders of the passions because it ultimately leads the moral agent to a proper conception of what she ought to esteem. At bottom, the problem of the passions is that they lead us to misunderstand the value of various external objects, and to place our own self-esteem in them. Once we understand that the only property that is truly valuable is a virtuous will, then all the passions will be regulated.
d. The Other-Regarding Nature of Generosity
Although Descartes’ definition of generosity is certainly not standard, his account of how generosity manifests in the world does coincide with our standard intuitions about what generosity looks like. According to Descartes, the truly generous person is fundamentally other-regarding:
Those who are generous in this way are naturally led to do great deeds, and at the same time not to undertake anything of which they do not feel themselves capable. And because they esteem nothing more highly than doing good to others and disregarding their own self-interest, they are always perfectly courteous, gracious, and obliging to everyone. (Passions III.156, AT XI: 447–8/CSM I: 385)
The fundamental reason why the generous person is other-regarding is that she realizes that the very same thing that causes her own self-esteem, a virtuous will, is present or at least capable of being present in other people (Passions III.154, AT XI: 446–7/CSM I: 384). That is, since others have a free will, they are also worthy of value and esteem and thus must be treated in the best possible way. A fundamental task of the generous person is to help secure the conditions for other people to realize their potential to acquire a virtuous will.
8. Love
Love is a passion that has direct ethical implications for Descartes, for in its ideal form love is altruistic, other-regarding, and requires self-sacrifice. Descartes distinguishes between different kinds of love: affection, friendship, devotion, sensory love, and intellectual love (Passions II.83, AT XI: 389–90/CSM I: 357–8; Letter to Chanut 1 February 1647, AT IV: 600–617/CSMK: 305–314). This section examines love in general, which Descartes defines as follows:
Love is an emotion of the soul caused by a movement of the spirits, which impels the soul to join itself willingly to objects that appear to be agreeable to it. (Passions II.79, AT XI: 387/CSM I: 356)
In explicating what it means for the soul to join itself willingly to objects, Descartes writes:
In using the word ‘willingly’ I am not speaking of desire, which is a completely separate passion relating to the future. I mean rather the assent by which we consider ourselves henceforth as joined with what we love in such a manner that we imagine a whole, of which we take ourselves to be only one part, and the thing loved to be the other. (Passions II.80 AT XI: 387/CSM I: 356)
In short, love involves an expansion of the self. The lover regards herself and the beloved as two parts of a larger whole. But this raises an important question: is there a metaphysical basis for this part-whole relationship? Or is the part-whole relationship merely a product of the imagination and the will?
a. The Metaphysical Reading
One could try to provide a metaphysical basis for love by arguing that people are metaphysical parts of larger wholes. If so, then there would be metaphysical grounds “to justify a very expansive love” (Frierson 2002: 325). Indeed, Descartes seems to claim as much in his account of T4:
Though each of us is a person distinct from others, whose interests are accordingly in some way different from those of the rest of the world, we ought still to think that none of us could subsist alone and that each one of us is really one of the many parts of the universe, and more particularly a part of the earth, the state, the society and the family to which we belong by our domicile . . . and the interests of the whole, of which each of us is a part, must always be preferred to those of our own particular person. (Letter to Princess Elizabeth 15 September 1645, AT IV: 293/CSMK: 266)
Descartes uses suggestive metaphysical language here. Indeed, he claims that people cannot subsist without the other parts of the universe (which includes other people), and that we are parts of a larger whole. Given this metaphysical basis of love, then, the interests of the whole should be preferred to the interests of any given part.
There are interpretive problems for a metaphysical basis of love, however. For one, Descartes does not spell out this metaphysical relation in any detail. Moreover, such a metaphysical relation seems to fly in the face of Descartes’ account of the independent nature of substances and the real distinction between minds and bodies. To say that persons (mind-body composites) are parts of larger wholes would seem to suggest that (1) mind-body composites are modes and not substances, and consequently that (2) there is no real distinction between mind-body composites.
b. The Practical Reading
Alternatively, one could give a practical basis for love, by arguing that we ought to consider or imagine ourselves as parts of larger wholes, even though metaphysically we are not (Frierson 2002). As Descartes writes to Princess Elizabeth:
If we thought only of ourselves, we could enjoy only the goods which are peculiar to ourselves; whereas, if we consider ourselves as parts of some other body, we share also in the goods which are common to its members, without losing any of those which belong only to ourselves. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)
There are practical reasons for loving others, because doing so allows us to partake in their joy and perfections. Of course, this raises the problem that we will also partake in their imperfections and suffering. On this issue Descartes writes:
With evils, the case is not the same, because philosophy teaches that evil is nothing real, but only a privation. When we are sad on account of some evil which has happened to our friends, we do not share in the defect in which this evil consists. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)
On either the metaphysical or the practical reading, however, it is clear that love has a central role in Descartes' ethics. According to Descartes, inculcating and exercising love is central for curbing one's selfishness and securing the happiness, well-being, and virtue of others (see also Letter to Chanut 1 February 1647, AT IV: 600–617/CSMK: 305–314). For further important work on Cartesian love, see Frigo (2016), Boros (2003), Beavers (1989), and Williston (1997).
9. Happiness
In general, Descartes characterizes happiness as an inner contentment or satisfaction of the mind that results from the satisfaction of one's desires. However, he draws a distinction between mere happiness (bonheur) and genuine happiness or blessedness (felicitas; félicité/béatitude). Mere happiness, according to Descartes, is contentment of mind that is acquired through luck and fortune. This occurs through the acquisition of goods—such as honors, riches, and health—that do not truly depend on the moral agent (that is, her will) but on external conditions. Although the moral agent is satisfying her desires, these desires are not regulated by reason. As such, she seeks things beyond her control. Blessedness, however, is a supreme contentment of mind achieved when the moral agent satisfies desires that are regulated by reason, and reason dictates that we ought to prioritize and desire virtue and wisdom. This is because virtue and wisdom are goods that truly depend on the moral agent, as they truly proceed from the right use of the will, and do not depend on any external conditions. As Descartes writes:
We must consider what makes a life happy, that is, what are the things which can give us this supreme contentment. Such things, I observe, can be divided into two classes: those which depend on us, like virtue and wisdom, and those which do not, like honors, riches, and health. For it is certain that a person of good birth who is not ill, and who lacks nothing, can enjoy a more perfect contentment than another who is poor, unhealthy and deformed, provided the two are equally wise and virtuous. Nevertheless, a small vessel may be just as full as a large one, although it contains less liquid; and similarly if we regard each person’s contentment as the full satisfaction of all his desires duly regulated by reason, I do not doubt that the poorest people, least blest by nature and fortune, can be entirely content and satisfied just as much as everyone else, although they do not enjoy as many good things. It is only this sort of contentment which is here in question; to seek the other sort would be a waste of time, since it is not in our own power. (Letter to Princess Elizabeth 4 August 1645, AT IV: 264–5/CSMK: 257)
It is important to note that Descartes is not denying that honors, riches, beauty, health, and so on are genuine goods or perfections. Nor is he claiming that they are not desirable. Rather, he is merely claiming that such goods are neither necessary nor sufficient for blessedness. Virtue alone is necessary and sufficient for blessedness (Svensson 2015).
However, such external goods are conducive to well-being (the quality of life), and for that reason they are desirable (Svensson 2011). Compare a virtuous person, S, who is poor, unhealthy, and ugly and a virtuous person, S*, who is rich, healthy, and beautiful. In Svensson’s reading, S and S* will have the same degree of happiness. However, Descartes does have room to acknowledge that S* has more well-being than S, because S* possesses more perfections.
10. Classifying Descartes’ Ethics
We have examined the main features of Descartes’ ethics. But what kind of ethics does Descartes espouse? There are three distinct classifications of Descartes’ ethics in the literature: virtue ethics, deontological virtue ethics, and perfectionism.
a. Virtue Ethics
Given that virtue is the undeniable centerpiece of Descartes’ ethics, it is natural to read Descartes as a virtue ethicist. Broadly construed, according to virtue ethics, the standard for morality in ethics is possession of the right kinds of character traits (virtues), as opposed to producing the right sorts of consequences, or following the right kinds of moral laws, duties, or rules.
Lisa Shapiro has argued that Descartes is a virtue ethicist. Her contention is that Descartes' commitment to virtue (as opposed to happiness) being the supreme good makes Descartes a virtue ethicist (2008a: 454). In this view, the ultimate explanation for why an action is good or bad is whether it proceeds from virtue. This would place Descartes in the tradition of Aristotelian virtue ethics, but Shapiro notes that there are significant differences. For Aristotle, virtue must be successful: "virtue requires the world cooperate with our intentions" (2008a: 455). By contrast, given Descartes' moral epistemology, "good intentions are sufficient for virtue" (Ibid.).
b. Deontological Virtue Ethics
Noa Naaman-Zauderer (2010) agrees with Lisa Shapiro that Descartes is a virtue ethicist, due to his commitment to virtue being the supreme good. However, Naaman-Zauderer claims that Descartes has a deontological understanding of virtue, and thus that Descartes is actually a deontological virtue ethicist. Broadly construed, deontological ethics maintains that the standard of morality consists in the fulfillment of imperatives, duties, or ends.
Descartes indeed speaks of virtue in deontological terms. For example, he writes that the supreme good (virtue) is "undoubtedly the thing we ought to set ourselves as the goal of all our actions" (Letter to Princess Elizabeth 18 August 1645, AT IV: 275/CSMK: 261). According to Naaman-Zauderer, Descartes is claiming that we have a duty to practice virtue: "the practice of virtue as a command of reason, as a constitutive moral imperative that we must fulfill for its own sake" (2010: 185).
c. Perfectionism
Frans Svensson (2010; compare 2019a) has argued that Descartes is not a virtue ethicist, and that other commentators have mistakenly classified him as such due to a misunderstanding of the criteria of virtue ethics. Recall that Shapiro and Naaman-Zauderer claim that Descartes must be a virtue ethicist (of whatever stripe) due to his claim that virtue is the supreme good. However, Svensson claims that virtue ethics, deontological ethics, and consequentialist ethics alike can, strictly speaking, admit that virtue is the supreme good, in the sense that virtue should be the goal in all of our actions (2010: 217). Descartes' account of the supreme good, then, does not make him a virtue ethicist.
For Svensson, the criterion for being a virtue ethicist is that "morally right conduct should be grounded ultimately in an account of virtue or a virtuous agent" (Ibid. 218). This requires an explanation of the nature of virtue that does not depend on some independent account of morally right conduct. The problem, however, is that although Descartes agrees that virtue can be explained without reference to some independent account of morally right conduct, he departs from the virtue ethicist in holding that virtue is not constitutive of morally right conduct.
Instead, Svensson proposes that Descartes is committed to perfectionism. In this view, what Descartes’ ethics demands is that the moral agent pursue “everything in his power in order to successfully promote his own overall perfection as far as possible” (Ibid. 221). As such, Svensson claims that Descartes’ ethics is “outcome-based, rather than virtue-based, and it is thus best understood as a kind of teleological, or even consequentialist ethics” (Ibid. 224).
11. Systematicity Revisited
Are there systematic connections between Descartes’ ethics and his metaphysics, epistemology, and natural philosophy? There are broadly two answers to this question in the literature: the epistemological reading and the organic reading.
a. The Epistemological Reading
In the epistemological reading, the tree of philosophy conveys an epistemological order to Cartesian philosophy (Marshall 1998, 2–4, 59–60, 72–74; Morgan 1994, 204–211; Rutherford 2004, 190). One must learn philosophy in the following order: first metaphysics and epistemology, then physics, then the various sub-branches of natural philosophy, and finally ethics. As applied to ethics, proponents of the epistemological reading are primarily concerned with an epistemological order to ethics qua practical enterprise, not theoretical enterprise. For example, in order to acquire virtue and happiness, one must have knowledge of metaphysics and epistemology. As Donald Rutherford writes: virtue and happiness "can be guaranteed only if reason itself has been perfected through the acquisition and proper ordering of intellectual knowledge" (2004: 190).
A consequence of the epistemological reading is that one cannot read any ethical practices into the Meditations. While there may be ethical themes in the Meditations, the meditator cannot acquire or exercise any kind of moral virtue (epistemic virtue is a separate matter). Whether virtue has a role in the Meditations has been a topic of contemporary debate, particularly with respect to whether the meditator acquires the virtue of generosity. Recall that the virtue of generosity consists of two components: the knowledge that the only thing that truly belongs to us is free will, and the firm and constant resolution to use the will well. It seems that the meditator, in the Fourth Meditation, acquires both of these components through her reflection on the nature of the will and her resolution to use the will well. Indeed, Lisa Shapiro has argued extensively that this is exactly what is happening, and thus that generosity—and ethics more generally—has a role in the epistemic achievements of the meditator and the regulation of her passions. Omri Boehm (2014) has argued that the virtue of generosity is actually acquired in the Second Meditation via the cogito. Parvizian (2016) has argued against Shapiro and Boehm, contending that generosity presupposes the knowledge of T1–T6 explained in section 4, to which the meditator does not have access by the Second or Fourth Meditation. Let us turn now to the view that ethics does have a role in metaphysics and epistemology.
b. The Organic Reading
In the organic reading, the tree of philosophy does not represent strict divisions between philosophical fields, and there is not a strict epistemological order to philosophy, especially ethics qua practical enterprise. Rather, the tree is organic. This reading is drawn from Lisa Shapiro (2008a), Genevieve Rodis-Lewis (1987), Amy Schmitter (2002), and Vance Morgan (1994) (although Morgan does not draw the same conclusion about ethics as the rest of these commentators). Morgan writes: "in a living organism such as a tree, all the connected parts grow simultaneously, dependent upon one another . . . hence the basic structure of the tree, branches and all, is apparent at the very early stage in its development" (1994, 25). Developing Rodis-Lewis' interpretation, Shapiro writes:
Generosity is a seed-bearing fruit, and that seed, if properly cultivated, will grow into the tree of philosophy. In this way, morals is not simply one branch among the three branches of philosophy, but provides the ‘ultimate level of wisdom’ by leading us to be virtuous and ensuring the tree of philosophy continues to thrive. (2008a: 459)
Applying this view to generosity, Shapiro claims that generosity is “the key to Cartesian metaphysics and epistemology” (2008a: 459). Placing generosity in the Meditations has interpretive benefits. In particular, it may be able to explain the presence and regulation of the meditator’s passions from the First to Sixth Meditation (Shapiro 2005). Moreover, it shows the deep systematicity of Descartes’ ethics, for ethical themes are present right at the foundations of the system.
12. References and Further Reading
a. Abbreviations
AG: Philosophical Essays (cited by page)
AT: Oeuvres de Descartes (cited by volume and page)
CSM: The Philosophical Writings of Descartes, vols. 1 & 2 (cited by volume and page)
CSMK: The Philosophical Writings of Descartes, vol. 3 (cited by page)
b. Primary Sources
Descartes, R. (1996), Oeuvres de Descartes. (C. Adam, & P. Tannery, Eds.) Paris: J. Vrin.
Descartes, R. (1985). The Philosophical Writings of Descartes (Vol. I). (J. Cottingham, R. Stoothoff, & D. Murdoch, Trans.) Cambridge: Cambridge University Press.
Descartes, R. (1985). The Philosophical Writings of Descartes (Vol. II). (J. Cottingham, R. Stoothoff, & D. Murdoch, Trans.) Cambridge: Cambridge University Press.
Descartes, R. (1991). The Philosophical Writings of Descartes: The Correspondence (Vol. III). (J. Cottingham, R. Stoothoff, D. Murdoch, & A. Kenny, Trans.) Cambridge: Cambridge University Press.
Leibniz, G. W. (1989). Philosophical Essays. Trans. Ariew, R. and Garber, D. Indianapolis: Hackett.
Princess Elizabeth and Descartes (2007). The Correspondence Between Princess Elizabeth of Bohemia and René Descartes. Edited and Translated by Lisa Shapiro. University of Chicago Press.
c. Secondary Sources
Alanen, L. (2003a). Descartes’s Concept of Mind. Harvard University Press.
Alanen, L. (2003b). “The Intentionality of Cartesian Emotions,” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 107–27.
Alanen, L. and Svensson, F. (2007). "Descartes on Virtue," in Hommage à Wlodek: Philosophical Papers Dedicated to Wlodek Rabinowicz, ed. by T. Rønnow-Rasmussen, B. Petersson, J. Josefsson, and D. Egonsson. http://www.fil.lu.se/hommageawlodek.
Ariew, R. (1992). "Descartes and the Tree of Knowledge," Synthese, 92: 101–116.
Beardsley, W. (2005), “Love in the Ruins: Passions in Descartes’ Meditations.” In J. Jenkins, J. Whiting, & C. Williams (Eds.), Persons and Passions: Essays in Honor of Annette Baier (pp. 34–47). Notre Dame: University of Notre Dame Press.
Beavers, A. F. (1989). “Desire and Love in Descartes’s Late Philosophy.” History of Philosophy Quarterly 6 (3):279–294.
Boehm, O. (2014), “Freedom and the Cogito,” British Journal for the History of Philosophy, 22: 704–724.
Boros, G. (2003). “Love as a Guiding Principle of Descartes’s Late Philosophy.” History of Philosophy Quarterly, 20(2), 149–163.
Brassfield, S. (2013), “Descartes and the Danger of Irresolution.” Essays in Philosophy, 14: 162–78.
Brown, D. J. (2006), Descartes and the Passionate Mind. Cambridge: Cambridge University Press.
Brown, D. J. (2012). Cartesian Functional Analysis. Australasian Journal of Philosophy 90 (1):75–92.
Chamberlain, C. (2019). “The body I call ‘mine’”: A sense of bodily ownership in Descartes. European Journal of Philosophy 27 (1): 3–24.
Cimakasky, Joseph & Polansky, Ronald (2012). Descartes' 'provisional morality'. Pacific Philosophical Quarterly 93 (3): 353–372.
Clarke, D. M. (2005). Descartes's Theory of Mind. Oxford University Press.
Davies, R. (2001), Descartes: Belief, Skepticism, and Virtue. London: Routledge.
Des Chene, D. (2012), “Using the Passions,” in M. Pickavé and L. Shapiro (eds.), Emotion and Cognitive Life in Medieval and Early Modern Philosophy. Oxford: Oxford University Press.
De Rosa, R. (2007a). ‘The Myth of Cartesian Qualia,’ Pacific Philosophical Quarterly 88(2), pp. 181–207.
Franco, A. B. (2015). “The Function and Intentionality of Cartesian Émotions.” Philosophical Papers 44 (3):277–319.
Franco, A. B. (2016). “Cartesian Passions: Our (Imperfect) Natural Guides Towards Perfection.” Journal of Philosophical Research 41: 401–438
Frierson, Patrick (2002). “Learning to love: From egoism to generosity in Descartes.” Journal of the History of Philosophy 40 (3):313–338.
Frigo, Alberto (2016). A very obscure definition: Descartes’s account of love in the Passions of the Soul and its scholastic background. British Journal for the History of Philosophy 24 (6):1097–1116.
Gottlieb, Joseph & Parvizian, Saja (2018). “Cartesian Imperativism.” Pacific Philosophical Quarterly (99): 702–725
Greenberg, Sean (2007). Descartes on the passions: Function, representation, and motivation. Noûs 41 (4):714–734.
Hatfield, G. (2013). ‘Descartes on Sensory Representation, Objective Reality, and Material Falsity,’ in K. Detlefsen (ed.) Descartes’ Meditations: A Critical Guide. Cambridge: Cambridge University Press, pp. 127–150.
Kambouchner, D. (2009). Descartes, la philosophie morale, Paris: Hermann.
LeDoeuff, M. (1989). "Red Ink in the Margins," in The Philosophical Imaginary, trans. C. Gordon. Stanford: Stanford University Press.
Marshall, J. (1998), Descartes’s Moral Theory. Ithaca: Cornell University Press.
Marshall, J. (2003). “Descartes’ Morale Par Provision,” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books.191–238
Mihali, A. (2011). “Sum Res Volans: The Centrality of Willing for Descartes.” International Philosophical Quarterly 51 (2):149–179.
Morgan, V. G. (1994), Foundations of Cartesian Ethics. Atlantic Highlands: Humanities Press.
Murdoch, D. (1993). "Exclusion and Abstraction in Descartes' Metaphysics," The Philosophical Quarterly, 43: 38–57.
Naaman-Zauderer, N. (2010), Descartes’ Deontological Turn: Reason, Will, and Virtue in the Later Writings. Cambridge: Cambridge University Press.
Newman, L., "Descartes' Epistemology," The Stanford Encyclopedia of Philosophy (Winter 2014 Edition), Edward N. Zalta (ed.).
Parvizian, Saja (2016). “Generosity, the Cogito, and the Fourth Meditation.” Res Philosophica 93 (1):219–243
Pereboom, D. (1994). "Stoic Psychotherapy in Descartes and Spinoza," Faith and Philosophy, 11: 592–625.
Rodis-Lewis, G. (1957). La morale de Descartes, 1st ed. Paris: Presses Universitaires de France.
Rodis-Lewis, G. (1987). "Le Dernier Fruit de la Métaphysique Cartésienne: la Générosité," Études Philosophiques, 1: 43–54.
Rutherford, D. (2004), “On the Happy Life: Descartes vis-à-vis Seneca,” in S. K. Strange, & J. Zupko (eds.), Stoicism: Traditions and Transformations. Cambridge: Cambridge University Press.
Rutherford, D. (2014). “Reading Descartes as a Stoic: Appropriate Actions, Virtue, and the Passions,” Philosophie antique, 14: 129–155.
Rysiew, Patrick, "Epistemic Contextualism," The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (ed.).
Schmitter, A. M. (2002), “Descartes and the Primacy of Practice: The Role of the Passions in the Search for Truth,” Philosophical Studies (108), 99–108.
Shapiro, L. (1999), “Cartesian Generosity,” Acta Philosophica Fennica, 64: 249–75.
Shapiro, L. (2005), "What Are the Passions Doing in the Meditations?," in J. Jenkins, J. Whiting, & C. Williams (eds.), Persons and Passions: Essays in Honor of Annette Baier. Notre Dame: University of Notre Dame Press.
Shapiro, L. (2008a), “Descartes’s Ethics,” In J. Broughton, & J. Carriero (eds.), A Companion to Descartes. Malden: Blackwell Publishing.
Shapiro, L. (2008b), "'Turn My Will in Completely the Opposite Direction': Radical Doubt and Descartes's Account of Free Will," in P. Hoffman, D. Owen, & G. Yaffe (eds.), Contemporary Perspectives on Early Modern Philosophy. Buffalo: Broadview Press.
Shapiro, L. (2011), “Descartes on Human Nature and the Human Good,” in C. Fraenkel, J. E. Smith, & P. Dario (eds.), The Rationalists: Between Tradition and Innovation. New York: Springer.
Shapiro, L. (2013), “Cartesian Selves,” in K. Detlefsen (ed.), Descartes’ Meditations: A Critical Guide. Cambridge: Cambridge University Press.
Simmons, A. (1999). ‘Are Cartesian Sensations Representational?’ Noûs 33(3), pp. 347–369.
Sosa, E. (2012), "Descartes and Virtue Epistemology," in K. J. Clark & M. Rea (eds.), Reason, Metaphysics, and Mind: New Essays on the Philosophy of Alvin Plantinga. Oxford: Oxford University Press.
Svensson, F. (2010). "The Role of Virtue in Descartes' Ethical Theory, Or: Was Descartes a Virtue Ethicist?" History of Philosophy Quarterly 27 (3): 215–236.
Svensson, F. (2011). "Happiness, Well-being, and Their Relation to Virtue in Descartes' Ethics." Theoria 77 (3): 238–260.
Svensson, F. (2015). "Non-Eudaimonism, The Sufficiency of Virtue for Happiness, and Two Senses of the Highest Good in Descartes's Ethics." British Journal for the History of Philosophy 23 (2): 277–296.
Svensson, F. (2019a). "A Cartesian Distinction in Virtue: Moral and Perfect," in Mind, Body, and Morality: New Perspectives on Descartes and Spinoza, edited by Martina Reuter and Frans Svensson. Routledge.
Svensson, F. (2019b). "Descartes on the Highest Good." American Catholic Philosophical Quarterly 93 (4): 701–721.
Williston, B. (1997). Descartes on Love and/as Error. Journal of the History of Ideas 58 (3):429–444.
Williston, B. (2003). “The Cartesian Sage and the Problem of Evil” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 301–331.
Aristotle is a towering figure in ancient Greek philosophy, who made important contributions to logic, criticism, rhetoric, physics, biology, psychology, mathematics, metaphysics, ethics, and politics. He was a student of Plato for twenty years but is famous for rejecting Plato’s theory of forms. He was more empirically minded than both Plato and Plato’s teacher, Socrates.
A prolific writer, lecturer, and polymath, Aristotle radically transformed most of the topics he investigated. In his lifetime, he wrote dialogues and as many as 200 treatises, of which only 31 survive. These works are in the form of lecture notes and draft manuscripts never intended for general readership. Nevertheless, they are the earliest complete philosophical treatises we still possess.
As the father of Western logic, Aristotle was the first to develop a formal system for reasoning. He observed that the deductive validity of any argument is determined by its structure rather than its content, as in the syllogism: All men are mortal; Socrates is a man; therefore, Socrates is mortal. Even if the content of the argument were changed from being about Socrates to being about someone else, the argument's structure guarantees that, as long as the premises are true, the conclusion must also be true. Aristotelian logic dominated until the rise of modern propositional and predicate logic some 2,000 years later.
The emphasis on good reasoning serves as the backdrop for Aristotle’s other investigations. In his natural philosophy, Aristotle combines logic with observation to make general, causal claims. For example, in his biology, Aristotle uses the concept of species to make empirical claims about the functions and behavior of individual animals. However, as revealed in his psychological works, Aristotle is no reductive materialist. Instead, he thinks of the body as the matter, and the psyche as the form of each living animal.
Though his natural scientific work is firmly based on observation, Aristotle also recognizes the possibility of knowledge that is not empirical. In his metaphysics, he claims that there must be a separate and unchanging being that is the source of all other beings. In his ethics, he holds that it is only by becoming excellent that one could achieve eudaimonia, a sort of happiness or blessedness that constitutes the best kind of human life.
Aristotle founded the Lyceum, a school based in Athens, Greece, and was the first of the Peripatetics, as his followers from the Lyceum came to be known. Aristotle's works exerted tremendous influence on ancient and medieval thought and continue to inspire philosophers to this day.
1. Life and Works
Though our main ancient source on Aristotle's life, Diogenes Laertius, is of questionable reliability, the outlines of his biography are credible. Diogenes reports that Aristotle's Greek father, Nicomachus, served as private physician to the Macedonian king Amyntas (DL 5.1.1). At the age of seventeen, Aristotle migrated to Athens where he joined the Academy, studying under Plato for twenty years (DL 5.1.9). During this period Aristotle acquired his encyclopedic knowledge of the philosophical tradition, which he draws on extensively in his works.
Aristotle left Athens around the time Plato died, in 348 or 347 B.C.E. One explanation is that as a resident alien, Aristotle was excluded from leadership of the Academy in favor of Plato’s nephew, the Athenian citizen Speusippus. Another possibility is that Aristotle was forced to flee as Philip of Macedon’s expanding power led to the spread of anti-Macedonian sentiment in Athens (Chroust 1967). Whatever the cause, Aristotle subsequently moved to Atarneus, which was ruled by another former student at the Academy, Hermias. During his three years there, Aristotle married Pythias, the niece or adopted daughter of Hermias, and perhaps engaged in negotiations or espionage on behalf of the Macedonians (Chroust 1972). Whatever the case, the couple relocated to Macedonia, where Aristotle was employed by Philip, serving as tutor to his son, Alexander the Great (DL 5.1.3–4). Aristotle’s philosophical career was thus directly entangled with the rise of a major power.
After some time in Macedonia, Aristotle returned to Athens, where he founded his own school in rented buildings in the Lyceum. It was presumably during this period that he authored most of his surviving texts, which have the appearance of lecture transcripts edited so they could be read aloud in Aristotle’s absence. Indeed, this must have been necessary, since after his school had been in operation for thirteen years, he again departed from Athens, possibly because a charge of impiety was brought against him (DL 5.1.5). He died at age 63 in Chalcis (DL 5.1.10).
Diogenes tells us that Aristotle was a thin man who dressed flashily, wearing a fashionable hairstyle and a number of rings. If the will quoted by Diogenes (5.1.11–16) is authentic, Aristotle must have possessed significant personal wealth, since it promises a furnished house in Stagira, three female slaves, and a talent of silver to his concubine, Herpyllis. Aristotle fathered a daughter with Pythias and, with Herpyllis, a son, Nicomachus (named after his grandfather), who may have edited Aristotle’s Nicomachean Ethics. Unfortunately, since there are few extant sources on Aristotle’s life, one’s judgment about the accuracy and completeness of these details depends largely on how much one trusts Diogenes’ testimony.
Since commentaries on Aristotle’s work have been produced for around two thousand years, it is not immediately obvious which sources are reliable guides to his thought. Aristotle’s works have a condensed style and make use of a peculiar vocabulary. Though he wrote an introduction to philosophy, a critique of Plato’s theory of forms, and several philosophical dialogues, these works survive only in fragments. The extant Corpus Aristotelicum consists of Aristotle’s recorded lectures, which cover almost all the major areas of philosophy. Before the invention of the printing press, handwritten copies of these works circulated in the Near East, northern Africa, and southern Europe for centuries. The surviving manuscripts were collected and edited in August Immanuel Bekker’s authoritative 1831–1836 Berlin edition of the Corpus (“Bekker” 1910). All references to Aristotle’s works in this article follow the standard Bekker numbering.
The extant fragments of Aristotle’s lost works, which modern commentators sometimes use as the basis for conjectures about his philosophical development, are noteworthy. A fragment of his Protrepticus preserves a striking analogy according to which the psyche or soul’s attachment to the body is a form of punishment:
The ancients blessedly say that the psyche pays penalty and that our life is for the atonement of great sins. And the yoking of the psyche to the body seems very much like this. For they say that, as Etruscans torture captives by chaining the dead face to face with the living, fitting each to each part, so the psyche seems to be stretched throughout, and constrained to all the sensitive members of the body. (Pistelli 1888, 47.24–48.1)
According to this allegedly inspired theory, the fetters that bind the psyche to the body are similar to those by which the Etruscans torture their prisoners. Just as the Etruscans chain prisoners face to face with a dead body so that each part of the living body touches a part of the corpse, the psyche is said to be aligned with the parts of one’s living body. On this view, the psyche is embodied as a painful but corrective atonement for its badness. (See Bos 2003 and Hutchinson and Johnson’s webpage).
The incompatibility of this passage with Aristotle’s view that the psyche is inseparable from the body (discussed below) has been explained in various ways. Neo-Platonic commentators distinguish between Aristotle’s esoteric and exoteric writings, that is, writings intended for circulation within his school, and writings like the Protrepticus intended for a broader reading public (Gerson 2005, 47–75). Some modern scholars have argued to the contrary that the imprisonment of the psyche in the body indicates that Aristotle was still a Platonist at the time he composed the Protrepticus, which must have been written earlier than his mature works (Jaeger 1948, 100). Aristotle’s dialogue Eudemus, which contains arguments for the immortality of the psyche, and his Politicus, which is about the ideal statesman, seem to corroborate the view that Aristotle’s exoteric works hold much that is Platonic in spirit (Chroust 1965; 1966). The latter contains the seemingly Platonic assertion that “the good is the most exact of measures” (Kroll 1902, 168: 927b4–5).
But not all agree. Owen (1968, 162–163) argues that Aristotle’s fundamental logical distinction between individual and species depends on an antecedent break with Plato. According to this view, Aristotle’s On Ideas (Fine 1993), a collection of arguments against Platonic forms, shows that Aristotle rejected Platonism early in his career, though he later became more sympathetic to the master’s views. However, as Lachterman (1980) points out, such historical theses depend on substantive hermeneutical assumptions about how to read Aristotle and on theoretical assumptions about what constitutes a philosophical system. This article focuses not on this historical debate but on the theories propounded in Aristotle’s extant works.
2. Analytics or “Logic”
Aristotle is usually identified as the founder of logic in the West (although autonomous logical traditions also developed in India and China), where his "Organon," consisting of his works the Categories, On Interpretation, Prior Analytics, Posterior Analytics, Sophistical Refutations, and Topics, long served as the traditional manuals of logic. Two other works—Rhetoric and Poetics—are not about logic but likewise concern how to communicate with an audience. Curiously, Aristotle never used the words "logic" or "organon" to refer to his own work but called this discipline "analytics." Though Aristotelian logic is sometimes referred to as an "art" (Ross 1940, iii), it is clearly not an art in Aristotle's sense, which would require it to be productive of some end outside itself. Nevertheless, this article follows the convention of referring to the content of Aristotle's analytics as "logic."
a. The Meaning and Purpose of Logic
What is logic for Aristotle? On Interpretation begins with a discussion of meaning, according to which written words are symbols of spoken words, while spoken words are symbols of thoughts (Int.16a3–8). This theory of signification can be understood as a semantics that explains how different alphabets can signify the same spoken language, while different languages can signify the same thoughts. Moreover, this theory connects the meaning of symbols to logical consequence, since commitment to some set of utterances rationally requires commitment to the thoughts signified by those utterances and to what is entailed by them. Hence, though Cook Wilson (1926, 30–33) correctly notes that Aristotle nowhere defines logic, it may be called the science of thinking, where the role of the science is not to describe ordinary human reasoning but rather to demonstrate what one ought to think given one’s other commitments. Though the elements of Aristotelian logic are implicit in our conscious reasoning, Aristotelian “analysis” makes explicit what was formerly implicit (Cook Wilson 1926, 49).
Aristotle shows how logic can demonstrate what one should think, given one’s commitments, by developing the syntactical concepts of truth, predication, and definition. In order for a written sentence, utterance, or thought to be true or false, Aristotle says, it must include at least two terms: a subject and a predicate. Thus, a simple thought or utterance such as “horse” is neither true nor false but must be combined with another term, say, “fast” in order to form a compound—“the horse is fast”—that describes reality truly or falsely. The written sentence “the horse is fast” has meaning insofar as it signifies the spoken sentence, which in turn has meaning in virtue of its signifying the thought that the horse is fast (Int.16a10–18, Cat.13b10–12, DA 430a26–b1). Aristotle holds that there are two kinds of constituents of meaningful sentences: nouns and their derivatives, which are conventional symbols without tense or aspect; and verbs, which have a tense and aspect. Though all meaningful speech consists of combinations of these constituents, Aristotle limits logic to the consideration of statements, which assert or deny the presence of something in the past, present, or future (Int.17a20–24).
Aristotle analyzes statements as cases of predication, in which a predicate P is attributed to a subject S as in a sentence of the form “S is P.” Since he holds that every statement expresses something about being, statements of this form are to be read as “S is (exists) as a P” (Bäck 2000, 11). In every true predication, either the subject and predicate are of the same category, or the subject term refers to a substance while the predicate term refers to one of the other categories. The primary substances are individuals, while secondary substances are species and genera composed of individuals (Cat.2a11–18). This distinction between primary and secondary reflects a dependence relation: if all the individuals of a species or genus were annihilated, the species and genus could not, in the present tense, be truly predicated of any subject.
Every individual is of a species and that species is predicated of the individual. Every species is a member of a genus, which is predicated of the species and of each individual of that species (Cat.2b13–22). For example, if Callias is of the species "man," and the species is a member of the genus "animal," then "man" is predicated of Callias, and "animal" is predicated both of "man" and of Callias. The individual, Callias, inherits the predicate "animal" in virtue of being of the species "man." But inheritance stops at the individual and does not apply to its proper parts. For example, "man" is not truly predicated of Callias' hand. A genus can be divided with reference to the specific differences among its members; for example, "biped" differentiates "man" from "horse."
While no definition can be given of an individual or primary substance such as Callias, when one gives the genus and all the specific differences possessed by a kind of thing, one can define a thing’s species. A specific difference is a predicate that falls under one of the categories. Thus, Aristotelian categories can be seen as a taxonomical scheme, a way of organizing predicates for discovery, or as a metaphysical doctrine about the kinds of beings there are. But any reading must accommodate Aristotle’s views that primary substances are never predicated of a subject (Cat.3a6), that a predicate may fall under multiple categories (Cat.11a20–39), and that some terms, such as “good,” are predicated in all the categories (NE 1096a23–29). Moreover, definitions are reached not by demonstration but by other kinds of inquiry, such as dialectic, the art by which one makes divisions in a genus; and induction, which can reveal specific differences from the observation of individual examples.
b. Demonstrative Syllogistic
Syllogistic reasoning builds on Aristotle's theory of predication, showing how to reason from premises to conclusions. A syllogism is a discourse in which, certain statements being taken as premises, a different statement is shown to follow as a conclusion (AnPr.24b18–22). The basic form of the Aristotelian syllogism involves a major premise, a minor premise, and a conclusion, so that it has the form
If A is predicated of all B,
And B is predicated of all C,
Then A is predicated of all C.
This is an assertion of formal logic, since by abstracting from the values of the variables A, B, and C, one treats the inference formally, such that the values of the terms A, B, and C are not given as part of the syllogistic form (Łukasiewicz, 10–14).
Though this form can be utilized in dialectic, in which the major term A is related to C through the middle term B credibly rather than necessarily (AnPo.81b10–23), Aristotle is mainly concerned with how to use syllogistic in what he calls demonstrative reasoning, that is, in inference from certain premises to a certain conclusion. A demonstrative syllogism is not concerned with a mere opinion but proves a cause, that is, answers a "why" question (AnPo.85b23–26).
The validity of a syllogism can be tested through comparison of four basic types of assertions: All S are P (A), No S are P (E), Some S are P (I), and Some S are not P (O). These assertions stand in logical relations to one another: contradiction (A–O and E–I), in which one assertion of the pair must be true and the other false; contrariety (A–E), in which the two assertions cannot both be true; and subalternation (A–I and E–O), in which the truth of the universal assertion requires the truth of the corresponding particular assertion as well. These relationships are summed up in the traditional square of opposition used by medieval Aristotelian logicians (see Groarke, Aristotle: Logic).
Figure 1: The Traditional Square of Opposition illustrates the relations between the fundamental judgment-forms in Aristotelian syllogistic: (A) All S are P, (E) No S are P, (I) Some S are P, and (O) Some S are not P.
Syllogistic may be employed dialectically when the premises are accepted on the authority of common opinion, from tradition, or from the wise. In any dialectical syllogism, the premises can be generally accepted opinions rather than necessary principles (Top.100a25–b21). At least some premises in rhetorical proofs are not necessary but only probable, holding only for the most part.
When the premises are known, and conclusions are shown to follow from those premises, one gains knowledge by demonstration. Demonstration is necessary (AnPo.73a21–27) because the conclusion of a demonstrative syllogism predicates something that is either necessarily true or necessarily false of the subject of the premise. One has demonstrative knowledge when one knows the premises and has derived a necessary conclusion from them, since the cause given in the premises explains why the conclusion is so (AnPo.75a12–17, 35–37). Consequently, valid demonstration depends on the known premises containing terms for the genus of which the species in the conclusion is a member (AnPo.76a29–30).
One interesting problem that arises within Aristotle’s theory of demonstration concerns the connection between temporality and necessity. By the principle of excluded middle, necessarily, either there will be a sea-battle tomorrow or there will not be a sea-battle tomorrow. But since the sea-battle itself has yet neither come about nor failed to come about, it seems that one must say, paradoxically, that one alternative is necessary but that either alternative might come about (Int.19a22–34). The question of how to account for unrealized possibilities and necessities is part of Aristotle’s modal syllogistic, which is discussed at length in his Prior Analytics. For a discussion, see Malink (2013).
c. Induction, Experience, and Principles
Whenever a speaker reasons from premises, an auditor can ask for their demonstration. The speaker then needs to adduce additional premises for that demonstration. But if this line of questioning went on interminably, no demonstration could be made, since every premise would require a further demonstration, ad infinitum. In order to stop an infinite regress of premises, Aristotle postulates that for an inference to count as demonstrative, one must know its indemonstrable premises (AnPo.73a16–20). Thus, demonstrative science depends on the view that all teaching and learning proceed from already present knowledge (AnPo.72b5–20). In other words, the possibility of making a complete argument, whether inductive or deductive, depends on the reasoner possessing the concept in question.
The acquisition of concepts must in some way be perceptual, since Aristotle says that universals come to rest in the soul through experience, which comes about from many memories of the same thing, which in turn comes about by perception (AnPo.99b32–100a9). However, Aristotle holds that some concepts are already manifested in one’s perceptual experience: children initially call all men father and all women mother, only later developing the capacity to apply the relevant concepts to particular individuals (Phys.184b3–5). As Cook Wilson (1926, 45) puts it, perception is in a way already of a universal. Upon learning to speak, the child already possesses the concept “mother” but does not grasp the conditions of its correct application. The role of perception, and hence of memory and experience, is then not to supply the child with universal concepts but to fix the conditions under which they are correctly predicated of an individual or species. Hence the ability to arrive at definitions, which serve as starting points of a science, rests on the human being’s natural capacity to use language and on the culturally specific social and political conditions in which that capacity is manifested (Winslow 2013, 45–49).
While deduction proceeds by a form of syllogistic reasoning in which the major and minor premise both predicate what is necessarily true of a subject, inductive reasoning moves from particulars to universals, so it is impossible to gain knowledge of universals except by induction (AnPo.81a38–b9). This movement, from the observation of the same occurrence, to an experience that emerges from many memories, to a universal judgment, is a cognitive process by which human beings understand reality (see AnPo.88a2–5, Met.980b28–981a1, EN 1098b2–4, 1142a12).
But what makes such an inference a good one? Aristotle seems to say that an inductive inference is sound when what is true in each case is also true of the class under which the cases fall (AnPr.68b15–29). For example, the inference from the observation that each kind of bileless animal (men, horses, mules, and so on) is long-lived is warranted just when the following syllogism is sound: (1) All men, horses, mules, and so on are long-lived; (2) All long-lived animals are bileless; therefore (3) all men, horses, mules, and so on are bileless (see Groarke sections 10 and 11). However, Aristotle does not think that knowledge of universals is pieced together from knowledge of particulars; rather, he thinks that induction is what allows one to actualize knowledge by grasping how the particular case falls under the universal (AnPr.67a31–b5).
A true definition reveals the essential nature of something, what it is to be that thing (AnPo.90b30–31). A sound demonstration shows what is necessary of an observed subject (AnPo.90b38–91a5). It is essential, however, that the observation on which a definition is based be inductively true, that is, that it be based on causes rather than on chance. Regardless of whether one is asking what something is in a definition or why something is the way it is by giving its cause, it is only when the principles or starting points of a science are given that demonstration becomes possible. Since experience is what gives the principles of each science (AnPr.46a17–27), logic can only be employed at a later stage to demonstrate conclusions from these starting points. This is why logic, though it is employed in all branches of philosophy, is not a part of philosophy. Rather, in the Aristotelian tradition, logic is an instrument for the philosopher, just as a hammer and anvil are instruments for the blacksmith (Ierodiakonou 1998).
d. Rhetoric and Poetics
Just as dialectic searches for truth, Aristotelian rhetoric serves as its counterpart (Rhet.1354a1), searching for the means by which truth can be grasped through language. Thus, rhetorical demonstration, or enthymeme, is a kind of syllogism that strictly speaking belongs to dialectic (Rhet.1355a8–10). Because rhetoric uses the particularly human capacity of reason to formulate verbal arguments, it is the art that can cause the most harm when it is used wrongly. It is thus not a technique for persuasion at any cost, as some Sophists taught, but a fundamentally second-personal way of using language that allows the auditor to reach a judgment (Grimaldi 1972, 3–5). More fundamentally, rhetoric is defined as the detection of the persuasive features of each subject matter (Rhet.1355b12–22).
Proofs given in speech depend on three things: the character (ethos) of the speaker, the disposition (pathos) of the audience, and the meaning (logos) of the sounds and gestures used (Rhet.1356a2–6). Rhetorical proofs work by showing that the speaker is worthy of credence, by producing an emotional state (pathos) in the audience, or by demonstrating a consequence using the words alone. Aristotle holds that ethos is the most important of these elements, since trust in the speaker is required if one is to believe the speech. However, the best speech balances ethos, pathos, and logos. In rhetoric, enthymemes play a deductive role, while examples play an inductive role (Rhet.1356b11–18).
The deductive form of rhetoric, enthymeme, is a dialectical syllogism in which the probable premise is suppressed so that one reasons directly from the necessary premise to the conclusion. For example, one may reason that an animal has given birth because she has milk (Rhet.1357b14–16) without providing the intermediate premise. Aristotle also calls this deductive form of inference “reasoning by signs” or “reasoning from evidence,” since the animal’s having milk is a sign of, or evidence for, her having given birth. Though the audience seemingly “immediately” grasps the fact of birth without it being given in perception, the passage from the perception to the fact is inferential and depends on the background assumption of the suppressed premise.
The inductive form of rhetoric, reasoning from example, can be illustrated as follows. Peisistratus in Athens and Theagenes in Megara both petitioned for guards shortly before establishing themselves as tyrants. Thus, someone plotting a tyranny requests a guard (Rhet.1357b30–37). This proof by example does not have the force of necessity or universality and does not count as a case of scientific induction, since it is possible someone could petition for a guard without plotting a tyranny. But when it is necessary to base some decision, for example, whether to grant a request for a bodyguard, on its likely outcome, one must look to prior examples. It is the work of the rhetorician to know these examples and to formulate them in such a way as to suggest definite policies on the basis of that knowledge.
Rhetoric is divided into deliberative, forensic, and display rhetoric. Deliberative rhetoric is concerned with the future, namely with what to do, and the deliberative rhetorician is to discuss the advantages and harms associated with a specific course of action. Forensic rhetoric, typical of the courtroom, concerns the past, especially what was done and whether it was just or unjust. Display rhetoric concerns the present and is about what is noble or base, that is, what should be praised or denigrated (Rhet.1358b6–16). In all these domains, the rhetorician practices a kind of reasoning that draws on similarities and differences to produce a likely prediction that is of value to the political community.
A common characteristic of insightful philosophers, rhetoricians, and poets is the capacity to observe similarities in things that are unlike, as Archytas did when he said that a judge and an altar are kindred, since someone who has been wronged has recourse to both (Rhet.1412a10–14). This noticing of similarities and differences is part of what separates those who are living the good life from those who are merely living (Sens.437a2–3). Likewise, the highest achievement of poetry is to use good metaphors, since to make metaphors well is to contemplate what is like (Poet.1459a6–9). Poetry is thus closely related to both philosophy and rhetoric, though it differs from them in being fundamentally mimetic, imitating reality through an artistic form.
Imitation in poetry is achieved by means of rhythm, language, and harmony (Poet.1447a13–16, 21–22). While other arts share some or all these elements—painting imitates visually by the same means, while dance imitates only through rhythm—poetry is a kind of vocalized music, in which voice and discursive meaning are combined. Aristotle is interested primarily in the kinds of poetry that imitate human actions, which fall into the broad categories of comedy and tragedy. Comedy is an imitation of worse types of people and actions, which reflect our lower natures. These imitations are not despicable or painful, but simply ridiculous or distorted, and observing them gives us pleasure (Poet.1449a31–38). Aristotle wrote a book of his Poetics on comedy, but the book did not survive. Hence, through a historical accident, the traditions of aesthetics and criticism that proceed from Aristotle are concerned almost completely with tragedy.
Tragedy imitates actions that are excellent and complete. As opposed to comedy, which is episodic, tragedy should have a single plot that ends in a presentation of pity and fear and thus a catharsis—a cleansing or purgation—of the passions (Poet.1449b24–28). (As discussed below, the passions or emotions also play an important role in Aristotle’s practical philosophy.) The most important aspect of a tragedy is how it uses a story or myth to lead the psyches of its audience to this catharsis (Poet.1450a32–34). Since the beauty or fineness of a thing—say, of an animal—consists in the orderly arrangement of parts of a definite magnitude (Poet.1450b35–38), the parts of a tragedy should also be proportionate.
A tragedy’s ability to lead the psyche depends on its myth turning at a moment of recognition at which the central character moves from a state of ignorance to a state of knowledge. In the best case, this recognition coincides with a reversal of intention, such as in Sophocles’ Oedipus, in which Oedipus recognizes himself as the man who was prophesied to murder his father and marry his mother. This moment produces pity and fear in the audience, fulfilling the purpose of tragic imitation (Poet.1452a23–b1). The pity and fear produced by imitative poetry are the source of a peculiar form of pleasure (Poet.1453b11–14). Though the imitation itself is a kind of technique or art, this pleasure is natural to human beings. Because of this potential to produce emotions and lead the psyche, poetics borders both on what is well natured and on madness (Poet.1455a30–34).
Why do people write plays, read stories, and watch movies? Aristotle thinks that because a series of sounds with minute differences can be strung together to form conventional symbols that name particular things, hearing has the accidental property of supporting meaningful speech, which is the cause of learning (Sens.437a10–18). Consequently, though sound is not intrinsically meaningful, voice can carry meaning when it is “ensouled,” transmitting an appearance about how absent things might be (DA 420b5–10, 27–33). Poetry picks up on this natural capacity, artfully imitating reality in language without requiring that things are actually the way they are presented as being (Poet.1447a13–16).
The poet’s consequent power to lead the psyche through true or false imitations, like the rhetorician’s power to lead it through persuasive speech, leads to a parallel question: how should the poet use his power? Should the poet imitate things as they are, or as they should be? Though it is clear that the standard of correctness in poetry and politics is not the same (Poet.1460b13–1461a1), the question of how and to what extent the state should constrain poetic production remains unresolved.
3. Theoretical Philosophy
Aristotle’s classification of the sciences makes a distinction between theoretical philosophy, which aims at contemplation, and practical philosophy, which aims at action or production. Within theoretical philosophy, first philosophy studies objects that are motionless and separate from material things, mathematics studies objects that are motionless but not separate, and natural philosophy studies objects that are in motion and not separate (Met.1026a6–22).
This threefold distinction among the beings that can be contemplated corresponds to the level of precision that can be attained by each branch of theoretical philosophy. First philosophy can be perfectly exact because there is no variation among its objects and thus it has the potential to give one knowledge in the most profound sense. Mathematics is also absolutely certain because its objects are unchanging, but since there are many mathematical objects of a given kind (for example, one could draw a potentially infinite number of different triangles), mathematical proofs require a peculiar method that Aristotle calls “abstraction.” Natural philosophy gives less exact knowledge because of the diversity and variability of natural things and thus requires attention to particular, empirical facts. Studies of nature—including treatises on special sciences like cosmology, biology, and psychology—account for a large part of Aristotle’s surviving writings.
a. Natural Philosophy
Aristotle’s natural philosophy aims for theoretical knowledge about things that are subject to change. Whereas all generated things, including artifacts and products of chance, have a source that generates them, natural change is caused by a thing’s inner principle and cause, which may accordingly be called the thing’s “nature” (Phys.192b8–20). To grasp the nature of a thing is to be able to explain why it came to be what it essentially is: the nature of a thing does not merely contribute to a change but is the primary determinant of the change as such (Waterlow 1982, 28).
Though some hold that Aristotle’s principles are epistemic, explanatory concepts, principles are best understood ontologically as unique, continuous natures that govern the generation and self-preservation of natural beings. To understand a thing’s nature is primarily to grasp “how a being displays itself by its nature.” Such a grasp counts as a correct explanation only insofar as it constitutes a form of understanding of beings in themselves as they give themselves (Winslow 2007, 3–7).
Aristotle’s description of principles as the start and end of change (Phys.235b6) distinguishes between two kinds of natural change. Substantial change occurs when a substance is generated (Phys.225a1–5), for example, when the seed of a plant gives rise to another plant of the same kind. Non-substantial change occurs when a substance’s accidental qualities are affected, for example, the change of color in a ripening pomegranate. Aristotelians describe this as the activity of the contraries blackness and whiteness in the plant’s material: as its juices become colored by ripening, the fruit of the pomegranate itself becomes shaded, changing to a purple color (de Coloribus 796a20–26). Ripening occurs when heat burns up the air in the part of the plant near the ground, causing convection that alters the originally light color of the fruit to its dark contrary (de Plantis 820b19–23). Both kinds of change are caused by the plant’s containing in itself a principle of change. In substantial change, a new primary substance is generated; in non-substantial change, some property of a preexisting substance changes to a contrary state.
A process of change is completely described when its four causes are given. This can be illustrated with Aristotle’s favorite example of the production of a bronze sculpture. The (1) material cause of the change is given when the underlying matter of the thing has been described, such as the bronze matter of which a statue is composed. The (2) formal cause is given when one says what kind of thing the thing is, for example, “sphere” for a bronze sphere or “Callias” for a bronze statue of Callias. The (3) efficient cause is given when one says what brought the change about, for example, when one names the sculptor. The (4) final cause is given when one says the purpose of the change, for example, when one says why the sculptor chose to make the bronze sphere (Phys.194b16–195a2).
In natural change the principle of change is internal, so the formal, efficient, and final causes typically coincide. Moreover, in such cases, the metaphysical and epistemological sides of causal explanation are normally unified: a formal cause counts both as a thing’s essence—what it is to be that thing—and as its rational account or reason for being (Bianchi 2014, 35). Thus, when speaking of natural changes rather than the making of an artifact, Aristotle will usually offer “hylomorphic” descriptions of the natural being as a compound of matter and form.
Because Aristotle holds that a thing’s underlying nature is analogous to the bronze in a statue (Phys.191a7–12), some have argued that the underlying thing refers to “prime matter,” that is, to an absolutely indeterminate matter that has no form. But Cook (1989) has shown that the underlying thing normally means matter that already has some form. Indeed, Aristotle claims that the matter of perceptible things has no separate existence but is always already informed by a contrary (Gen et Corr.329a25–27). The matter that traditional natural philosophy calls the “elements”—fire, water, air, and earth—already has the form of the basic contraries, hot and cold, and moist and dry, so that, for example, fire is matter with a hot and dry form (Gen et Corr.330a25–b4). Thus, even in the most basic cases, matter is always actually informed, even though the form is potentially subject to change. For example, throwing water on a fire cools and moistens it, bringing about a new quality in the underlying material. Thus, Aristotle sometimes describes natural powers as being latent or active “in the material” (Meteor.370b14–18).
Aristotle’s general works in natural philosophy offer analyses of concepts necessarily assumed in accounts of natural processes, including time, change, and place. In general, Aristotle will describe changes that occur in time as arising from a potential, which is actualized when the change is complete. However, what is actual is logically prior to what is potential, since a potentiality aims at its own actualization and thus must be defined in terms of what is actual. Indeed, generically the actual is also temporally prior to potentiality, since there must invariably be a preexisting actuality that brings the potentiality to its own actualization (Met.1049b4–19). Perhaps because of the priority of the actual to the potential, whenever Aristotle speaks of natural change, he is concerned with a field of naturalistic inquiry that is continuous rather than atomistic and purposeful or teleological rather than mechanical. In his more specific naturalistic works, Aristotle lays out a program of specialized studies about the heavens and Earth, living things, and the psyche.
i. Cosmology and Geology
Aristotle’s cosmology depends on the basic observation that while bodies on Earth either rise to a limit or fall to Earth, heavenly bodies keep moving, without any apparent external force being exerted on them (DC 284a10–15). On the basis of this observation, he distinguishes between circular motion, which is operative in the “superlunary” heavens, and rectilinear motion on “sublunary” Earth below the Moon. Since all sublunary bodies move in a rectilinear pattern, the heavenly bodies must be composed of a different body that naturally moves in a circle (DC 269a2–10, Meteor.340b6–15). This body cannot have an opposite, because there is no opposite to circular motion (DC 270a20, compare 269a19–22). Indeed, since there is nothing to oppose its motion, Aristotle supposes that this fifth element, which he calls “aether,” as well as the heavenly bodies composed of it, move eternally (DC 275b1–5, 21–25).
In Aristotle’s view the heavens are ungenerated, neither coming to be nor passing away (DC 279b18–21, 282a24–30). Aristotle defines time as the number of motion, since motion is necessarily measured by time (Phys.224a24). Thus, the motion of the eternal bodies is what makes time, so the life and being of sublunary things depend on them. Indeed, Aristotle says that their own time is eternal or “aeon.”
Noticing that water naturally forms spherical droplets and that it flows towards the lowest point on a plane, Aristotle concludes that both the heavens and the earth are spherical (DC 287b1–14). This is further confirmed by observations of eclipses (DC 297b23–31) and by the fact that different stars are visible at different latitudes (DC 297b14–298a22).
The gathering of such observations is an important part of Aristotle’s scientific procedure (AnPr.46a17–22) and sets his theories above those of the ancients who lacked such “experience” (Phys.191a24–27). Just as in his biology, where Aristotle draws on animal anatomy observed at sacrifices (HA 496b25) and records reports from India (HA 501a25), so in his astronomy he cites Egyptian and Babylonian observations of the planets (DC 292a4–9). By gathering evidence from many sources, Aristotle is able to conclude that the stars and the Moon are spherical (DC 291b11–20) and that the Milky Way is an appearance produced by the sight of many stars moving in the outermost sphere (Meteor.346a16–24).
Assuming the hypothesis that the Earth does not move (DC 289b6–7), Aristotle argues that there are in the heavens both stars, which are large and distant from earth, and planets, which are smaller and closer. The two can be distinguished since stars appear to twinkle while planets do not (Aristotle somewhat mysteriously attributes the twinkling of the stars to their distance from the eye of the observer) (DC 290b14–24). Unlike earthly creatures, which move because of their distinct organs or parts, both the moving stars and the unmoving heaven that contains them are spherical (DC 289a30–b11). As opposed to superlunary (eternal) substances, sublunary beings, like clouds and human beings, participate in the eternal through coming to be and passing away. In doing so, the individual or primary substance is not preserved, but rather the species or secondary substance is preserved (as we shall see below, the same thought is utilized in Aristotle’s explanation of biological reproduction) (Gen et Corr.338b6–20).
Aristotle holds that the Earth is composed of four spheres, each of which is dominated by one of the four elements. The innermost and heaviest sphere is predominantly earth, on which rest the upper spheres of water, air, and fire. The sun acts to burn up or vaporize the water, which rises to the upper spheres when heated, but when cooled later condenses into rain (Meteor.354b24–34). If unqualified necessity is restricted to the superlunary sphere, teleology—the seeking of ends that may or may not be brought about—seems to be limited to the sublunary sphere.
Due to his belief that the Earth is eternal, being neither created nor destroyed, Aristotle holds that the epochs move cyclically in patterns of increase and decrease (Meteor.351b5–19). Aristotle’s cyclical understanding of both natural and human history is implicit in his comment that while Egypt used to be a fertile land, it has over the centuries grown arid (Meteor.351b28–35). Indeed, parts of the world that are ocean periodically become land, while those that are land are covered over by ocean (Meteor.253a15–24). Because of periodic catastrophes, all human wisdom that is now sought concerning both the arts and divine things was previously possessed by forgotten ancestors. However, some of this wisdom is preserved in myths, which pass on knowledge of the divine by allegorically portraying the gods in human or animal form so that the masses can be persuaded to follow laws (Met.1074a38-b14, compare Meteor.339b28–30, Pol.1329b25).
Aristotle’s geology or earth science, given in the latter books of his Meteorology, offers theories of the formation of oceans, of wind and rainfall, and of other natural events such as earthquakes, lightning, and thunder. His theory of the rainbow suggests that drops of water suspended in the air form mirrors, which reflect the multiply-colored visual ray that proceeds from the eye, though without its proper magnitude (Meteor.373a32–373b34). Though the explanations given by Aristotle of these phenomena contradict those of modern physics, his careful observations often give interest to his account.
Aristotle’s material science offers the first description of what are now called non-Newtonian fluids—honey and must—which he characterizes as liquids in which earth and heat predominate (Meteor.385b1–5). Although the Ancient Greeks did not distill alcohol, he reports on the accidental distillation of some ethanol from wine (“sweet wine”), which he observes is more combustible than ordinary wine (Meteor.387b10–14). Finally, Aristotle’s material science makes an informative distinction between compounds, in which the constituents maintain their identity, and mixtures, in which one constituent comes to dominate or in which a new kind of material is generated (see Sharvy 1983 for discussion). Though it would be inaccurate to describe him as a methodological empiricist, Aristotle’s collection and careful recording of observations shows that in all of his scientific endeavors, his explanations were designed to accord with publicly observable natural phenomena.
ii. Biology
The phenomenon of life, as opposed to inanimate nature, involves distinctive types of change (Phys.244b10–245a5) and thus requires distinctive types of explanation. Biological explanations should give all four causes of an organism or species—the material of which it is composed, the processes that bring it about, the particular form it has, and its purpose. For Aristotle, the investigation of individual organisms gives one causal knowledge since the individuals belong to a natural kind. Men and horses both have eyes, which serve similar functions in each of them, but because their species are different, a man’s eye is similar to the eyes of other men, while a horse’s eyes are similar to the eyes of other horses (HA 486a15–20). Biology should explain both why homologous forms exist in different species and the ways in which they differ, and thereby give the causes for the persistence of each natural kind of living thing.
Although all four causes are relevant in biology, Aristotle tends to group final causes with formal causes in teleological explanations, and material causes with efficient causes in mechanical explanations. Boylan (section 4) shows, for example, that Aristotle’s teleological explanation of respiration is that it exists in order to bring air into the body to produce pneuma, which is the means by which an animal moves itself. Aristotle’s mechanical explanation is that air that has been heated in the lungs is pushed out by colder air outside the body (On Breath 481b10–16, PA 642a31–b4).
Teleological explanations are necessary conditionally; that is, they depend on the assumption that the biologist has correctly identified the end for the sake of which the organism behaves as it does. Mechanical explanations, in distinction, have absolute necessity in the sense that they require no assumptions about the purpose of the organism or behavior. In general, however, teleological explanations are more important in biology (PA 639b24–26), because making a distinction between living and inanimate things depends on the assumption that “nature does nothing in vain” (GA 741b5).
The final cause of each kind corresponds to the reason that it continues to persist. As opposed to superlunary, eternal substances, sublunary living things cannot preserve themselves individually or, as Aristotle puts it, “in number.” Nevertheless, because living is better than not living (EN 1170b2–5), each individual has a natural drive to preserve itself “in kind.” Such a drive for self-preservation is the primary way in which living creatures participate in the divine (DA 415a25–b7). Nutrition and reproduction therefore are, in Aristotle’s philosophy, value-laden and goal-directed activities. They are activated, whether consciously or not, for the good of the species, namely for its continuation, in which it imitates the eternal things (Gen et Corr.338b12–17). In this way, life can be considered to be directed toward and imitative of the divine (DC 292b18–22).
This basic teleological or goal-directed orientation of Aristotle’s biology allows him to explain the various functions of living creatures in terms of their growth and preservation of form. Perhaps foremost among these is reproduction, which establishes the continuity of a species through a generation. As Aristotle puts it, the seed is temporally prior to the fully developed organism, since each organism develops from a seed. But the fully developed organism is logically prior to the seed, since it is the end or final cause, for the sake of which the seed is produced (PA 641b29–642a2).
In asexual reproduction in plants and animals, the seed is produced by an individual organism and implanted in soil, which activates it and thus actualizes its potentiality to become an organism of the kind from which it was produced. Aristotle thus utilizes a conception of “type” as an endogenous teleonomic principle, which explains why an individual animal can produce other animals of its own type (Mayr 1982, 88). Hence, the natural kind to which an individual belongs makes it what it is. Animals of the same natural kind have the same form of life and can reproduce with one another but not with animals of other kinds.
In animal sexual reproduction, Aristotle understands the seed possessed by the male as the source or principle of generation, which contains the form of the animal and must be implanted in the female, who provides the matter (GA 716a14–25). In providing the form, the male sets up the formation of the embryo in the matter provided by the female, as rennet causes milk to coagulate into cheese (GA 729a10–14). Just as rennet causes milk to separate into a solid, earthy part (or cheese), and a fluid, watery part (or whey), so the semen causes the menstrual fluid to set. In this process, the principle of growth potentially contained in the seed is activated, which, like a seed planted in soil, produces an animal’s body as the embryo (GA 739b21–740a9).
The form of the animal, its psyche, may thus be said to be potentially in the matter, since the matter contains all the necessary nutrients for the production of the complete organism. However, it is invariably the male that brings about the reproduction by providing the principle of the perceptual soul, a process Aristotle compares with the movement of automatic puppets by a mover that is not in the puppet (GA 741b6–15). (Whether the female produces the nutritive psyche is an open question.) Thus, form or psyche is provided by the male, while the matter is provided by the female: when the two come together, they form a hylomorphic product—the living animal.
While the form of an animal is preserved in kind by reproduction, organisms are also preserved individually over their natural lifespans through feeding. In species that have blood, feeding is a kind of concoction, in which food is chewed and broken down in the stomach, then enters the blood, and is finally cooked up to form the external parts of the body. In plants, feeding occurs by the nutritive psyche alone. But in animals, the senses exist for the sake of detecting food, since it is by the senses that animals pursue what is beneficial and avoid what is harmful. In human beings, a similar explanation can be given of the intellectual powers: understanding and practical wisdom exist so that human beings might not only live but also enjoy the good life achievable by action (Sens.436b19–437a3).
Although Aristotle’s teleology has been criticized by some modern biologists, others have argued that his biological work is still of interest to naturalists. For example, Haldane (1955) shows that Aristotle gave the earliest report of the bee waggle dance, which received a comprehensive explanation only in the twentieth-century work of von Frisch. Aristotle also observed lordosis behavior in cattle (HA 572b1–2) and notes that some plants and animals are divisible (Youth and Old Age 468b2–15), a fact that has been vividly illustrated in modern studies of planaria. Even when Aristotle’s biological explanations are incorrect, his observations may be of enduring value.
iii. Psychology
Psychology is the study of the psyche, which is often translated as “soul.” While prior philosophers were interested in the psyche as a part of political inquiry, for Aristotle, the study of the psyche is part of natural science (Ibn Bajjah 1961, 24), continuous with biology. This is because Aristotle conceives of the psyche as the form of a living being, the body being its material. Although the psyche and body are never really separated, they can be given different descriptions. For example, the passion of anger can be described physiologically as a boiling of the blood around the heart, while it can be described dialectically as the desire to pay back with pain someone who has insulted one (DA 403a25–b2). While the physiologist examines the material and efficient causes, the dialectician considers only the form and definition of the object of investigation (DA 403a30–b3). Since the psyche is “the first principle of the living thing” (DA 402a6–7), neither the dialectical method nor the physiological method nor a combination of the two is sufficient for a systematic account of the psyche (DA 403a2, b8). Rather than relying on dialectical or materialist speculation, Aristotle holds that demonstration is the proper method of psychology, since the starting point is a definition (DA 402b25–26), and the psyche is the form and definition of a living thing.
Aristotle conceives of psychology as an exact science, with greater precision than the lesser sciences (DA 402a1–5), and accordingly offers a complete sequence of the kinds or “parts” of psyche. The nutritive psyche—possessed by both plants and animals—is responsible for the basic functions of nourishment and reproduction. Perception is possible only in an animal that also has the nutritive power that allows it to grow and reproduce, while desire depends on perceiving the object desired, and locomotion depends on desiring objects in different locations (DA 415a1–8). The higher intellectual powers of judgment and understanding itself exist only in humans, who also have the lower powers.
The succession of psychological powers ensures the completeness, order, and necessity of the relations of psychological parts. Like rectilinear figures, which proceed from triangles to quadrilaterals, to pentagons, and so forth, without there being any intermediate forms, there are no other psyches than those in this succession (DA 414b20–32). This demonstrative approach ensures that although the methods of psychology and physiology are distinct, psychological divisions map onto biological distinctions. For Aristotle, the parts of the psyche are not separable or “modular” but related genetically: each posterior part of the psyche “contains” the parts before it, and each lower part is a necessary but not sufficient condition for possession of the part that comes after it.
The psyche is defined by Aristotle as the first actuality of a living animal, that is, the form of a natural body potentially having life (DA 412a19–22). This form is possessed even when it is not being used; for example, a sleeping person has the power to hear a melody, though while he is sleeping, he is not exercising the power. In distinction, though a corpse looks just like a sleeping body, it has no psyche, since it lacks the power to respond to such stimuli. The second actuality of an animal comes when the power is actually exercised, such as when one actually hears the melody (DA 417b9–16).
Perception is the reception of the form of an object of perception without its matter, just as wax receives the seal of a ring without its iron or gold (DA 424a17–28). When one sees wine, for example, one perceives something dark and liquid without becoming dark and liquid. Some hold that Aristotle thinks the reception of the form happens in matter so that part of the body becomes like the object perceived (for example, one’s eye might be dark while one is looking at wine). Others hold that Aristotelian perception is a spiritual change so that no bodily change is required. But presumably one is changing both bodily and spiritually all the time, even when one is not perceiving. Consequently, the formulation that perception is of “form without matter” is probably not intended to describe physiological or spiritual change but rather to indicate the conceptual nature of perception. For, as discussed in the section on first philosophy below, Aristotle considers forms to be definitions or concepts; for example, one defines “horse” by articulating its form. If he is using “form” in the same way in his discussion of perception, he means that in perceiving something, such as in seeing a horse, one gains an awareness of it as it is; that is, one grasps the concept of the horse. In that case, all the doctrine means is that perception is conceptual, giving one a grasp not just of parts of perceptible objects, say, the color and shape of a horse, but of the objects themselves, that is, of the horse as horse. Indeed, Aristotle describes perception as conferring knowledge of particulars and in that sense being like contemplation (DA 417b19–24).
This theory of perception distinguishes three kinds of perceptible objects: proper sensibles, which are perceived only by one sense modality; common sensibles, which are perceived by all the senses; and accidental sensibles, which are facts about the sensible object that are not directly given (DA 418a8–23). For example, in seeing wine, its color is a proper sensible, its volume a common sensible, and the fact that it belongs to Callias an accidental sensible. While one normally could not be wrong about the wine’s color, one might overestimate or underestimate its volume under nonstandard conditions, and one is apt to be completely wrong about the accidental sensible (for example, Callias might have sold the wine).
The five senses are distinguished by their proper sensibles: though the wine’s color might accidentally make one aware that it is sweet, color is proper to sight and sweetness to taste. But this raises a question: how do the different senses work together to give one a coherent experience of reality? If they were not coordinated, then one would perceive each quality of an object separately, for example, darkness and sweetness without putting them together. However, actual perceptual experience is coordinated: one perceives wine as both dark and sweet. In order to explain this, Aristotle says that they must be coordinated by the central sense, which is probably located in the body’s central organ, the heart. When one is awake, and the external sense organs are functioning normally, they are coordinated in the heart to discern reality as being the way it is (Sens.448b31–449a22).
Aristotle claims that one hears that one hears and sees that one sees (DA 425b12–17). Though there is a puzzle as to whether such higher-order seeing is due to sight itself or to the central perceptual power (compare On Sleep 455a3–26), the higher-order perception counts as an awareness of how the perceptual power grasps an object in the world. Though later philosophers named this higher-order perception “consciousness” and argued that it could be separated from an actualized perception of a real object, for Aristotle it is intrinsically dependent on the first-order grasp of an object (Nakahata 2014, 109–110). Indeed, Aristotle describes perceptual powers as being potentially like the perceptual object in actuality (DA 418a3–5) and goes so far as to say that the activity of the external object and that of the perceptual power are one, though what it is to be each one is different (DA 425b26–27). Thus, consciousness seems to be a property that arises automatically when perception is activated.
In at least some animals, the perceptual powers give rise to other psychological powers that are not themselves perceptual in a strict sense. In one simple case, the perception of a color is altered by its surroundings, that is, by how it is illuminated and by the other colors in one’s field of vision. Far from assuming the constancy of perception, Aristotle notes that under such circumstances, one color can take the place of another and appear differently than it does under standard conditions, for example, of full illumination (Meteor.375a22–28).
Memory is another power that arises through the collection of many perceptions. Memory is an affection of perception (though when the content of the memory is intellectual, it is an affection of the judgmental power of the psyche, see Mem.449b24–25), produced when the motion of perception acts like a signet ring in sealing wax, impressing itself on an animal and leaving an image in the psyche (Mem.450a25–b1). The resultant image has a depictive function so that it can be present even when the object it portrays is absent: when one remembers a person, for example, the memory-image is fully present in one’s psyche, though the person might be absent (Mem.450b20–25).
Closely related to memory, the imagination is a power to present absent things to oneself. Identical neither to perception nor to judgment (DA 427b27–28, 433a10), imagining has an “as if” quality. For example, imagining a terror is like looking at a picture without feeling the corresponding emotion of fear (DA 427b21–24). Imagination may be defined as a kind of change or motion that comes about by means of activated perception (DA 429a1–2). This does not entail that imagination is merely reproductive but simply that activated perceptions trigger the imagination, which in turn produces an image or appearance “before our eyes” (DA 427b19–20). The resultant appearance that “comes to be for us” (DA 428a1–2, 11–12) can be true or false, since, unlike the object of perception, what is imagined is not present (Humphreys 2019).
Human beings are distinct from other animals, Aristotle says, in their possession of rational psyche. Foremost among the rational powers is intellect or understanding (this article uses the terms interchangeably), which grasps universals in a way that is analogous to the perceptual grasp of particulars. However, unlike material particulars grasped by perception, universals are not mixed with body and are thus in a sense contained in the psyche itself (DA 417b22–24, 432a1–3). This has sometimes been called the intentional inexistence of an object, or intentionality, the property of being directed to or about something. Since one can think or understand any universal, the understanding is potentially about anything, like an empty writing tablet (DA 429b29–430a1).
The doctrine of the intentionality of intellect leads Aristotle to make a distinction between two kinds of intellect. Receptive or passive intellect is characterized by the ability to become like all things and is analogous to the writing tablet. Productive or active intellect is characterized by the ability to bring about all things and is analogous to the act of writing. The active intellect is thus akin to the light that illuminates objects, making them perceptible by sight. Aristotle holds that the soul never thinks without an image produced by imagination to serve as its material. Thus, in understanding something, the productive intellect actuates the receptive intellect, which stimulates the imagination to produce a particular image corresponding to the universal content of the understanding. Hence, while Aristotle describes the active intellect as unaffected, separate, and immaterial, it serves to bring to completion the passive intellect, the latter of which is inseparable from imagination and hence from perception and nutrition.
Aristotle’s insistence that intellect is not a subject of natural science (PA 641a33–b9) motivates the view that thinking requires a contribution from the supernatural or divine. Indeed, in Metaphysics (1072b19–30) Aristotle argues that intellect actively understanding the intelligible is the everlasting God. For readers like the medieval Arabic commentator Ibn Rushd, passive intellect is spread like matter among thinking beings. This “material intellect” is activated by God, the agent intellect, so that when one is thinking, one participates in the activity of the divine intellect. According to this view, every act of thinking is also an act of divine illumination in which God actuates one’s thinking power as the writer actuates a blank writing tablet.
However, in other passages Aristotle says that when the body is destroyed, the soul is destroyed too (Length and Shortness of Life, 465b23–32). Thus, it seems that Aristotle’s psychological explanations assume embodiment and require that thinking be something done by the individual human being. Indeed, Aristotle argues that if thinking is either a kind of imaginative representation or impossible without imagination, then it will be impossible without body (DA 403a8–10). But the psyche never thinks without imagination (DA 431a16–17). It seems to follow that far from being a part of the everlasting thinking of God, human thinking is something that happens in a living body and ends when that body is no longer alive. Thus, Jiminez (2014, 95–99) argues that thinking is embodied in three ways: it is preceded by bodily processes, simultaneous with embodied processes, and anticipatory of bodily processes, namely intentional actions. For further discussion see Jiminez (2017).
The whole psyche governs the characteristic functions and changes of a living thing. The nutritive psyche is the formal cause of growth and metabolism and is shared by plants, while the perceptual psyche gives rise to desire, which causes self-moving animals to act. When one becomes aware of an apparent good by perception or imagination, one forms either an appetite, the desire for pleasure, or thumos, the spirited desire for revenge or honor. A third form of desire, wish, is the product of the rational psyche (DA 433a20–30).
Boeri has pointed out that Aristotle’s psychology cuts a middle path between physicalism, which identifies the psyche with body, and dualism, which posits the independent existence of the soul and body. By characterizing the psyche as he does, Aristotle can at once deny that the psyche is a body but also insist that it does not exist without a body. The living body of an animal can thus be thought of as a form that has been “materialized” (Boeri 2018, 166–169).
b. Mathematics
Aristotle was educated in Plato’s Academy, in which it was commonly argued that mathematical objects like lines and numbers exist independently of physical beings and are thus “separable” from matter. Aristotle’s conception of the hierarchy of beings led him to reject Platonism since the category of quantity is posterior to that of substance. But he also rejects nominalism, the view that mathematical things are not real. Against both positions, Aristotle argues that mathematical things are real but do not exist separately from sensible bodies (Met.1090a29–30, 1093b27–28). Mathematical objects thus depend on the things in which they inhere and have no separate or independent being (Met.1059b12–14).
Although mathematical beings are not separate from the material cosmos, when the mathematician defines what it is to be a sphere or circle, he does not include a material like gold or bronze in the definition, because it is not the gold ball or bronze ring that the mathematician wants to define. The mathematician is justified in proceeding in this way, because although there are no separate entities beyond the concrete thing, it is just the mathematical aspects of real things that are relevant to mathematics (DC 278a2–6). This process by which the material features of a substance are systematically ignored by the mathematician, who focuses only on the quantitative features, Aristotle describes as “abstraction.” Because natural science always involves final ends, no such abstraction is possible there (PA 641b11–13, Phys.193b31–35). A consequence of this abstraction is that “why” questions in mathematics are invariably answered not by providing a final cause but by giving the correct definition (Phys.198a14–21, 200a30–34).
One reason that Aristotle believes that mathematics must proceed by abstraction is that he wants to prevent a multiplication of entities. For example, he does not want to say that, in addition to there being a sphere of bronze, there is another separate, mathematical sphere, and that in addition to that sphere, there is a separate mathematical plane cutting it, and that in addition to that plane, there is an additional line limiting the plane (see Katz 2014). It is enough for a mathematical ontology simply to acknowledge that natural objects have real mathematical properties not separate in being, which can nevertheless be studied independently from natural investigation. Aristotle also favors this view due to his belief that mathematics is a demonstrative science. Aristotle was aware that geometry uses diagrammatic representations of abstracted properties, which allow one to grasp how a demonstration is true not just of a particular object but of any class of objects that share its quantitative features (Humphreys 2017). Through the concept of abstraction, Aristotle could explain why a particular diagram may be used to prove a universal geometrical result.
Why study mathematics? Although Aristotle rejected the Platonic doctrine that mathematical beings are separate, intermediate entities between perceptible things and forms, he agreed with the Platonists that mathematics is about things that are beautiful and good, since it offers insight into the nature of arrangement, symmetry, and definiteness (Met.1078a31–b6). Thus, the study of mathematics reveals that beauty is not so much in the eye of the beholder as it is in the nature of things (Hoinski and Polansky 2016, 51–60). Moreover, Aristotle holds that mathematical beings are all potential objects of the intellect, which exist only potentially when they are not understood. The activity of understanding is the actuation of their being, but also actuates the intellect (Met.1051a26–33). Mathematics, then, not only gives insight into beauty but is also a source of intellectual pleasure, since gaining mathematical knowledge exercises the human being’s best power.
c. First Philosophy
In addition to natural and mathematical sciences, there is a science of independent beings that Aristotle calls “first philosophy” or “wisdom.” What is the proper aim of this science? In some instances, Aristotle seems to say that it concerns being insofar as it is (Met.1003a21–22), whereas in others, he seems to consider it to be equivalent to “theology,” restricting contemplation to the highest kind of being (Met.1026a19–22), which is unchanging and separable from matter. However, Menn (2013, 10–11) shows that Aristotle is primarily concerned with describing first philosophy as a science that seeks the causes and sources of being qua being. Hence, when Aristotle holds that wisdom is a kind of rational knowledge concerning causes and principles (Met.982a1–3), he probably means that the investigation of these causes of being as being seeks to discover the divine things as the cause of ordinary beings. First philosophy is consequently quite unlike natural philosophy and mathematics, since rather than proceeding from systematic observation or from hypotheses, it begins with an attitude of wonder towards ordinary things and aims to contemplate them not under a particular description but simply as beings (Sachs 2018).
The fundamental premise of this science is the law of noncontradiction, which states that something cannot both be and not be (Met.1006a1). Aristotle holds that this law is indemonstrable and necessary to assume in any meaningful discussion about being. Consequently, a person who demands a demonstration of this principle is no better than a plant. As Anscombe (1961, 40) puts it, “Aristotle evidently had some very irritating people to argue with.” But as Anscombe also points out, this principle is what allows Aristotle to make a distinction between substances as the primary kind of being and accidents that fall in the other categories. While it is possible for a substance to take on contrary accidents, for example, coffee first being hot and later cold, substances have no contraries. The law requires that a substance either is or is not, independently of its further, accidental properties.
Aristotle insists that in order for the word “being” to have any meaning at all, there must be some primary beings, whereas other beings modify these primary beings (Met.1003b6–10). As we saw in the section on Aristotle’s logic, primary substances are individual substances while their accidents are what is predicated of them in the categories. This takes on metaphysical significance when one thinks of this distinction in terms of a dependence relation in which substances can exist independently of their accidents, but accidents are dependent in being on a substance. For example, a shaggy dog is substantially a dog, but only accidentally shaggy. If it lost all its hair, it would cease to be shaggy but would be no less a dog: it would then be a non-shaggy dog. But if it ceased to be a dog—for example, if it were turned into fertilizer—then it would cease to be shaggy at the same moment. Unlike the “shagginess,” “dogness” cannot be separated from a shaggy dog: the “what it is to be” a dog is the dog’s dogness in the category of substance, while its accidents are in other categories, in this case shagginess being in the category of quality (Met.1031a1–5).
Given that substances can be characterized as forms, as matter, or as compounds of form and matter, it seems that Aristotle gives the cause and source of a being by listing its material and formal cause. Indeed, Aristotle sometimes describes primary being as the “immanent form” from which the concrete primary being is derived (Met.1037a29). This probably means that a primary substance is always a compound, its formal component serving as the substance’s final cause. However, primary beings are not composed of other primary beings (Met.1041a3–5). Thus, despite some controversy on the question, there seems to be no form of an individual, form being what is shared by all the individuals of a kind.
A substance is defined by a universal, and thus when one defines the form, one defines the substance (Met.1035b31–1036a1). However, when one grasps a substance directly in perception or thought, one grasps the compound of form and matter (Met.1036a2–8). But since form by itself does not make a primary substance, it must be immanent—that is, compounded with matter—in each individual, primary substance. In a form-matter compound, such as a living thing, the matter is both the prior stuff out of which the thing has become and the contemporaneous stuff of which it is composed. The form is what makes what a thing is made of, its matter, into that thing (Anscombe 1961, 49, 53).
Due to this hylomorphic account, one might worry that natural science seems to explain everything there is to explain about substances. However, Aristotle insists that there is a kind of separable and immovable being that serves as the principle or source of all other beings, which is the special object of wisdom (Met.1064a35–b1). This being might be called the good itself, which is implicitly pursued by substances when they come to be what they are. In any case, Aristotle insists that this source and first of beings sets in motion the primary motion. But since whatever is in motion must be moved by something else, and the first thing is not moved by something else, it is itself motionless (Met.1073a25–34). As we have seen, even the human intellect is “not affected” (DA 429b19–430a9), producing its own object of contemplation in a pure activity. Following this, Aristotle describes the primary being as an intellect or a kind of intellect that “thinks itself” perpetually (Met.1072b19–20). Thus, we can conceive of the Aristotelian god as being like our own intellect but unclouded by what we undergo as mortal, changing, and fallible beings (Marx 1977, 7–8).
4. Practical Philosophy
Practical philosophy is distinguished from theoretical philosophy both in its goals and in its methods. While the aim of theoretical philosophy is contemplation and the understanding of the highest things, the aim of practical philosophy is good action, that is, acting in a way that constitutes or contributes to the good life. But human beings can only thrive in a political community: the human is a “political animal” and thus the political community exists by nature (Pol.1253a2–5, compare EN 1169b16–19). Thus, ethical inquiry is part of political inquiry into what makes the best life for a human being. Because of the intrinsic variability and complexity of human life, however, this inquiry does not possess the exactness of theoretical philosophy (EN 1094b10–27).
Much as animals in his biology are said to seek characteristic ends, Aristotle holds in his “ergon argument” that the human being has a proper ergon—work or function (EN 1097b24–1098a18). Just as craftsmen like flautists and sculptors and bodily organs like eyes and ears have a peculiar work they do, so the human being must do something peculiarly human. Such a function is definitive; that is, it distinguishes what it is to be the thing that carries it out. For example, a flautist is a flautist insofar as she plays the flute. But the function also serves as an implicit success condition for being that thing. For example, what makes a flautist good as what she is (“good qua flautist,” one might say) is that she plays the flute well. Regardless of the other work she does in her other capacities (qua human, qua friend, and so forth), the question “is she a good flautist?” can be answered only in reference to the ergon of the flautist, namely flute playing.
The human function cannot be nutrition or perception, since those activities are shared with other living things. Since other animals lack reason, the human function must be an activity of the psyche not without reason. A human being that performs this function well will be functioning well as a human being. In other words, by acting virtuously one will by that fact achieve the human good (Angier 2010, 60–61). Thus, Aristotle can summarize the good life as consisting of activities and actions in accordance with arete—excellence or virtue—and the good for the human being as the activity of the psyche in accordance with excellence in a complete life (EN 1098a12–19). Though it has sometimes been objected that Aristotle assumes without argument that human beings must have a characteristic function, Angier (2010, 73–76) has shown that the key to Aristotle’s argument is his comparison of the human function to a craft: just as a sculptor must possess a wide variety of subordinate skills to achieve mastery in his specialized activity, so in acting well the human being must possess an inclusive set of dispositions and capacities that serve to fulfill the specialized task of reason.
Ethics and politics are, however, not oriented merely to describing human behavior but to saying what ends human beings ought to pursue, that is, what constitutes the good life for man. While the many, who have no exposure to philosophy, agree that the good life consists in eudaimonia—happiness or blessedness—there is disagreement as to what constitutes this state (EN 1095a18–26). The special task of practical philosophy is therefore to say what the good life consists in, that is, to give a more comprehensive account of eudaimonia than is available from the observation of the diverse ends pursued by human beings. As Baracchi (2008, 81–83) points out, eudaimonia indicates a life lived under the benevolent or beneficial sway of the daimonic, that is, of an order of existence beyond the human. Thus, the view that eudaimonia is a state of utmost perfection and completion for a human being (Magna Moralia 1184a14, b8) indicates that the full actualization of a human depends on seeking something beyond what is strictly speaking proper to the human.
a. Habituation and Excellence
Though the original meaning of ethics has been obscured due to modern confusion of pursuing proper ends with following moral rules, in the Aristotelian works, ethical inquiry is limited to the investigation of what it is for a human being to flourish according to her own nature. For the purposes of this inquiry, Aristotle distinguishes three parts of the psyche: passions, powers, and habits (EN 1105b20). Passions include attitudes such as feeling fear, hatred, or pity for others, while powers are those parts of our form that allow us to have such passions and to gain knowledge of the world. However, while all human beings share passions and powers, they differ with regard to how they are trained or habituated and thus with respect to their dispositions or states of character. Those who are habituated correctly are said to be excellent and praiseworthy, while those whose characters are misshapen through bad habituation are blameworthy (EN 1105b28–1106a2).
How does a human being become good, cultivating excellence within herself? Aristotle holds that this happens by two related but distinct mechanisms. Intellectual excellences arise by teaching, whereas excellences of character, such as moderation and courage, arise by ethos, habituation, or training (EN 1103a14–26). Since pleasure or pain results from each of our activities (EN 1104b4), training happens through activity; for example, one learns to be just by doing just things (EN 1103a35–b36). Legislators, who aim to make citizens good, therefore must ensure that citizens are trained from childhood to produce certain good habits—excellences of character—in them (EN 1103b23–25).
Such training takes place via pleasure and pain. If one is brought up to take pleasure or suffer pain in certain activities, one will develop the corresponding character (EN 1104b18–25). This is why no one becomes good unless one does good things (EN 1105b11–12). Rather than trying to answer the question of why one ought to be good in the abstract, Aristotle assumes that taking pleasure in the right kinds of activities will lead one to have a good life, where “right kinds” means those activities that contribute to one’s goal in life. Hence the desires of children can be cultivated into virtuous dispositions by providing rewards and punishments that induce them to follow good reason (EN 1119b2–6).
Since Aristotle conceives of perception as the reception of the perceived object’s form without its matter, to perceive correctly is to grasp an object as having a pleasurable or painful generic form (DA 424a17–19, 434a27–30). The cognitive capacity of perception and the motive capacity of desire are linked through pleasure, which is also “in the soul” (EE 1218b35). Excellence is not itself a pleasure but rather a deliberative disposition to take pleasure in certain activities, a mean between extreme states (EN 1106b36–1107a2).
Although he offers detailed descriptions of the virtues in his ethical works, Aristotle summarizes them in a table:
Excess | Mean | Deficiency
Irascibility | Gentleness | Spiritlessness
Rashness | Courage | Cowardice
Shamelessness | Modesty | Diffidence
Profligacy | Temperance | Insensitiveness
Envy | Righteous Indignation | Malice
Greed | Justice | Loss
Prodigality | Liberality | Meanness
Boastfulness | Honesty | Self-deprecation
Flattery | Friendliness | Surliness
Subservience | Dignity | Stubbornness
Luxuriousness | Hardness | Endurance
Vanity | Greatness of Spirit | Smallness of Spirit
Extravagance | Magnificence | Shabbiness
Rascality | Prudence | Simpleness
This shows that each excellence is a mean between excessive and defective states of character (EE 1220b35–1221a15). Accordingly, good habituation is concerned with avoiding extreme or pathological states of character. Thus, Aristotle can say that ethical excellence is “concerned with pleasures and pains” (EN 1104b8–11), since one possesses the excellence in question whenever one has been properly trained to take the correct pleasure and to suffer the correct pain when one acts in excess or defect.
b. Ethical Deliberation
Human action displays excellence only when it is undertaken voluntarily, that is, is chosen as the means to bring about a goal wished for by the agent. Excellence in general is thus best understood as a disposition to make correct choices (EN 1106b36–1107a2), where “choice” is understood as the product of deliberation or what “has been deliberated upon” (EN 1113a4). Deliberation is not about ends but about what contributes to an end already given by one of the three types of desire discussed above: appetite, thumos, or wish (EN 1112b11–12, 33–34).
But if all excellent action must be chosen, how can actions undertaken in an instant, such as when one acts courageously, be excellent? Since such actions can be undertaken without the agent having undergone a prior process of conscious deliberation, which takes time, it seems that one must say that quick actions were hypothetically deliberated, that is, that they count as what one would have chosen to do had one had time to deliberate (Segvic 2008, 162–163).
Such reasoning can be schematized by the so-called “practical syllogism.” For example, supposing one accepts the premises
One should not drink heavy water
The water in this cup is heavy
The syllogism concludes with one’s not drinking water from the cup (EN 1142a22–23). If this is how Aristotle understands ethical deliberation, then it seems that all one’s voluntary actions count as deliberated even if one has not spent any time thinking about what to do.
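For readers who find formal notation helpful, the inference can also be displayed schematically. The following is an illustrative reconstruction, not Aristotle’s own notation, and it treats the conclusion as a proposition rather than an action (a complication taken up in the next paragraph):

\[
\forall x\,\big(\mathrm{Heavy}(x) \rightarrow \mathrm{NotToBeDrunk}(x)\big), \qquad \mathrm{Heavy}(c) \;\vdash\; \mathrm{NotToBeDrunk}(c)
\]

Here the variable ranges over waters, $c$ names the water in this cup, and the predicate names are stand-ins introduced only for this sketch.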
However, Contreras (2018, 341) points out that the “practical syllogism” cannot represent deliberation since its conclusion is an action, whereas the conclusion of deliberation is choice. Though one’s choice typically causes one to act, something external could prevent one from acting even once the choice has been made. Thus, neither are choice and action the same, nor are the processes or conditions from which they result identical. Moreover, even non-rational desires like appetite and thumos present things under the “guise of the good” so that whatever one desires appears to be good. Hence an action based on those desires could still be described by a practical syllogism, though it would not be chosen through deliberation. Deliberation does not describe a kind of deduction but a process of seeking things that contribute to an aim already presented under the guise of the good (Segvic 2008, 164–167).
This “seeking” aspect of deliberation is brought out in Aristotle’s comparison of the deliberator to the geometer, who searches and analyzes by diagrams (EN 1112b20–24). Geometrical analysis is the method by which a mathematician works backwards from a desired result to find the elements that constitute that result. Similarly, deliberation is a search for the elements that would allow the end one has in view to be realized (EN 1141b8–15).
However, while geometrical reasoning is abstracted from material conditions, the prospective reasoning of deliberation is constrained both modally and temporally. One cannot deliberate about necessities, since practical things must admit of being otherwise than they are (DA 433a29–30). Similarly, one cannot deliberate about the past, since what is chosen is not what has already come about—“no one chooses to have sacked Ilium”—but what may or may not come about in the future (EN 1139b5–9, DA 431b7–8). One can describe deliberation, then, as starting from premises in the future perfect tense, and as working backwards to discover what actions would make those statements true.
In addition to these constraints, the deliberating agent must have a belief about herself, namely that she is able either to bring about or not to bring about the future state in question (EN 1112a18–31). Because rational powers alone are productive of contrary effects, and deliberation produces a choice either to undertake or not to undertake a certain course of action, deliberation must be distinctively rational (Met.1048a2–11). In contrast to technical deliberation, the goal of which is to produce something external to the activity that brings it about, in ethical deliberation there is no external end, since good action is itself the end (EN 1140b7). So rather than concerning what an agent might produce externally, deliberation is ethical when it is about the agent’s own activity. Thus, deliberation ends when one has reached a decision, which may be immediately acted upon or put into practice later when the proper conditions arise.
c. Self and Others
Life will tend to go well for a person who has been habituated to the right kinds of pleasures and pains and who deliberates well about what to do. Unfortunately, this is not always sufficient for happiness. For although excellence might help one manage misfortunes well and avoid becoming miserable as their result, it is not reasonable to call someone struck with a major misfortune blessed or happy (EN 1100b33–1101a13). So there seems to be an element of luck in happiness: although bad luck cannot make one miserable, one must possess at least some external goods in order to be happy.
One could also ruin things by acting in ignorance. When one fails to recognize a particular as what it is, one might bring about an end one never intended. For example, one might set off a loaded catapult through one’s ignorance of the fact that it was loaded. Such actions are involuntary. But there is a more fundamental kind of moral ignorance for which one can be blamed, which is not the cause of involuntary actions but of badness (EN 1110b25–1111a11). In the first case, one does what one does not want to do because of ignorance, and so is not worthy of blame. In the second case, one does what one wants to do and is thus to be blamed for the action.
Given that badness is a form of ignorance about what one should do, it is reasonable to ask whether acting acratically, that is, doing what one does not want to do, just comes down to being ignorant. This is the teaching of Socrates, who, arguing against what appears to be the case, reduced acrasia to ignorance (EN 1145b25–27). Though Aristotle holds that acrasia is distinct from ignorance, he also thinks it is impossible for knowledge to be dragged around by the passions like a slave. Aristotle must, then, explain how being overcome by one’s passions is possible, when knowledge is stronger than the passions.
Aristotle’s solution is to limit acrasia to those cases in which one generically knows what to do but fails to act on it because one’s knowledge of sensibles is dragged along by the passions (EN 1147b15–19). In other words, he admits that the passions can overpower perceptual knowledge of particulars but denies that they can dominate intellectual knowledge of universals. Hence, like Socrates, Aristotle thinks of acrasia as a form of ignorance, though unlike Socrates, he holds that this ignorance is temporary and relates only to one’s knowledge of particulars. Acrasia consists, then, in being unruled with respect to thumos or with respect to sensory pleasures. In such cases, one is unruled because one’s passions or lower desires temporarily take over and prevent one from grasping things as one should (EN 1148a2–22). In this sense, acrasia represents a conflict between the reasoning and unreasoning parts of the psyche (for discussion see Weinman 2007, 95–99).
If living well and acting well are the same (EN 1095a18–20, EE 1219b1–4) and acting well consists in part in taking the proper pleasure in one’s action, then living well must be pleasurable. Aristotle thinks the pleasure one has in living well comes about through a kind of self-consciousness, that of being aware of one’s own activity. In such activity, one grasps oneself as the object of a pleasurable act of perception or contemplation and consequently takes pleasure in that act (Ortiz de Landázuri 2012). But one takes pleasure in a friend’s life and activity almost as one takes pleasure in one’s own life (EN 1170a15–b8). Thus, the good life may be accompanied not only by a pleasurable relation to oneself but also by relationships to others in which one takes a contemplative pleasure in their activities.
The value of friendship follows from the idea that when a person is a friend to himself, he wishes the good for himself and thus seeks to improve his own character. Only a person with such a healthy love of self can form a friendship with another person (EN 1166b25–29). Indeed, one’s attitudes towards a friend are based on one’s attitudes towards oneself (EN 1166a1–10), attitudes which are extended to another in the formation of a friendship (EN 1168b4–7). However, because people are by nature communal or political, in order to lead a complete life, one needs to form friendships with excellent people, and it is in living together with others that one comes to lead a happy life. When a true friendship between excellent persons is formed, each will regard the other with the same attitude with which he regards himself, and thus as “another self” (EN 1170b5–19).
Friendship is a bridging concept between ethics concerning the relations of individuals and political science, which concerns the nature and function of the state. For Aristotle, friendship holds a state together, so the lawgiver must focus on promoting friendship above all else (EN 1155a22–26). Indeed, when people are friends, they treat one another with mutual respect so that justice is unnecessary or redundant (EN 1155a27–29). Aristotle’s ethics are thus part of his political philosophy. Just as an individual’s good action depends on her taking the right kinds of pleasures, so a thriving political community depends on citizens taking pleasure in one another’s actions. Such love of others and mutual pleasure are strictly speaking neither egoistic nor altruistic. Instead, they rest on the establishment of a harmony of self and others in which the completion of the individual life and the life of the community amount to the same thing.
d. The Household and the State
Aristotle’s political philosophy stems from the idea that the political community or state is a creation of nature prior to the individual who lives within it. This is shown by the fact that the individual human being is dependent on the political community for his formation and survival. One who lives outside the state is either a beast or a god, that is, does not participate in what is common to humanity (Pol.1253a25–31). The political community is natural and essentially human, then, because it is only within this community that the individual realizes his nature as a human being. Thus, the state exists not only for the continuation of life but for the sake of the good life (Pol.1280a31–33).
Aristotle holds that the human being is a “political animal” due to his use of speech. While other gregarious animals have voice, which nature has fashioned to indicate pleasure and pain, the power of speech enables human beings to indicate not only this but also what is expedient and inexpedient and what is just and unjust (Pol.1253a9–18). Berns (1976, 188–189) notes that for Aristotle, the speech symbol’s causes are largely natural: the material cause of sound, the efficient cause of the living creatures that produce them, and the final cause of living together, are all parts of human nature. However, the formal cause, the distinctive way in which symbols are organized, is conventional. This allows for a variability of constitutions and hence the establishment of good or bad laws. Thus, although the state is natural for human beings, the specific form it takes depends on the wisdom of the legislator.
Though the various forms of constitution cannot be discussed here (for discussion, see Clayton, Aristotle: Politics), the purpose of the state is the good of all the citizens (Pol.1252a3), so a city is excellent when its citizens are excellent (Pol.1332a4). This human thriving is most possible, however, when the political community is ruled not by an individual but by laws themselves. This is because even the best rulers are subject to thumos, which is like a “wild beast,” whereas law itself cannot be perverted by the passions. Thus, Aristotle likens rule of law to the “rule of God and reason alone” (Pol.1287a16–32). Although this is the best kind of political community, Aristotle does not say that the best life for an individual is necessarily the political life. Instead he leaves open the possibility that the theoretical life, in which philosophy is pursued for its own sake, is the best way for a person to live.
The establishment of any political community depends on the existence of the sub-political sphere of the household, the productive unit in which goods are produced for consumption. Whereas the political sphere is a sphere of freedom and action, the household consists of relations of domination: that of the master and slave, that of marriage, and that of procreation. Hence household management or “economics” is distinct from politics, since the organization of the household has the purpose of production of goods rather than action (Pol.1253b9–14). Crucial to this household production is the slave, whom Aristotle defines as a living tool (Pol.1253b30–33) controlled by a master in order to produce the means necessary for the survival and thriving of the household and state. As household management, economics is concerned primarily with structuring slave labor, that is, with organizing the instruments of production so as to provide the property necessary for the superior, political life.
Aristotle thus offers a staunch defense of the institution of slavery. Against those who claim that slavery is contrary to nature, Aristotle argues that there are natural slaves, humans who are born to be ruled by others (Pol.1254a13–17). This can be seen by analogy: the body is the natural slave of the psyche, such that a good person exerts a despotic rule over his body. In the same way, humans ought to rule over other animals, males over females, and masters over slaves (Pol.1254a20–b25). But this is only natural when the ruling part is more noble than the part that is ruled. Thus, the enslavement of the children of conquered nobles by victors in a war is a mere convention since the children may possess the natures of free people. For Aristotle, then, slavery is natural and just only when it is in the interest of slave and master alike (Pol.1255b13–15).
The result of these doctrines is the view that political community is composed of “unlikes.” Just as a living animal is composed of psyche and body, and psyche is composed of a rational part and an appetite, so the family is composed of husband and wife, and property of master and slave. It is these relations of domination, in Aristotle’s view, that constitute the state, holding it together and making it function (Pol.1277a5–11). As noted in the biographical section, Aristotle had close ties to the expanding Macedonian empire. Thus his political philosophy, insofar as it is prescriptive of how a political community should be managed, might have been intended to be put into practice in the colonies established by Alexander. If that is the case, then perhaps Aristotle’s politics is at base a didactic project intended to teach an indefinite number of future legislators (Strauss 1964, 21).
5. Aristotle’s Influence
Aristotle and Plato were the most influential philosophers in antiquity, both because their works were widely circulated and read and because the schools they founded continued to exert influence for hundreds of years after their deaths. Aristotle’s school gave rise to the Peripatetic movement, with his student Theophrastus being its most famous member. In late antiquity, there emerged a tradition of commentators on Aristotle’s works, beginning with Alexander of Aphrodisias and later including the Neo-Platonists Simplicius, Syrianus, and Ammonius. Many of their commentaries have been edited and translated into English as part of the Ancient Commentators on Aristotle project.
In the middle ages, Aristotle’s works were translated into Arabic, which led to generations of Islamic Aristotelians, such as Ibn Bajjah and Ibn Rushd (see Alwishah and Hayes 2015). In the Jewish philosophical tradition, Maimonides calls Aristotle the chief of the philosophers and uses Aristotelian concepts to analyze the contents of the Hebrew Bible. Though Boethius’ Latin commentaries on Aristotle’s logical works were available from the sixth century onwards, the translation of Aristotle’s works into Latin in the twelfth and thirteenth centuries led to a revival of Aristotelian ideas in Europe. Indeed, a major controversy broke out at the University of Paris in the 1260s between the Averroists—followers of Ibn Rushd who held that all human beings share a single intellect—and those who held that the intellect is individual in each human being (see McInerny 2002). A further debate, concerning realism (the doctrine that universals are real) and nominalism (the doctrine that universals exist “in name” only), continued for centuries. Although they disagreed in their interpretations, prominent scholastics like Bacon, Buridan, Ockham, Scotus, and Aquinas tended to accept Aristotelian doctrines on authority, often referring to Aristotle simply as “The Philosopher.”
Beginning in the sixteenth century, the scholastics came under attack, particularly from natural philosophers, often leading to the disparagement of Aristotelian positions. Copernicus’ model made Earth not the center of the universe as in Aristotle’s cosmology but a mere satellite of the sun. Galileo showed that some of the predictions of Aristotle’s physical theory were incorrect; for example, heavier objects do not fall faster than lighter objects. Descartes attacked the teleological aspect of Aristotle’s physics, arguing for a mechanical conception of all of nature, including living things. Hobbes critiqued the theory of perception, which he believed unrealistically described forms or ideas as travelling through the air. Later, Hume disparaged causal powers as mysterious, thus undermining the conception of the four causes. Kantian and utilitarian ethics argued that duties to humanity rather than happiness were the proper norms for action. Darwin showed that species are not eternal, casting doubt on Aristotle’s conception of biological kinds. Frege’s logic in the late nineteenth century developed notions of quantification and predication that made the syllogism obsolete. By the beginning of the twentieth century, Aristotle seemed to have little relevance to modern philosophical concerns.
The latter part of the twentieth century, however, saw a slow but steady intellectual shift, which has led to a large family of neo-Aristotelian positions being defended by contemporary philosophers. Anscombe’s (1958) argument for a return to virtue ethics can be taken as a convenient starting point of this change. Anscombe’s claim, in summary, is that rule-based ethics of the deontological or utilitarian style is unconvincing in an era wherein monotheistic religions have declined, and commandments are no longer understood to issue from a divine authority. Modern relativism and nihilism on this view are products of the correct realization that without anyone making moral commandments, there is no reason to follow them. Since virtue ethics grounds morality in states of character rather than in universal rules, only a return to virtue ethics would allow for a morality in a secular society. In accordance with this modern turn to virtue ethics, neo-Aristotelian theories of natural normativity have increasingly been defended, for example, by Thompson (2008). In political philosophy, Arendt’s (1958) distinction between the public and private spheres takes the tension between the political community and household as a fundamental force of historical change.
In the 21st century, philosophers have drawn on Aristotle’s theoretical philosophy. Cartwright and Pemberton (2013) revive the concept of natural powers as part of the basic ontology of nature, powers that explain many of the successes of modern science. Umphrey (2016) argues for the real existence of natural kinds, which serve to classify material entities. Finally, the ‘Sydney School’ has adopted a neo-Aristotelian, realist ontology of mathematics that avoids the extremes of Platonism and nominalism (Franklin 2011). These philosophers argue that, far from being useless antiques, Aristotelian ideas offer fruitful solutions to contemporary philosophical problems.
6. Abbreviations
a. Abbreviations of Aristotle’s Works
Abbreviation | Latin Title | English Title
Cat. | Categoriae | Categories
Int. | Liber de interpretatione | On Interpretation
AnPr. | Analytica priora | Prior Analytics
AnPo. | Analytica posteriora | Posterior Analytics
Phys. | Physica | Physics
Met. | Metaphysica | Metaphysics
Meteor. | Meteorologica | Meteorology
DC | De Caelo | On the Heavens
HA | Historia Animalium | The History of Animals
Gen. et Corr. | De Generatione et Corruptione | On Generation and Corruption
EN | Ethica Nicomachea | Nicomachean Ethics
EE | Ethica Eudemia | Eudemian Ethics
DA | De Anima | On the Soul
MA | De Motu Animalium | On the Motion of Animals
Mem. | De Memoria | On Memory
Sens. | De Sensu et Sensibili | On Sense and its Objects
Pol. | Politica | Politics
Top. | Topica | Topics
Rhet. | Rhetorica | Rhetoric
Poet. | Poetica | Poetics
SophElen. | De Sophisticis Elenchis | Sophistical Refutations
b. Other Abbreviations
DL | Diogenes Laertius, The Life of Aristotle
Bekker | “August Immanuel Bekker.” Encyclopedia Britannica, 11th ed., vol. 3, Cambridge University Press, 1910, p. 661.
7. References and Further Reading
a. Aristotle’s Complete Works
Aristotelis Opera. Edited by A.I. Bekker, Clarendon, 1837.
Complete Works of Aristotle. Edited by J. Barnes, Princeton University Press, 1984.
b. Secondary Sources
i. Life and Early Works
Bos, A.P. “Aristotle on the Etruscan Robbers: A Core Text of ‘Aristotelian Dualism.’” Journal of the History of Philosophy, vol. 41, no. 3, 2003, pp. 289–306.
Chroust, A-H. “Aristotle’s Politicus: A Lost Dialogue.” Rheinisches Museum für Philologie, Neue Folge, 108. Bd., 4. H, 1965, pp. 346–353.
Chroust, A-H. “Eudemus or on the Soul: A Lost Dialogue of Aristotle on the Immortality of the Soul.” Mnemosyne, Fourth Series, vol. 19, fasc. 1, 1966, pp. 17–30.
Chroust, A-H. “Aristotle Leaves the Academy.” Greece and Rome, vol. 14, issue 1, April 1967, pp. 39–43.
Chroust, A-H. “Aristotle’s Sojourn in Assos.” Historia: Zeitschrift für Alte Geschichte, Bd. 21, H. 2, 1972, pp. 170–176.
Fine, G. On Ideas. Oxford University Press, 1993.
Jaeger, W. Aristotle: Fundamentals of the History of His Development. 2nd ed., Oxford: Clarendon Press, 1948.
Kroll, W., editor. Syrianus Commentaria in Metaphysica (Commentaria in Aristotelem Graeca, vol. VI, part I). Berolini, typ. et impensis G. Reimeri, 1902.
Lachterman, D.R. “Did Aristotle ‘Develop’? Reflections on Werner Jaeger’s Thesis.” The Society for Ancient Greek Philosophy Newsletter, vol. 33, 1980.
Owen, G.E.L. “The Platonism of Aristotle.” Studies in the Philosophy of Thought and Action, edited by P.F. Strawson, Oxford University Press, 1968, pp. 147–174.
ii. Logic
Bäck, A.T. Aristotle’s Theory of Predication. Leiden: Brill, 2000.
Cook Wilson, J. Statement and Inference, vol. 1. Clarendon, 1926.
Groarke, L.F. “Aristotle: Logic.” Internet Encyclopedia of Philosophy, www.iep.utm.edu/aris-log.
Ierodiakonou, K. “Aristotle’s Logic: An Instrument, Not a Part of Philosophy.” Aristotle: Logic, Language and Science, edited by N. Avgelis and F. Peonidis, Thessaloniki, 1998, pp. 33–53.
Lukasiewicz, J. Aristotle’s Syllogistic. 2nd ed., Clarendon, 1957.
Malink, M. Aristotle’s Modal Syllogistic. Harvard University Press, 2013.
iii. Theoretical Philosophy
Anscombe, G.E.M. and P.T. Geach. Three Philosophers. Cornell University Press, 1961.
Bianchi, E. The Feminine Symptom. Fordham University Press, 2014.
Boeri, M. D. “Plato and Aristotle on What Is Common to Soul and Body. Some Remarks on a Complicated Issue.” Soul and Mind in Greek Thought. Psychological Issues in Plato and Aristotle, edited by M.D. Boeri, Y.Y. Kanayama, and J. Mittelmann, Springer, 2018, pp. 153–176.
Boylan, M. “Aristotle: Biology.” Internet Encyclopedia of Philosophy, https://www.iep.utm.edu/aris-bio.
Cook, K. “The Underlying Thing, the Underlying Nature and Matter: Aristotle’s Analogy in Physics I 7.” Apeiron, vol. 22, no. 4, 1989, pp. 105–119.
Hoinski, D. and R. Polansky. “Aristotle on Beauty in Mathematics.” Dia-noesis, October 2016, pp. 37–64.
Humphreys, J. “Abstraction and Diagrammatic Reasoning in Aristotle’s Philosophy of Geometry.” Apeiron, vol. 50, no. 2, April 2017, pp. 197–224.
Humphreys, J. “Aristotelian Imagination and Decaying Sense.” Social Imaginaries, vol. 5, no. 1, Spring 2019, pp. 37–55.
Ibn Bajjah. Ibn Bajjah’s ‘Ilm al-Nafs (Book on the Soul). Translated by M.S.H. Ma’Sumi, Karachi: Pakistan Historical Society, 1961.
Ibn Rushd. Long Commentary on the De Anima of Aristotle. Translated by R.C. Taylor, Yale University Press, 2009.
Jiménez, E. R. “Mind in Body in Aristotle.” The Bloomsbury Companion to Aristotle, edited by C. Baracchi, Bloomsbury, 2014.
Jiménez, E. R. Aristotle’s Concept of Mind. Cambridge University Press, 2017.
Katz, E. “An Absurd Accumulation: Metaphysics M.2, 1076b11–36.” Phronesis, vol. 59, no. 4, 2014, pp. 343–368.
Marx, W. Introduction to Aristotle’s Theory of Being as Being. The Hague: Martinus Nijhoff, 1977.
Mayr, E. The Growth of Biological Thought. Harvard University Press, 1982.
Menn, S. “The Aim and the Argument of Aristotle’s Metaphysics.” Humboldt-Universität zu Berlin, 2013, www.philosophie.hu-berlin.de/de/lehrbereiche/antike/mitarbeiter/menn/contents.
Nakahata, M. “Aristotle and Descartes on Perceiving That We See.” The Journal of Greco-Roman Studies, vol. 53, no. 3, 2014, pp. 99–112.
Sachs, J. “Aristotle: Metaphysics.” Internet Encyclopedia of Philosophy, www.iep.utm.edu/aris-met.
Sharvy, R. “Aristotle on Mixtures.” The Journal of Philosophy, vol. 80, no. 8, 1983, pp. 439–457.
Waterlow, S. Nature, Change, and Agency in Aristotle’s Physics: A Philosophical Study. Clarendon, 1982.
Winslow, R. Aristotle and Rational Discovery. New York: Continuum, 2007.
iv. Practical Philosophy
Angier, T. Techne in Aristotle’s Ethics: Crafting the Moral Life. London: Continuum, 2010.
Baracchi, C. Aristotle’s Ethics as First Philosophy. Cambridge University Press, 2008.
Berns, L. “Rational Animal-Political Animal: Nature and Convention in Human Speech and Politics.” The Review of Politics, vol. 38, no. 2, 1976, pp. 177–189.
Clayton, E. “Aristotle: Politics.” Internet Encyclopedia of Philosophy, www.iep.utm.edu/aris-pol.
Contreras, K.E. “The Rational Expression of the Soul in the Aristotelian Psychology: Deliberating Reasoning and Action.” Eidos, vol. 29, 2018, pp. 339–365 (in Spanish).
Ortiz de Landázuri, M.C. “Aristotle on Self-Perception and Pleasure.” Journal of Ancient Philosophy, vol. 6, no. 2, 2012.
Segvic, H. From Protagoras to Aristotle. Princeton University Press, 2008.
Strauss, L. The City and Man. University of Chicago Press, 1964.
Weinman, M. Pleasure in Aristotle’s Ethics. London: Continuum, 2007.
v. Aristotle’s Influence
Alwishah, A. and J. Hayes, editors. Aristotle and the Arabic Tradition. Cambridge University Press, 2015.
Anscombe, G.E.M. “Modern Moral Philosophy.” Philosophy, vol. 33, no. 124, 1958, pp. 1–19.
Arendt, H. The Human Condition. 2nd ed., University of Chicago Press, 1958.
Cartwright, N. and J. Pemberton. “Aristotelian Powers: Without Them, What Would Modern Science Do?” Powers and Capacities in Philosophy: The New Aristotelianism, edited by R. Groff and J. Greco, Routledge, 2013, pp. 93–112.
Franklin, J. “Aristotelianism in the Philosophy of Mathematics.” Studia Neoaristotelica, vol. 8, no. 1, 2011, pp. 3–15.
McInerny, R. Aquinas Against the Averroists: On There Being Only One Intellect. Purdue University Press, 2002.
Thompson, M. Life and Action: Elementary Structures of Practice and Practical Thought. Harvard University Press, 2008.
Umphrey, S. Natural Kinds and Genesis. Lanham: Lexington Books, 2016.
Author Information
Justin Humphreys
Email: jhh@sas.upenn.edu
University of Pennsylvania
U. S. A.
David Hume: Moral Philosophy
Although David Hume (1711-1776) is commonly known for his philosophical skepticism and empiricist theory of knowledge, he also made many important contributions to moral philosophy. Hume’s ethical thought grapples with questions about the relationship between morality and reason, the role of human emotion in thought and action, the nature of moral evaluation, human sociability, and what it means to live a virtuous life. Hume was a central figure in the Scottish Enlightenment, and his ethical thought variously influenced, was influenced by, and faced criticism from thinkers such as Shaftesbury (1671-1713), Francis Hutcheson (1694-1745), Adam Smith (1723-1790), and Thomas Reid (1710-1796). Hume’s ethical theory continues to be relevant for contemporary philosophers and psychologists interested in topics such as metaethics, the role of sympathy and empathy within moral evaluation and moral psychology, and virtue ethics.
Hume’s moral thought carves out numerous distinctive philosophical positions. He rejects the rationalist conception of morality whereby humans make moral evaluations, and understand right and wrong, through reason alone. In place of the rationalist view, Hume contends that moral evaluations depend significantly on sentiment or feeling. Specifically, it is because we have the requisite emotional capacities, in addition to our faculty of reason, that we can determine that some action is ethically wrong or that a person has a virtuous moral character. As such, Hume sees moral evaluations, like our evaluations of aesthetic beauty, as arising from the human faculty of taste. Furthermore, this process of moral evaluation relies significantly upon the human capacity for sympathy, or our ability to partake of the feelings, beliefs, and emotions of other people. Thus, for Hume there is a strong connection between morality and human sociability.
Hume’s philosophy is also known for a novel distinction between natural and artificial virtue. Regarding the latter, we find a sophisticated account of justice in which the rules that govern property, promising, and allegiance to government arise through complex processes of social interaction. Hume’s account of the natural virtues, such as kindness, benevolence, pride, and courage, is explained with rhetorically gripping and vivid illustrations. The picture of human excellence that Hume paints for the reader equally recognizes the human tendency to praise the qualities of the good friend and those of the inspiring leader. Finally, the overall orientation of Hume’s moral philosophy is naturalistic. Instead of basing morality on religious and divine sources of authority, Hume seeks an empirical theory of morality grounded on observation of human nature.
Hume’s moral philosophy is found primarily in Book 3 of A Treatise of Human Nature and his Enquiry Concerning the Principles of Morals, although further context and explanation of certain concepts discussed in those works can also be found in his Essays Moral, Political, and Literary. This article discusses each of the topics outlined above, with special attention given to the arguments he develops in the Treatise.
1. The Rejection of Moral Rationalism
Many philosophers have believed that the ability to reason marks a strict separation between humans and the rest of the natural world. Views of this sort can be found in thinkers such as Plato, Aristotle, Aquinas, Descartes, and Kant. One of the more philosophically radical aspects of Hume’s thought is his attack on this traditional conception. For example, he argues that the same evidence we have for thinking that human beings possess reason should also lead us to conclude that animals are rational (T 1.3.16, EHU 9). Hume also contends that the intellect, or “reason alone,” is relatively powerless on its own and needs the assistance of the emotions or “passions” to be effective. This conception of reason and emotion plays a critical role in Hume’s moral philosophy.
One of the foremost topics debated in the seventeenth and eighteenth centuries about the nature of morality was the relationship between reason and moral evaluation. Hume rejected a position known as moral rationalism. The moral rationalists held that ethical evaluations are made solely upon the basis of reason without the influence of the passions or feelings. The seventeenth- and eighteenth-century moral rationalists include Ralph Cudworth (1617-1688), Samuel Clarke (1675-1729), and John Balguy (1688-1748). Clarke, for instance, writes that morality consists in certain “necessary and eternal” relations (Clarke 1991[1706]: 192). He argues that it is “fit and reasonable in itself” that one should preserve the life of an innocent person and, likewise, unfit and unreasonable to take someone’s life without justification (Clarke 1991[1706]: 194). The very relationship between myself, a rational human being, and this other individual, another rational human being who is innocent of any wrongdoing, implies that it would be wrong of me to kill this person. The moral truths implied by such relations are just as evident as the truths implied by mathematical relations. It is just as irrational to (a) deny the wrongness of killing an innocent person as it would be to (b) deny that three multiplied by three is equal to nine (Clarke 1991[1706]: 194). As evidence, Clarke points out that both (a) and (b) enjoy nearly universal agreement. Thus, Clarke believes we should conclude that both (a) and (b) are self-evident propositions discoverable by reason alone. Consequently, it is in virtue of the human ability to reason that we make moral evaluations and recognize our moral duties.
a. The Influence Argument
Although Hume rejects the rationalist position, he does allow that reason has some role to play in moral evaluation. In the second Enquiry Hume argues that, although our determinations of virtue and vice are based upon an “internal sense or feeling,” reason is needed to ascertain the facts required to form an accurate view of the person being evaluated and, thus, is necessary for accurate moral evaluations (EPM 1.9). Hume’s claim, then, is more specific. He denies that moral evaluation is the product of “reason alone.” It is not solely because of the rational part of human nature that we can distinguish moral goodness from moral badness. It is not the case that “every rational being” can make moral evaluations (T 3.1.1.4). Purely rational beings that are devoid of feelings and emotion, if any such beings exist, could not understand the difference between virtue and vice. Something other than reason is required. Below is an outline of the argument Hume gives for this conclusion at T 3.1.1.16. Call this the “Influence Argument.”
1. Moral distinctions can influence human actions.
2. “Reason alone” cannot influence human actions.
3. Therefore, moral distinctions are not the product of “reason alone.”
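The logical form of this argument can be set out compactly. As an illustrative reconstruction (not Hume’s own formulation), let $M(x)$ abbreviate “x can influence human actions” and $R(x)$ abbreviate “x is a product of ‘reason alone’”:

\[
M(d), \qquad \forall x\,\big(R(x) \rightarrow \neg M(x)\big) \;\vdash\; \neg R(d)
\]

Here $d$ stands for moral distinctions. Premise (2) supplies the universal conditional, since whatever were a product of reason alone would inherit reason’s inability to influence action, and the conclusion follows by modus tollens.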
Let us begin by considering premise (1). Notice that premise (1) uses the term “moral distinctions.” By “moral distinction” Hume means evaluations that differentiate actions or character traits in terms of their moral qualities (T 3.1.1.3). Unlike the distinctions we make with our pure reasoning faculty, Hume claims moral distinctions can influence how we act. The claim that some action, X, is vicious can make us less likely to perform X, and the opposite in the case of virtue. Those who believe it is morally wrong to kill innocent people will, consequently, be less likely to kill innocent people. This does not mean moral evaluations motivate decisively. One might recognize that X is a moral duty, but still fail to do X for various reasons. Hume only claims that the recognition of moral right and wrong can motivate action. If moral distinctions were not practical in this sense, then it would be pointless to attempt to influence human behavior with moral rules (T 3.1.1.5).
Premise (2) requires a more extensive justification. Hume provides two separate arguments in support of (2), which have been termed by Rachel Cohon as the “Divide and Conquer Argument” and the “Representation Argument” (Cohon 2008). These arguments are discussed below.
b. The Divide and Conquer Argument
Hume reminds us that the justification for premise (2) of the Influence Argument was already established earlier at Treatise 2.3.3 in a section entitled “Of the influencing motives of the will.” Hume begins this section by observing that many believe humans act well by resisting the influence of our passions and following the demands of reason (T 2.3.3.1). For instance, in the Republic, Plato (427–347 B.C.E.) outlines a conception of the well-ordered soul in which the rational part rules over the soul’s spirited and appetitive parts. Or, consider someone who knows that eating another piece of cake is harmful to her health, and values her health, but still eats another piece of cake. Such situations are often characterized as letting passion or emotion defeat reason. Below is the argument that Hume uses to reject this conception.
1. Reason is either demonstrative or probable.
2. Demonstrative reason alone cannot influence the will (or influence human action).
3. Probable reason alone cannot influence the will (or influence human action).
4. Therefore, “reason alone” cannot influence the will (or influence human action).
This argument is referred to as the “Divide and Conquer Argument” because Hume divides reasoning into two types, and then demonstrates that neither type of reasoning can influence the human will by itself. From this, it follows that “reason alone” cannot influence the will.
The first type of reasoning Hume discusses is demonstrative reasoning that involves “abstract relations of ideas” (T 2.3.3.2). Consider demonstratively certain judgments such as “2+2=4” or “the interior angles of a triangle equal 180 degrees.” This type of reason cannot motivate action because our will is only influenced by what we believe has physical existence. Demonstrative reason, however, only acquaints us with abstract concepts (T 2.3.3.2). Using Hume’s example, mathematical demonstrations might provide a merchant with information about how much money she owes to another person. Yet, this information only matters because she has a desire to square her debt (T 2.3.3.2). It is this desire, not the demonstrative reasoning itself, which provides the motivational force.
Why can probable reasoning not have practical influence? Probable reasoning involves making inferences on the basis of experience (T 2.3.3.1). An example of this is the judgments we make of cause and effect. As Hume established earlier in the Treatise, our judgments of cause and effect involve recognizing the “constant conjunction” of certain objects as revealed through experience (see, for instance, T 1.3.6.15). Since probable reasoning can inform us of what actions have a “constant conjunction” with pleasure or pain, it might seem that probable reasoning could influence the will. However, the fundamental motivational force does not arise from our ability to infer the relation of cause and effect. Rather, the source of our motivation is the “impulse” to pursue pleasure and avoid pain. Thus, once again, reason simply plays the role of discovering how to satisfy our desires (T 2.3.3.3). For example, my belief that eating a certain fruit will cause good health seems capable of motivating me to eat that fruit (T 3.3.1.2). However, Hume argues that this causal belief must be accompanied with some passion, specifically the desire for good health, for it to move the will. We would not care about the fact that eating the fruit contributes to our health if health was not a desired goal. Thus, Hume sketches a picture in which the motivational force to pursue a goal always comes from passion, and reason merely informs us of the best means for achieving that goal (T 2.3.3.3).
Consequently, when we say that some passion is “unreasonable,” we mean either that the passion is founded upon a false belief or that passion impelled us to choose the wrong method for achieving our desired end (T 2.3.3.7). In this context Hume famously states that it is “not contrary to reason to prefer the destruction of the whole world to the scratching of my finger” (T 2.3.3.6). It can be easy to misunderstand Hume’s point here. Hume does not believe there is no basis for condemning the person who prioritizes scratching her finger. Hume’s point is simply that reason itself cannot distinguish between these choices. A being that felt completely indifferent toward both the suffering and well-being of other human beings would have no preference for what outcome results (EPM 6.4).
The second part of Hume’s thesis is that, because “reason alone” cannot motivate actions, there is no real conflict between reason and passion (T 2.3.3.1). The view that reason and passion can conflict misunderstands how each functions. Reason can only serve the ends determined by our passions. As Hume explains in another well-known quote, “Reason is, and ought only to be the slave of the passions” (T 2.3.3.4). Reason and passion have fundamentally different functions and, thus, cannot encroach upon one another. Why do we commonly describe succumbing to temptation as a failure to follow reason? Hume explains that the operations of the passions and reason often feel similar. Specifically, both the calm passions that direct us toward our long-term interest, as well as the operations of reason, exert themselves calmly (T 2.3.3.8). Thus, the person who possesses “strength of mind,” or what is commonly called “will power,” is not the individual whose reason conquers her passions. Instead, being strong-willed means having a will that is primarily influenced by calm instead of violent passions (T 2.3.3.10).
c. The Representation Argument
The second argument in support of premise (2) of the “Influence Argument” is found in both T 3.1.1 and T 2.3.3. This argument is commonly referred to as the “Representation Argument.” It is expressed most succinctly at T 3.1.1.9. The argument has two parts. The first part of the argument is outlined below.
1. That which is an object of reason must be capable of being evaluated as true or false (or be “truth-apt”).
2. That which is capable of being evaluated as true or false (or is “truth-apt”) must be capable of agreement (or disagreement) with some relation of ideas or matter of fact.
3. Therefore, that which can neither agree (nor disagree) with any relation of ideas or matter of fact cannot be an object of reason.
The first portion of the argument establishes what reason can (and cannot) accomplish. Premise (1) relies on the idea that the purpose of reason is to discover truth and falsehood. In fact, in an earlier Treatise section Hume describes truth as the “natural effect” of our reason (T 1.4.1.1). So, whatever is investigated or revealed through reason must be the sort of claim that it makes sense to evaluate as true or false. Philosophers call such claims “truth-apt.” What sorts of claims are truth-apt? Only those claims which can agree (or disagree) with some abstract relation of ideas or fact about existence. For instance, the claim that “the interior angles of a triangle add up to 180 degrees” agrees with the relation of ideas that makes up our concept of triangle. Thus, such a claim is true. The claim that “China is the most populated country on planet Earth” agrees with the empirical facts about world population and, thus, can also be described as true. Likewise, the claims that “the interior angles of a triangle add up to 200 degrees” or that “the United States is the most populated country on planet Earth” do not agree with the relevant ideas or existential facts. Yet, because it is appropriate to label each of these as false, both claims are still “truth-apt.” From this, it follows that something can only be an object of reason if it can agree or disagree with a relation of ideas or matter of fact.
Is that which motivates our actions “truth-apt” and, consequently, within the purview of reason? Hume addresses that point in the second part of the Representation Argument:
4. Human “passions, volitions, and actions” (PVAs) can neither agree (nor disagree) with any relation of ideas or matter of fact.
5. Therefore, PVAs cannot be objects of reason (or reason cannot produce action).
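The structure of the full argument can likewise be sketched formally (again an illustrative reconstruction, with predicate letters introduced only for this purpose). Let $O(x)$ mean “x is an object of reason,” $T(x)$ mean “x is truth-apt,” and $A(x)$ mean “x can agree or disagree with some relation of ideas or matter of fact.” Premises (1) and (2) give $\forall x\,(O(x) \rightarrow T(x))$ and $\forall x\,(T(x) \rightarrow A(x))$; premise (4) gives $\neg A(p)$ for any passion, volition, or action $p$. Chaining the two conditionals and contraposing yields $\neg O(p)$, which is conclusion (5).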
Why does the argument talk about “passions, volitions, and actions” (PVAs) in premise (4)? PVAs are the component parts of motivation. Passions cause desire or aversion toward a certain object, which results in the willing of certain actions. Thus, the argument hinges on premise (4)’s claim that PVAs can never agree or disagree with relations of ideas or matters of fact. Hume’s justification for this claim is again found at T 2.3.3.5 from the earlier Treatise section “Of the Influencing Motives of the Will.” Here Hume argues that for something to be truth-apt it must have a “representative quality” (T 2.3.3.5). That is, it must represent some type of external reality. The claim that “the interior angles of a triangle equal 180 degrees” represents a fact about our concept of a triangle. The claim that “China is the most populated country on planet Earth” represents a fact about the current population distribution of Earth. Hume argues the same cannot be said of passions such as anger. The feeling of anger, just like the feeling of being thirsty or being ill, is not meant to be a representation of some external object (T 2.3.3.5). Anger, of course, is a response to something external. For example, one might feel anger in response to a friend’s betrayal. However, this feeling of anger is not meant to represent my friend’s betrayal. A passion or emotion is simply a fact about the person who feels it. Consequently, since reason only deals with what is truth-apt, it follows that (5) PVAs cannot be objects of reason.
d. Hume and Contemporary Metaethics
Hume’s moral philosophy has continued to influence contemporary philosophical debates in metaethics. Consider the following three metaethical debates.
Moral Realism and Anti-Realism: Moral realism holds that moral statements, such as “lying is morally wrong,” describe mind-independent facts about the world. Moral anti-realism denies that moral statements describe mind-independent facts about the world.
Moral Cognitivism and Noncognitivism: Moral cognitivism holds that moral statements, such as “lying is morally wrong,” are capable of being evaluated as true or false (or are “truth-apt”). Moral noncognitivism denies that such statements can be evaluated as true or false (or can be “truth-apt”).
Moral Internalism and Externalism: Moral internalism holds that someone who recognizes that it is one’s moral obligation to perform X necessarily has at least some motive to perform X. Moral externalism holds that one can recognize that it is one’s moral obligation to perform X and simultaneously not have any motive to perform X.
While there is not just one “Humean” position on each of these debates, many contemporary meta-ethicists who see Hume as a precursor take a position that combines anti-realism, noncognitivism, and internalism. Much of the support for reading Hume as an anti-realist comes from consideration of his moral sense theory (which is examined in the next section). Evidence for an anti-realist reading of Hume is often found at T 3.1.1.26. Hume claims that, for any vicious action, the moral wrongness of the action “entirely escapes you, as long as you consider the object.” Instead, to encounter the moral wrongness you must “turn your reflexion into your own breast” (T 3.1.1.26). The wrongness of murder, taking Hume’s example, does not lie in the act itself as something that exists apart from the human mind. Rather, the wrongness of murder lies in how the observer reacts to the murder or, as we will see below, the painful sentiment that such an act produces in the observer.
The justification for reading Hume as an internalist comes primarily from the Influence Argument, which relies on the internalist idea that moral distinctions can, by themselves, influence the will and produce action. The claim that Hume is a noncognitivist is more controversial. Support for reading Hume as a noncognitivist is sometimes found in the so-called “is-ought” paragraph. There Hume warns us against deriving a conclusion that we “ought, or ought not” do something from the claim that something “is, and is not” the case (T 3.1.1.27). There is significant debate among Hume scholars about what Hume means to say in this passage. According to one interpretation, Hume is denying that it is appropriate to derive moral conclusions (such as “one should give to charity”) from any set of strictly factual or descriptive premises (such as “charity relieves suffering”). This is taken to imply support for noncognitivism by introducing a strict separation between facts (which are truth-apt) and values (which are not truth-apt).
Some have questioned the standard view of Hume as a noncognitivist. Hume does think (as seen in the Representation Argument) that the passions, which influence the will, are not truth-apt. Does the same hold for the moral distinctions themselves? Rachel Cohon has argued, to the contrary, that moral distinctions are evaluable as true or false (Cohon 2008). Specifically, they describe beliefs about what character traits produce pleasure and pain in human spectators. If this interpretation is correct, then Hume’s metaethics remains anti-realist (moral distinctions refer to facts about the minds of human observers), but it can also be cognitivist. That is because the claim that human observers feel pleasure in response to some character trait represents an external matter of fact and, thus, can be denominated true or false depending upon whether it represents this matter of fact accurately.
2. Hume’s Moral Sense Theory
Hume claims that if reason is not responsible for our ability to distinguish moral goodness from badness, then there must be some other capacity of human beings that enables us to make moral distinctions (T 3.1.1.4). Like his predecessors Shaftesbury (1671-1713) and Francis Hutcheson (1694-1745), Hume believes that moral distinctions are the product of a moral sense. In this respect, Hume is a moral sentimentalist. It is primarily in virtue of our ability to feel pleasure and pain in response to various traits of character, and not in virtue of our capacity of “reason alone,” that we can distinguish between virtue and vice. This section covers the major elements of Hume’s moral sense theory.
a. The Moral Sense
Moral sense theory holds, roughly, that moral distinctions are recognized through a process analogous to sense perception. Hume explains that virtue is that which causes pleasurable sensations of a specific type in an observer, while vice causes painful sensations of a specific type. While all moral approval is a sort of pleasurable sensation, this does not mean that all pleasurable sensations qualify as instances of moral approval. Just as the pleasure we feel in response to excellent music is different from the pleasure we derive from excellent wine, so the pleasure we derive from viewing a person’s character is different from the pleasure we derive from inanimate objects (T 3.1.2.4). So, moral approval is a specific type of pleasurable sensation, only felt in response to persons, with a particular phenomenological quality.
Along with the common experience of feeling pleasure in response to virtue and pain when confronted with vice (T 3.1.2.2), Hume also thinks this view follows from his rejection of moral rationalism. Everything in the mind, Hume argues, is either an impression or idea. Hume understands an impression to be the first, and most forceful, appearance of a sensation or feeling in the human mind. An idea, by contrast, is a less forceful copy of that initial impression that is preserved in memory (T 1.1.1.1). Hume holds that all reasoning involves comparing our ideas. This means that moral rationalism must hold that we arrive at an understanding of morality merely through a comparison of ideas (T 3.1.1.4). However, since Hume has shown that moral distinctions are not the product of reason alone, moral distinctions cannot be made merely through comparison of ideas. Therefore, if moral distinctions are not made by comparing ideas, they must be based upon our impressions or feelings.
Hume’s claim is not that virtue is an inherent quality of certain characters or actions, and that when we encounter a virtuous character we feel a pleasurable sensation that constitutes evidence of that inherent quality. If that were true, then the moral status of some character trait would be inferred from the fact that we are experiencing a pleasurable sensation. This would conflict with Hume’s anti-rationalism. Hume reiterates this point, stating that spectators “do not infer a character to be virtuous, because it pleases: But in feeling that it pleases [they] in effect feel that it is virtuous” (T 3.1.2.3). Because moral distinctions are not made through a comparison of ideas, Hume believes it is more accurate to say that morality is a matter of feeling rather than judgment (T 3.1.2.1). Since virtue and vice are not inherent properties of actions or persons, what constitutes the virtuousness (or viciousness) of some action or character must be found within the observer or spectator. When, for example, someone determines that some action or character trait is vicious, this just means that her (human) nature is constituted such that she responds to that action or character trait with a feeling of disapproval (T 3.1.1.26). One’s ability to see the act of murder, not merely as a cause of suffering and misery, but as morally wrong, depends upon the emotional capacity to feel a painful sentiment in response to this phenomenon. Thus, Hume claims that the quality of “vice entirely escapes you, as long as you consider the object” (T 3.1.1.26). Virtue and vice exist, in some sense, through the sentimental reactions that human observers have toward various “objects.”
This provides the basis for Hume’s comparison between moral evaluation and sense perception, which lies at the foundation of his moral sense theory. Just like the experiences of taste, smell, sight, hearing, and touch produced by our physical senses, virtue and vice exist in the minds of human observers instead of in the actions themselves (T 3.1.1.26). Here Hume appeals to the primary-secondary quality distinction. Sensory qualities and moral qualities are both observer-dependent. Just as there would be no appearance of color if there were no observers, so there would also be no such thing as virtue or vice without beings capable of feeling approval or disapproval in response to human actions. Likewise, a human being who lacked the required emotional capacities would be unable to understand what the rest of us mean when we say that some trait is virtuous or vicious. For instance, imagine a psychopath who has the necessary reasoning ability to understand the consequences of murder, but lacks aversion toward it and, thus, cannot determine or recognize its moral status. In fact, the presence of psychopathy, and the inability of psychopaths to understand moral judgments, is sometimes taken as an objection to moral rationalism.
Furthermore, our moral sense responds specifically to some “mental quality” (T 3.3.1.3) of another person. We can think of a “mental quality” as a disposition one has to act in certain ways or as a character trait. For example, when we approve of the courageous individual, we are approving of that person’s willingness to stand resolute in the face of danger. Consequently, actions can only be considered virtuous derivatively, as signs of another person’s mental dispositions and qualities (T 3.3.1.4). A single action, unlike the habits and dispositions that make up our character, is fleeting and may not accurately represent that character. Only settled character traits are sufficiently “durable” to determine our evaluations of others (T 3.3.1.5). For this reason, Hume’s ethical theory is sometimes seen as a form of virtue ethics.
b. The General Point of View
Hume posits an additional requirement that some sentiment must meet to qualify as a sentiment of moral approval (or disapproval). Imagine that a professor unfairly shows favor toward one student by giving her an “A” for sub-standard work. In this case, it is not difficult to imagine the student being pleased with the professor’s actions. However, if she were honest, that student would likely not think she was giving moral approval of the professor’s unfair grading. Instead, she is evaluating the influence the professor’s actions have upon her perceived self-interest. This case suggests that there is an important difference between the evaluations we make of other people based upon how they influence our interests, and the evaluations we make of others based upon their moral character.
This idea plays a significant role in Hume’s moral theory. Moral approval only occurs from a perspective in which the spectator does not take her self-interest into consideration. Rather, moral approval occurs from a more “general” vantage point (T 3.1.2.4). In the conclusion to the second Enquiry Hume makes this point by distinguishing the languages of morality and self-interest. When someone labels another “his enemy, his rival, his antagonist, his adversary,” he is evaluating from a self-interested point of view. By contrast, when someone labels another with moral terms like “vicious or odious or depraved,” she is inhabiting a general point of view where her self-interest is set aside (EPM 9.6). Speaking the language of morality, then, requires abstracting away from one’s personal perspective and considering the wider effects of the conduct under evaluation. This unbiased point of view is one aspect of what Hume refers to as the “general” (T 3.3.1.15) or “common” (T 3.3.1.30, EPM 9.6) point of view. Furthermore, he suggests that the ability to transcend our personal perspective, and adopt a general vantage point, ties human beings together as “the party of humankind against vice and disorder, its common enemy” (EPM 9.9). Thus, Hume’s theory of moral approval is related in important ways to his larger goal of demonstrating that moral life is an expression of human sociability.
The general vantage point from which moral evaluations are made does not just exclude considerations of self-interest. It also corrects for other factors that can distort our moral evaluations. For instance, adoption of the general point of view corrects our natural tendency to give greater praise to those who exist in close spatial-temporal proximity. Hume notes that we might feel a stronger degree of praise for our hardworking servant than we feel for the historical figure of Marcus Brutus (T 3.3.1.16). From an objective point of view, Brutus merits greater praise for his moral character. However, we are acquainted with our servant and frequently interact with him. Brutus, on the other hand, is only known to us through historical accounts. Temporal distance causes our immediate, natural feelings of praise for Brutus to be less intense than the approval we give to our servant. Yet, this variation is not reflected in our moral evaluations. We do not judge that our servant has a superior moral character, and we do not automatically conclude that those who live in our own country are morally superior to those living in foreign countries (T 3.3.1.14). So, Hume needs some explanation of why our considered moral evaluations do not match our immediate feelings.
Hume responds by explaining that, when judging the quality of someone’s character, we adopt a perspective that discounts our specific spatial-temporal location or any other special resemblance we might have with the person being evaluated. Hume tells us that this vantage point is one in which we consider the influence that the person in question has upon his or her contemporaries (T 3.3.3.2). When we evaluate Brutus’ character, we do not consider the influence that his qualities have upon us now. As a historical figure who no longer exists, Brutus’ virtuous character does not provide any present benefit. Instead, we evaluate Brutus’ character based upon the benefits it had for those who lived in his own time. We recognize that if we had lived in Brutus’ time and been his fellow Roman citizens, then we would express much greater praise and admiration for his character (T 3.3.1.16).
Hume identifies a second type of correction for which the general point of view is responsible. Hume observes that we have the capacity to praise someone whose character traits are widely beneficial, even when unfortunate external circumstances prevent those traits from being effective (T 3.3.1.19). For example, we might imagine a generous, kind-hearted individual whose generosity fails to make much of an impact on others because she is of modest means. Hume claims that, in these cases, our considered moral evaluation is not influenced by such external circumstances: “Virtue in rags is still virtue” (T 3.3.1.19). At the same time, we might be puzzled as to how this could be the case, since we naturally give stronger praise to the person whose good fortune enables her virtuous traits to produce actual benefits (T 3.3.1.21). Hume offers a two-fold response. First, because we know that (for instance) a generous character is often correlated with benefits to society, we establish a “general rule” that links the two together (T 3.3.1.20). Second, when we take up the general point of view, we ignore the obstacles of misfortune that prevent this virtuous person’s traits from achieving their intended goal (T 3.3.1.21). Just as we discount spatial-temporal proximity, so we also discount the influence of fortune when making moral evaluations of another’s character traits.
So, adopting the general point of view requires spectators to set aside a multitude of considerations: self-interest, demographic resemblance, spatial-temporal proximity, and the influence of fortune. What motivates us to adopt this vantage point? Hume explains that doing so enables us to discuss the evaluations we make of others. If we each evaluated from our personal perspective, then a character that garnered the highest praise from me might garner only mild praise from you. The general point of view, then, provides a common basis from which differently situated individuals can arrive at some common understanding of morality (T 3.3.1.15). Still, Hume notes that this practical solution may only regulate our language and public judgments of our peers. Our personal feelings often prove too entrenched. When our actual sentiments are too resistant to correction, we at least attempt to conform our language to the objective standard (T 3.3.1.16).
In addition to explaining why it is that we adopt the general point of view, one might also think that Hume owes us an explanation of why this perspective constitutes the standard of correctness for moral evaluation. In one place Hume states that the “corrections” we make to our sentiments from the general point of view are “alone regarded, when we pronounce in general concerning the degrees of vice and virtue” (T 3.3.1.21). Nine paragraphs later Hume again emphasizes that the sentiments we feel from the general point of view constitute the “standard of virtue and morality” (T 3.3.1.30). What gives the pronouncements we make from the general point of view this authoritative status?
Hume scholars are divided on this point. One possibility, developed by Geoffrey Sayre-McCord, is that adopting the general point of view enables us to avoid the practical conflicts that inevitably arise when we judge character traits from our individual perspectives (Sayre-McCord 1994: 213-220). Jacqueline Taylor, focusing primarily on the second Enquiry, argues that the normative authority of the general point of view derives from its emergence out of a process of social deliberation and negotiation requiring the virtues of good judgment (Taylor 2002). Rachel Cohon argues that evaluations issuing from the general point of view are those most likely to yield true ethical beliefs (Cohon 2008: 152-156). In a somewhat similar vein, Kate Abramson argues that the general point of view enables us to correctly determine whether some character trait enables its possessor to act properly within the purview of her relationships and social roles (Abramson 2008: 253). Finally, Phillip Reed argues that, to the contrary, the general point of view does not constitute Hume’s “standard of virtue” (Reed 2012).
3. Sympathy and Humanity
a. Sympathy
We have seen that, for Hume, a sentiment can qualify as a moral sentiment only if it is not the product of pure self-interest. This implies that human nature must possess some capacity to get outside of itself and take an interest in the fortunes and misfortunes of others. When making moral evaluations we approve of qualities that benefit the possessor and her associates, and disapprove of qualities that make the possessor harmful to herself or others (T 3.3.1.10). This requires that we can take pleasure in that which benefits complete strangers. Thus, moral evaluation would be impossible without the capacity to partake of the pleasure (or pain) of any being that shares our underlying human nature. Hume identifies “sympathy” as the capacity that makes moral evaluation possible by allowing us to take an interest in the public good (T 3.3.1.9). The idea that moral evaluation is based upon sympathy can also be found in the work of Hume’s contemporary Adam Smith (1723-1790), although Smith’s account of sympathy differs in important ways from Hume’s.
Because of the central role that sympathy plays in Hume’s moral theory, his account of sympathy deserves further attention. Hume tells us that sympathy is the human capacity to “receive” the feelings and beliefs of other people (T 2.1.11.2). That is, it is the process by which we come to experience what others are feeling and thinking. This process begins by forming an idea of what another person is experiencing. This idea might be formed through observing the effects of another’s feeling (T 2.1.11.3). For instance, from my observation that another person is smiling, and my prior knowledge that smiling is associated with happiness, I form an idea of the other’s happiness. My idea of another’s emotion can also be formed before the other person feels the emotion. This occurs through observing the usual causes of that emotion. Hume provides the example of someone who observes surgical instruments being prepared for a painful operation. He notes that this person would feel terrified for the person about to suffer through the operation even though the operation had not yet begun (T 3.3.1.7). This is because the observer has already established a mental association between surgical instruments and pain.
Since sympathy causes us to feel the sentiments of others, simply having an idea of another’s feeling is insufficient. That idea must be converted into something with more affective potency. Our idea of what another feels must be transformed into an impression (T 2.1.11.3). The reason this conversion is possible is that the only difference between impressions and ideas is the intensity with which they are felt in the mind (T 2.1.11.7). Recall that impressions are the most forceful and intense whereas ideas are merely “faint images” of our impressions (T 1.1.1.1). Hume identifies two facts about human nature which explain what causes our less vivacious idea of another’s passion to be converted into an impression and, notably, become the very feeling the other is experiencing (T 2.1.11.3). First, we always experience an impression of ourselves which is not surpassed in force, vivacity, and liveliness by any other impression. Second, because we have this lively impression of ourselves, Hume believes it follows that whatever is related to that impression must receive some share of that vivacity (T 2.1.11.4). From these points, it follows that our idea of another’s impression will be enlivened if that idea has some relation to ourselves.
Hume explains the relationship between our idea of another’s emotion and ourselves in terms of his more general conception of how the imagination produces associations of ideas. Hume understands the association of ideas as a “gentle force” that explains why certain mental perceptions repeatedly occur together. He identifies three ways in which ideas become associated: resemblance (the sharing of similar characteristics), contiguity (proximity in space or time), and causation (roughly, the constant conjunction of two ideas in which one idea precedes another in time) (T 1.1.4.1). Hume appeals to each of these associations to explain the relationship between our idea of another’s emotion and our impression of self (T 2.1.11.6). However, resemblance plays the most important role. Although individual humans differ from one another, there is also an underlying commonality or resemblance among all members of the human species (T 2.1.11.5). For example, when we form an idea of another’s happiness, we implicitly recognize that we ourselves are also capable of that same feeling. That idea of happiness, then, becomes related to ourselves and, consequently, receives some of the vivacity that is held by the impression of our self. In this way, our ideas of how others feel become converted into impressions and we “feel with” our fellow human beings.
Although sympathy makes it possible for us to care for others, even those with whom we have no close or immediate connection, Hume acknowledges that it does not do so in an entirely impartial or egalitarian manner. The strength of our sympathy is influenced both by the universal resemblance that exists among all human beings and by more parochial types of resemblance. We will sympathize more easily with those who share various demographic similarities such as language, culture, citizenship, or place of origin (T 2.1.11.5). Consequently, when the person we are sympathizing with shares these similarities we will form a stronger conception of their feelings, and when such similarities are absent our conception of their feelings will be comparatively weaker. Likewise, we will have stronger sympathy with those who live in our own city, state, country, or time than we will with those who are spatially or temporally distant. In fact, it is this aspect of sympathy that prompts Hume to introduce the general point of view (discussed above). It is our natural sympathy that causes us to give stronger praise to those who exist in closer spatial-temporal proximity, even though our considered moral evaluations do not exhibit such variation. Hume poses this point as an objection to his claim that our moral evaluations proceed from sympathy (T 3.3.1.14). Hume’s appeal to the general point of view allows him to respond to this objection: moral evaluations arise from sympathetic feelings that are corrected by the influence of the general point of view.
b. Humanity
While sympathy plays a crucial role in Hume’s moral theory as outlined in the Treatise, explicit mentions of sympathy are comparatively rare in the Enquiry. In place of the Treatise’s detailed description of sympathy, we find Hume appealing to the “principle of humanity” (EPM 9.6). He understands this as the human disposition that produces our common praise for that which benefits the public and common blame for that which harms the public (EPM 5.39). The principle of humanity explains why we prefer seeing things go well for our peers instead of seeing them go badly. It also explains why we would not hope to see our peers suffer if that suffering in no way benefited us or satisfied our resentment from a prior provocation (EPM 5.39). Like sympathy, then, humanity explains our concern for the well-being of others. However, Hume’s discussion of humanity in the Enquiry does not appeal (at least explicitly) to the cognitive mechanism that underlies his account of sympathy, and he even expresses skepticism about the possibility of explaining this mechanism. So, the Enquiry does not discuss how our idea of another’s pleasures and pains is converted into an impression. This does not necessarily mean that sympathy is absent from the Enquiry. For instance, in Enquiry Section V Hume describes having the feelings of others communicated to us (EPM 5.18) and details how sharing our sentiments in a social setting can strengthen our feelings (EPM 5.24, EPM 5.35).
As he did with sympathy in the Treatise, Hume argues that the principle of humanity makes moral evaluations possible. It is because we naturally approve of that which benefits society, and disapprove of that which harms society, that we see some character traits as virtuous and others as vicious. Hume’s justification for this claim follows from his rejection of the egoists (EPM 5.6). Here Hume has in mind those like Thomas Hobbes (1588-1679) and Bernard Mandeville (1670-1733), who each believed that our moral judgments are the product of self-interest. On this view, the qualities we consider virtuous are those that serve our interests, and the qualities we consider vicious are those that do not. Hume gives a variety of arguments against this position. He contends that egoism cannot explain why we praise the virtues of historical figures (EPM 5.7) or recognize the virtues of our enemies (EPM 5.8). If moral evaluations are not the product of self-interest, then Hume concludes that they must be caused by some principle which gives us real concern for others. This is the principle of humanity. Hume admits that the sentiments produced by this principle might often be unable to overpower the influence that self-interest has on our actions. However, this principle is strong enough to give us at least a “cool preference” for that which is beneficial to society, and it provides the foundation upon which we distinguish virtue from vice (EPM 9.4).
4. Hume’s Classification of the Virtues and the Standard of Virtue
Since Hume thinks virtuous qualities benefit society, while vicious qualities harm society, one might conclude that Hume should be placed within the utilitarian moral tradition. While Hume’s theory has utilitarian elements, he does not think evaluations of virtue and vice are based solely upon considerations of collective utility. Hume identifies four different “sources” of moral approval, or four different effects of character traits that produce pleasure in spectators (T 3.3.1.30). Hume generates these categories by combining two different types of benefit that traits can have (usefulness and immediate agreeability) with two different types of beneficiary that a trait can have (the possessor of the trait herself and other people) (EPM 9.1). Below is an outline of the four resulting sources of moral approval.
We praise traits that are useful to others. For example, justice (EPM 3.48) and benevolence (EPM 2.22).
We praise traits that are useful to the possessor of the trait. For example, discretion or caution (EPM 6.8), industry (EPM 6.10), frugality (EPM 6.11), and strength of mind (EPM 6.15).
We praise traits that are immediately agreeable to others. For example, good manners (EPM 8.1) and the ability to converse well (EPM 8.5).
We praise traits that are immediately agreeable to the possessor. For example, cheerfulness (EPM 7.2) and magnanimity (EPM 7.4-7.18).
What does Hume mean by “immediate agreeability”? Hume explains that immediately agreeable traits please (either the possessor or others) without “any further thought to the wider consequences that trait brings about” (EPM 8.1). Although being well-mannered has beneficial long-term consequences, Hume believes we also praise this trait because it is immediately pleasing to company. As we shall see below, this distinction implies that a trait can be praised for its immediate agreeability even if the trait has harmful consequences more broadly.
There is disagreement amongst Hume scholars about how this classification of virtue is related to Hume’s definition of what constitutes a virtue, or what is termed the “standard of virtue.” That is, what is the standard which determines whether some character trait counts as a virtue? The crux of this disagreement can be found in two definitions of virtue that Hume provides in the second Enquiry.
First Definition: “personal merit consists altogether in the possession of mental qualities, useful or agreeable to the person himself or to others” (EPM 9.1).
Second Definition: “It is the nature, and, indeed, the definition of virtue, that it is a quality of the mind agreeable to or approved of by every one who considers or contemplates it” (EPM 8.n50).
The first definition suggests that virtue is defined in terms of its usefulness or agreeableness. On this basis, we might interpret Hume as believing that a trait fails to qualify as a virtue if it is neither useful nor agreeable. This interpretation is also supported by places in the text where Hume criticizes approval of traits that fail to meet the standard of usefulness and agreeableness. One prominent example is his discussion of the religiously motivated “monkish virtues.” There he criticizes those who praise traits such as “[c]elibacy, fasting, penance, mortification, self-denial, humility, silence, solitude” on the grounds that these traits are neither useful to society nor agreeable to their possessors (EPM 9.3). The second definition, however, holds that what determines whether some character trait warrants the status of virtue is the ability of that trait to generate spectator approval. On this view, some trait is a virtue if it garners approval from the general point of view, and the sources of approval (usefulness and agreeability) simply describe those features of character traits that human beings find praiseworthy.
5. Justice and the Artificial Virtues
The four-fold classification of virtue discussed above deals with the features of character traits that attract our approval (or disapproval). In the Treatise, however, Hume’s moral theory is primarily organized around a distinction in the way we come to approve (or disapprove) of character traits. Hume tells us that some virtues are “artificial” whereas other virtues are “natural” (T 3.1.2.9). In this context, the natural-artificial distinction tracks whether the entity in question results from the plans or designs of human beings (T 3.1.2.9). On this definition, a tree would be natural whereas a table would be artificial. Unlike the former, the latter required some process of human invention and design. Hume believes that a similar type of distinction is present when we consider different types of virtue. There are natural virtues like benevolence, and there are artificial virtues like justice and respect for property. In addition to justice and property, Hume also classifies the keeping of promises (T 3.1.2.5), allegiance to government (T 3.1.2.8), laws of international relations (T 3.1.2.11), chastity (T 3.1.2.12), and good manners (T 3.1.2.12) as artificial virtues.
The designs that constitute the artificial virtues are social conventions or systems of cooperation. Hume describes the relationship between artificial virtues and their corresponding social conventions in different ways. The basic idea is that we would neither have any motive to act in accordance with the artificial virtues (T 3.2.1.17), nor would we approve of artificially virtuous behavior (T 3.2.1.1), without the relevant social conventions. No social scheme is needed for us to approve of an act of kindness. However, the very existence of people who respect property rights, and our approval of those who respect property rights, requires some set of conventions that specify rules regulating the possession of goods. As we will see, Hume believes the conventions of justice and property are based upon collective self-interest. In this way, Hume uses the artificial-natural virtue distinction to carve out a middle position in the debate between egoists (like Hobbes and Mandeville), who believe that morality is a product of self-interest, and moral sense theorists (like Shaftesbury and Hutcheson), who believe that our sense of virtue and vice is natural to human nature. The egoists are right that some virtues are the product of collective self-interest (the artificial virtues), but the moral sense theorists are also correct insofar as other virtues (the natural virtues) have no relation to self-interest.
a. The Circle Argument
In Treatise 3.2.1 Hume provides an argument for the claim that justice is an artificial virtue (T 3.2.1.1). Understanding this argument requires establishing three preliminary points. First, Hume uses the term “justice,” at least in this context, to refer narrowly to the rules that regulate property. So, his purpose here is to prove that the disposition to follow the rules of property is an artificial virtue. That is, it would make no sense either to approve of those who are just or to act justly without the appropriate social convention. Second, Hume uses the concept of a “mere regard to the virtue of the action” (T 3.2.1.4) or a “sense of morality or duty” (T 3.2.1.8). This article uses the term “sense of duty.” The sense of duty is a specific type of moral motivation whereby someone performs a virtuous action only because she feels it is her ethical obligation to do so. For instance, imagine that someone has a job interview and knows she can improve her chances of success by lying to the interviewers. She might still refrain from lying, not because this is what she desires, but because she feels it is her moral obligation. She has, thus, acted from a sense of duty.
Third, a crucial step in Hume’s argument involves showing that a sense of duty cannot be the “first virtuous motive” to justice (T 3.2.1.4). What does it mean for some motive to be the “first motive”? It is tempting to think that Hume uses the phrase “first motive” as a synonym for “original motive.” Original motives are naturally present in the “rude and more natural condition” of human beings prior to modern social norms, rules, and expectations (T 3.2.1.9). For example, parental affection provides an original motive to care for one’s children (T 3.2.1.5). As we will see, Hume does not believe that the sense of duty can be an original motive to justice. One can only act justly from a sense of duty after some process of education, training, or social conditioning (T 3.2.1.9). However, while Hume does believe that many first motives are original in human nature, it cannot be his position that all of them are. This is because he does not believe there is any original motive to act justly, but he does think there is a first motive to act justly. Therefore, it is best to understand Hume’s notion of the first motive to perform some action as whatever motive (whether original or arising from convention) first causes human beings to perform that action.
With these points in place, let us consider the basic structure of Hume’s reasoning. His fundamental claim is that there is no original motive that can serve as the first virtuous motive of just actions. That is, there is nothing in the original state of human nature, prior to the influence of social convention, that could first motivate someone to act justly. While in our present state a “sense of duty” can serve as a sufficient motive to act justly, human beings in our natural condition would be bewildered by such a notion (T 3.2.1.9). However, if no original motive can be found that first motivates justice, then it follows that justice must be an artificial virtue. This is implied by Hume’s definition of artificial virtue: if the first motive for some virtue is not an original motive, then that virtue must be artificial.
Against Hume, one might argue that human beings have a natural “sense of justice” and that this serves as an original motive for justice. Hume rejects this claim with an argument commonly referred to as the “Circle Argument.” The foundation of this argument is the previously discussed claim that when making a moral evaluation of an action, we are evaluating the motive, character trait, or disposition that produced that action (T 3.2.1.2). Hume points out that we often retract our blame of another person if we find out that they had the proper motive but were prevented from acting on it by unfortunate circumstances (T 3.2.1.3). Imagine a good-hearted individual who gives money to charity. Suppose also that, through no fault of her own, her donation fails to help anyone because the check was lost in the mail. In this case, Hume argues, we would still praise this person even though her donation was not beneficial. It is the willingness to help that garners our praise. Thus, the moral virtue of an action must derive completely from the virtuous motive that produces it.
Now, assume for the sake of argument that the first virtuous motive of some action is a sense of duty to perform that action. What would have to be the case for a sense of duty to be a virtuous motive that is worthy of praise? At minimum, it would have to be true that the action in question is already virtuous (T 3.2.1.4). It would make no sense to claim that there is a sense of duty to perform action X, but also hold that action X is not virtuous. Unfortunately, this brings us back to where we began. If action X is already virtuous prior to our feeling any sense of duty to perform it, then there must likewise already be some other virtuous motive that explains action X’s status as a virtue. Thus, since some other motive must already be able to motivate just actions, a sense of duty cannot be the first motive to justice. Therefore, our initial assumption causes us to “reason in a circle” (T 3.2.1.4) and, consequently, must be false. From this, it follows that an action cannot be virtuous unless there is already some motive in human nature to perform it other than our sense, developed later, that performing the action is what is morally right (T 3.2.1.7). The same, then, would hold for the virtue of justice. This does not mean that a sense of duty cannot motivate us to act justly (T 3.2.1.8), nor does it necessarily mean that a sense of duty cannot be a praiseworthy motive. Hume’s point is simply that a sense of duty cannot be what first motivates us to act virtuously.
Having dispensed with the claim that a sense of duty can be an original motive, Hume then considers (and rejects) three further candidates for an original motive that one might claim provides the first motive to justice. These are: (i) self-interest, (ii) concern for the public interest, and (iii) concern for the interests of the specific individual in question. Hume does not deny that each of these is an original motive in human nature. Instead, he argues that none of them can adequately account for the range of situations in which we think one is required to act justly. Hume notes that unconstrained self-interest causes injustice (T 3.2.1.10), that there will always be situations in which one can act unjustly without causing any serious harm to the public (T 3.2.1.11), and that there are situations in which the individual concerned will benefit from our acting unjustly toward her. For example, this individual could be a “profligate debauchee” who would only harm herself by keeping her possessions (T 3.2.1.13). Consequently, if there is no original motive in human nature that can produce just actions, it must be the case that justice is an artificial virtue.
b. The Origin of Justice
Thus far Hume has established that justice is an artificial virtue, but has still not identified the “first motive” of justice. Hume begins to address this point in the next Treatise section entitled “Of the origin of justice and property.” We will see, however, that Hume’s complete account of what motivates just behavior goes beyond his comments here. Hume begins his account of the origin of justice by distinguishing two questions.
Question 1: What causes human beings in their natural, uncultivated state to form conventions that specify property rights? That is, how do the conventions of justice arise?
Question 2: Once the conventions of justice are established, why do we consider it a virtue to follow the rules specified by those conventions? In other words, why is justice a virtue?
Answering Question 1 requires determining what it is about the “natural” human condition (prior to the establishment of modern, large-scale society) that motivates us to construct the specific rules, norms, and social expectations associated with justice. Hume does this by outlining an account of how natural human beings come to recognize the benefits of establishing and preserving practices of cooperation.
Hume begins by claiming that the human species has many needs and desires it is not naturally equipped to meet (T 3.2.2.2). Human beings can only remedy this deficiency through societal cooperation, which provides us with greater power and protection from harm than is possible in our natural state (T 3.2.2.3). However, natural humans must also become aware that societal cooperation is beneficial. Fortunately, even in our “wild uncultivated state,” we already have some experience of the benefits that are produced through cooperation. This is because the natural human desire to procreate and care for our children causes us to form family units (T 3.2.2.4). The benefits afforded by this smaller-scale cooperation provide natural humans with a preview of the benefits promised by larger-scale societal cooperation.
Unfortunately, while our experience with living together in family units shows us the benefits of cooperation, various obstacles remain to establishing it on a larger scale. One of these comes from familial life itself. The conventions of justice require us to treat others equally and impartially. Justice demands that we respect the property rights of those we love and care for just as we respect the property rights of those whom we do not know. Yet, family life only strengthens our natural partiality and makes us place greater importance on the interests of our family members. This threatens to undermine social cooperation (T 3.2.2.6). For this reason, Hume argues that we must establish a set of rules to regulate our natural selfishness and partiality. These rules, which constitute the conventions of justice, allow each person to keep and use whatever goods he or she acquires through labor and good fortune (T 3.2.2.9). Once these social norms are in place, it then becomes possible to use terms such as “property, right, [and] obligation” (T 3.2.2.11).
This account further supports Hume’s claim that justice is an artificial virtue. Justice remedies specific problems that human beings face in their natural state. If circumstances were such that those problems never arose, then the conventions of justice would be pointless. Certain background conditions must be in place for justice to originate. John Rawls (1921-2002) refers to these conditions as the “circumstances of justice” (Rawls 1971: 126n). The remedy of justice is required because the goods we acquire are vulnerable to being taken by others (T 3.2.2.7), resources are scarce (T 3.2.2.7), and human generosity is limited (T 3.2.2.6). Regarding scarcity and generosity, Hume explains that our circumstances lie at a mean between two extremes. If resources were so abundant that there were enough goods for everyone, then there would be no reason to worry about theft or establish property rights (EPM 3.3). On the other hand, if scarcity were too extreme, then we would be too desperate to concern ourselves with the demands of justice. Nobody worries about acting justly after a shipwreck (EPM 3.8). In addition, if humans were characterized by thoroughgoing generosity, then we would have no need to restrain the behavior of others through rules and restrictions (EPM 3.6). By contrast, if human beings were entirely self-interested, without any natural concern for others, then there could be no expectation that others would abide by any rules that are established (EPM 3.9). Justice is only possible because human life is not characterized by these extremes. If human beings were characterized by universal generosity, then justice could be replaced with “much nobler virtues, and more valuable blessings” (T 3.2.2.16).
Another innovative aspect of Hume’s theory is that he does not believe the conventions of justice are based upon promises or explicit agreements. This is because Hume believes that promises themselves only make sense if certain human conventions are already established (T 3.2.2.10). Thus, promises cannot be used to explain how human beings move from their natural state to establishing society and social cooperation. Instead, Hume explains that the conventions of justice arise from “a general sense of common interest” (T 3.2.2.10) and that cooperation can arise without explicit agreement. Once it is recognized that everyone’s interest is served when we all refrain from taking the goods of others, small-scale cooperation becomes possible (T 3.2.2.10). In addition to providing a sense of security, cooperation serves the common good by enhancing our productivity (T 3.2.5.8). Our appreciation of the benefits of social cooperation sharpens through a gradual process in which we steadily gain more confidence in the reliability of our peers (T 3.2.2.10). None of this requires an explicit agreement or promise. Hume draws a comparison with how two people rowing a boat can cooperate through an implicit convention without an explicit promise (T 3.2.2.10).
Although the system of norms that constitutes justice is highly advantageous and even necessary for the survival of society (T 3.2.2.22), this does not mean that society gains from each act of justice. An individual act of justice can make the public worse off than it would otherwise have been. For example, justice requires us to pay back a loan to a “seditious bigot” who will use the money destructively or wastefully (T 3.2.2.22). Artificial virtues differ from the natural virtues in this respect (T 3.3.1.12). This brings us to Hume’s second question about the virtue of justice. If not every act of justice is beneficial, then why do we praise obedience to the rules of justice? The problem is especially serious for large, modern societies. When human beings live in small groups, the harm and discord caused by each act of injustice are obvious. Yet, this is not the case in larger societies, where the connection between individual acts of justice and the common good is much weaker (T 3.2.2.24).
Consequently, Hume must explain why we continue to condemn injustice even after society has grown larger and more diffuse. On this point Hume primarily appeals to sympathy. Suppose we hear about some act of injustice that occurs in another city, state, or country, and harms individuals we have never met. While the bad effects of the injustice feel remote from our personal point of view, Hume notes that we can still sympathize with the person who suffers the injustice. Thus, even though the injustice has no direct influence upon us, we recognize that such conduct is harmful to those who associate with the unjust person (T 3.2.2.24). Sympathy allows our concern for justice to expand beyond the narrow bounds of the self-interested concerns that first produced the rules.
Thus, it is self-interest that motivates us to create the conventions of justice, and it is our capacity to sympathize with the public good that explains why we consider obedience to those conventions to be virtuous (T 3.2.2.24). Furthermore, we can now better understand how Hume answers the question of what first motivates us to act justly. Strictly speaking, the “first motive” to justice is self-interest. As noted previously, it was in the immediate interest of early humans living in small societies to comply with the conventions of justice because the integrity of their social union hinged upon absolute fidelity to justice. As we will see below, this is not the case in larger, modern societies. However, all that is required for some motive to be the first motive to justice is that it is what first gives humans some reason to act justly in all situations. The fact that this precise motive is no longer present in modern society does not prevent it from being what first motivates such behavior.
c. The Obligation of Justice and the Sensible Knave
Given that justice is originally founded upon considerations of self-interest, it may seem especially difficult to explain why we consider it wrong to commit injustice ourselves in larger modern societies where the stakes of non-compliance are much less severe. Here Hume believes that general rules bridge the gap. Hume uses general rules as an explanatory device at numerous points in the Treatise. For example, he explains our propensity to draw inferences based upon cause and effect through the influence of general rules (T 1.3.13.8). When we consistently observe events of one type followed by events of another type, we automatically apply a general rule that makes us expect the latter whenever we experience the former. Something similar occurs in the present context. Through sympathy, we find that sentiments of moral disapproval consistently accompany unjust behavior. Thus, through a general rule, we apply the same sort of evaluation to our own unjust actions (T 3.2.2.24).
Hume believes our willingness to abide by the conventions of justice is strengthened through other mechanisms as well. For instance, politicians encourage citizens to follow the rules of justice (T 3.2.2.25) and parents encourage compliance in their children (T 3.2.2.26). Thus, the praiseworthy motive that underlies compliance with justice in large-scale societies is, to a large extent, the product of social conditioning. This fact might make us suspicious. If justice is an artificial virtue, and if much of our motivation to follow its rules comes from social inculcation, then we might wonder whether these rules deserve our respect.
Hume recognizes this issue. In the Treatise he briefly appeals to the fact that having a good reputation is largely determined by whether we follow the rules of property (T 3.2.2.27). Nothing does more to establish a bad reputation than theft and an unwillingness to follow the rules of justice. Furthermore, Hume claims that our reputation in this regard requires that we see each rule of justice as having absolute authority and never succumb when we are tempted to act unjustly (T 3.2.2.27). Suppose Hume is right that our moral reputation hangs on our obedience to the rules of justice. Even if true, it is not obvious that this requires absolute obedience to these rules. What if I can act unjustly without being detected? What if I can act unjustly without causing any noticeable harm? Is there any reason to resist this temptation?
Hume takes up this question directly in the Enquiry, where he considers the possibility of a “sensible knave.” The knave recognizes that, in general, justice is crucial to the survival of society. Yet, the knave also recognizes that there will always be situations in which it is possible to act unjustly without harming the fabric of society. So, the knave follows the rules of justice when he must, but takes advantage of those situations where he knows he will not be caught (EPM 9.22). Hume responds that, even if the knave is never caught, he will lose out on a more valuable form of enjoyment. The knave forgoes the ability to reflect pleasurably upon his own conduct for the sake of material gain. Those who make this trade, Hume judges, are “the greatest dupes” (EPM 9.25). The person who has traded away the peace of mind that accompanies virtue in order to gain money, power, or fame has given up something more valuable for something much less valuable. The enjoyment of a virtuous character is incomparably greater than the enjoyment of whatever material gains can be attained through injustice. Thus, justice is desirable from the perspective of our own personal happiness and self-interest (EPM 9.14).
Hume admits it will be difficult to convince genuine knaves of this point. That is, it will be difficult to convince someone who does not already value the possession of a virtuous character that justice is worth the cost (EPM 9.23). Thus, Hume does not intend to provide a defense of justice that can appeal to any type of being or provide a reason to be just that makes sense to “all rational beings.” Instead, he provides a response that should appeal to those with mental dispositions typical of the human species. If the ability to enjoy a peaceful review of our conduct is nearly universal in the human species, then Hume will have provided a reason to act justly that can make some claim upon nearly every human being.
6. The Natural Virtues
After providing his Treatise account of the artificial virtues, Hume moves to a discussion of the natural virtues. Recall that the natural virtues, unlike the artificial virtues, garner praise without the influence of any human convention. Hume divides the natural virtues into two broad categories: those qualities that make a human great and those that make a human good (T 3.3.3.1). Hume consistently associates a cluster of qualities with each type of character. The great individual is confident, has a sense of her value, worth, or ability, and generally possesses qualities that set her apart from the average person. She is courageous, ambitious, able to overcome difficult obstacles, and proud of her achievements (EPM 7.4, EPM 7.10). By contrast, the good individual is characterized by gentle concern for others. This person has the types of traits that make someone a kind friend or generous philanthropist (EPM 2.1). Elsewhere, Hume explains the distinction between goodness and greatness in terms of the relationship we would want to have with the good person or the great person: “We cou’d wish to meet with the one character in a friend; the other character we wou’d be ambitious of in ourselves” (T 3.3.4.2).
Alexander of Macedonia exemplifies an extreme case of greatness. Hume recounts that when his general Parmenio advised him to accept the peace offering made by the Persian King Darius III, Alexander responded: “So would I too […] were I Parmenio” (EPM 7.5). There are certain constraints that apply to the average person that Alexander does not think apply to himself. This is consistent with the fact that the great individual has a strong sense of self-worth, self-confidence, and even a sense of superiority.
a. Pride and Greatness of Mind
Given the characteristics Hume associates with greatness, it should not be a surprise that Hume begins the Treatise section entitled “Of Greatness of Mind” by discussing pride (T 3.3.2). Those qualities and accomplishments that differentiate one from the average person are also those qualities most likely to make us proud and inspire confidence. Thus, Hume notes that pride forms a significant part of the hero’s character (T 3.3.2.13). However, Hume faces a problem—how can a virtuous character trait be based upon pride? He observes that we blame those who are too proud and praise those with enough modesty to recognize their own weaknesses (T 3.3.2.1). If we commonly find the pride of others disagreeable, then why do we praise the boldness, confidence, and prideful superiority of the great person?
Hume must explain when pride is praiseworthy, and when it is blameworthy. In part, Hume believes expressions of pride become disagreeable when the proud individual boasts about qualities she does not possess. This results from an interplay between the psychological mechanisms of sympathy and comparison. Sympathy enables us to adopt the feelings, sentiments, and opinions of other people and, consequently, participate in that which affects another person. Comparison is the human propensity for evaluating the situation of others in relation to ourselves. It is through comparison that we make judgments about the value of different states of affairs (T 3.3.2.4). Notice that sympathy and comparison are each a stance or attitude we can take toward those who are differently situated. For example, if another individual has secured a desirable job opportunity (superior to my own), then I might sympathize with the benefits she reaps from her employment and participate in her joy. Alternatively, I might also compare the benefits and opportunities her job affords with my own lesser situation. The result of this would be a painful feeling of inferiority or jealousy. Thus, each of these mechanisms has an opposite tendency (T 3.3.2.4).
What determines whether we will respond with sympathy or comparison to another’s situation? This depends upon how lively our idea of the other person’s situation is. Hume supports this by considering three different scenarios (T 3.3.2.5). First, imagine someone is sitting safely on a beach. Taken by itself, this fact would not provide much enjoyment or satisfaction. This individual might try to imagine some other people who are sailing through a dangerous storm to make her current safety more satisfying by comparison. Yet, since this is an acknowledged fiction, and Hume holds that ideas accompanied by belief have greater influence than mere fictions of the imagination (T 1.3.7.7), doing so would produce neither sympathy nor comparison. Second, imagine that the individual on the beach could see, far away in the distance, a ship sailing through a dangerous storm. In this case, the idea of the sailors’ precarious situation would be more lively. Consequently, the person on the beach could increase her satisfaction with her own situation by comparison. Yet, it is crucial that this idea of the suffering experienced by those in danger does not become too lively. In a third scenario Hume imagines that those in danger of shipwreck are so close to shore that the observer can see their expressions of fear, anxiety, and suffering. In this case, Hume holds that the idea would be too lively for comparison to operate. Instead, we would fully sympathize with the fear of the passengers and would not gain any comparative pleasure from their plight.
From this example, Hume derives the following principle: comparison occurs whenever our idea of another’s situation is lively enough to influence our passions, but not so lively that it causes us to sympathize (T 3.3.2.5). Hume uses this principle to explain why we are offended by those who are proud of exaggerated accomplishments. When someone boasts about some quality she does not actually have, Hume believes our conception of her pride has the intermediate liveliness that allows for comparison. Our conception of her pride gains liveliness from her presence directly before us (the enlivening relation of contiguity in space and time). Yet, because we do not believe her claims about her merit, our conception of her pride is not so lively that it causes us to sympathize (T 3.3.2.6). Consequently, we disapprove of someone’s exaggerated arrogance because it makes us compare ourselves unfavorably against the pretended achievements and accomplishments of the conceited individual (T 3.3.2.7).
Importantly, Hume does not categorically condemn pride. Justified pride in real accomplishments is both useful (T 3.3.2.8) and agreeable to the possessor (T 3.3.2.9). However, direct expressions of pride, even if based on legitimate accomplishments, still cause disapproval. Recall that sympathizing with another’s pride requires that we believe their self-evaluation matches their actual merit. Yet, it is difficult for us to have such a belief. This is because we know that people are likely to overestimate the value of their own traits and accomplishments. The consequence is that, as a “general rule,” we are skeptical that another person’s pride is well-founded, and we blame those who express pride directly (T 3.3.2.10). It is because boasting and outward expressions of pride cause discomfort by drawing us into unfavorable comparisons that we develop rules of good manners (T 3.3.2.10). Just as we create artificial rules of justice to preserve the harmony of society, so artificial rules of good manners preserve the harmony of our social interactions. Among these unspoken rules is a prohibition against directly boasting about our accomplishments in the presence of others. However, if others infer indirectly through our actions and comportment that we feel pride, then our pride can garner approval (T 3.3.2.10). Thus, Hume believes that pride can be a virtuous trait of character provided it is based upon actual accomplishments and is not overtly expressed (T 3.3.2.11).
Hume uses these points to combat attacks on the worth of pride from two different fronts. First, there are those “religious declaimers” who criticize pride and favor the Christian view that prizes humility (T 3.3.2.13). These religious moralists hold, not just that humility requires us to avoid directly boasting about our accomplishments, but that humility requires sincerely undervaluing our character and accomplishments (T 3.3.2.11). Here Hume seems to have in mind something like the view that we should remain aware of the comparative weakness of our own intellect in comparison to that of God. Or, perhaps, the view that proper worship of God requires that one humble oneself before the divine with an appropriate sense of relative worthlessness. Hume argues that such conceptions do not accurately represent the common regard we pay to pride (T 3.3.2.13).
The second criticism of pride comes from those who charge that the pride of the great individual often causes personal and social harm. The concern is that praising pride and self-assurance can overshadow the more valuable virtues of goodness. This can be seen most clearly in Hume’s discussion of military heroism. The military hero may cause great harm by leaving the destruction of cities and social unrest in his wake. Yet, despite this acknowledged harm, Hume claims that most people still find something “dazzling” about the military hero’s character that “elevates the mind” (T 3.3.2.15). The pride, confidence, and courage of the hero seem, at least temporarily, to blind us to the negative consequences of the hero’s traits. This pride is not communicated directly, but indirectly, through observing the hero overcoming daunting challenges. As a result, those who admire the military hero participate via sympathy in the pleasure the hero derives from his own pride and self-assured courage, and this causes them to overlook the negative consequences of his actions (T 3.3.2.15).
This passage provides additional confirmation that Hume’s ethics cannot be placed neatly into the utilitarian or consequentialist moral tradition. Just as the religious moralist fails to recognize the common praise given to warranted pride in one’s accomplishments, so the consequentialist fails to recognize the human tendency to praise certain traits of character without considering their social utility. Hume’s ethics reminds us of the value of human greatness. In this vein, he writes that the heroes of ancient times “have a grandeur and force of sentiment, which astonishes our narrow souls, and is rashly rejected as extravagant and supernatural” (EPM 7.17). Likewise, Hume contends that if the ancients could see the extent to which virtues like justice and humanity predominate in modern times, they would consider them “romantic and incredible” (EPM 7.18). Hume’s ethical theory attempts to give proper credit to the qualities of greatness prized by the ancients, as well as the qualities of goodness emphasized by the moderns.
b. Goodness, Benevolence, and the Narrow Circle
Hume turns to a discussion of goodness in a Treatise section entitled “Of Goodness and Benevolence.” Under the heading of “goodness,” Hume lists the following traits: “generosity, humanity, compassion, gratitude, friendship, fidelity, zeal, disinterestedness, liberality, and all those other qualities, which form the character of the good and benevolent” (T 3.3.3.3). Again, these traits are united by their tendency to make us considerate friends, generous philanthropists, and attentive caregivers.
Hume explains that we praise such qualities both because of their tendency to promote the good of society and because of their immediate agreeability to those who possess them. Generosity, of course, is socially useful insofar as it benefits other people. Hume also sees the gentle virtues of goodness as correctives to the destructive excesses of greatness, ambition, and courage (T 3.3.3.4). A complication here is that evaluating another’s generosity depends significantly upon the scope of beneficiaries we take into consideration. Praise for socially useful traits comes from sympathizing with the pleasure that is caused to those who benefit from them. How far should our sympathy extend when making this evaluation? How wide is the scope of potential beneficiaries we must consider when judging whether someone is generous or selfish? For example, if we interpret this scope more narrowly, then we might think that the person who takes good care of her children, helps her friends in need, and pushes for positive change in local politics exhibits admirable generosity with her time, energy, and attention. Contrastingly, if we interpret the scope more expansively, then the fact that she fails to make any positive impact on the many people who are suffering all over the world will count against her.
Hume answers that when judging another’s generosity, because we do not expect “impossibilities” from human nature, we limit our view to the agent’s “narrow circle” (T 3.3.3.2). Broadly, Hume’s claim is that we limit our focus to those people that the agent can reasonably be expected to influence. A more detailed explanation of this point requires answering two further questions. First, what is the “impossibility” we do not expect of others? Second, just how “narrow” is the “narrow circle” that Hume believes we focus on when evaluating generosity?
Let us begin with the first question. Given Hume’s statement that recognition of the “impossibility” comes from our knowledge of human nature (T 3.3.3.2), we might think that Hume is making a claim about the naturally confined altruism of human beings. We do not expect that the generous person will be beneficial to those who live far away because human beings rarely concern themselves with those who are spatially or temporally distant or with whom they infrequently interact (T 3.3.3.2). This reading fits naturally with Hume’s previously discussed claim that the strength of sympathy is influenced by our relation to the person sympathized with. It also coheres well with Hume’s claim, emphasized in his discussion of the “circumstances of justice,” that human beings are naturally selfish (although not completely selfish).
An alternative reading, however, holds that the “impossibility” Hume identifies is not primarily the human inability to care about distant strangers. Hume sometimes discusses the possibility of “extensive sympathy” that enables us to care about those who are distant and unrelated (T 3.3.6.3). This suggests Hume might have some other sort of “impossibility” in mind. One possibility would be the “impossibility” of undertaking effective action outside one’s “narrow circle.” In support of this reading, Hume mentions being “serviceable and useful within one’s sphere” (T 3.3.3.2). Perhaps Hume’s point is just that, given human motivational structure and the practical realities of human life, it is unreasonable to expect someone to have a significant impact beyond the sphere of her daily interactions. We should note, however, that the practical boundaries to acting effectively outside one’s “narrow circle” are significantly more relaxed today than they were in Hume’s time.
Moving to the second question, how we understand the “impossibility” of expecting benevolence outside of one’s “narrow circle” may depend upon just how tightly the boundaries of the “narrow circle” are drawn. Many of the ways Hume refers to the agent’s proper sphere of influence suggest he did not think of it as simply a tightly bound group of personal acquaintances and close relations. In a few passages Hume suggests that we consider all those who have “any” connection or association with the agent (T 3.3.1.18; T 3.3.1.30; T 3.3.3.2). Each of these passages leaves open the possibility that the agent’s “sphere” may be much more expansive than the phrase “narrow circle” would immediately suggest.
The proper sphere of influence may also depend upon the role, position, and relationships that the person in question inhabits. In one place, Hume claims that a perfect moral character is one that is not deficient in its relationships with others (T 3.3.3.9). In the second Enquiry Hume imagines a virtuous individual, Cleanthes, whose excellent character is evidenced by the fact that his qualities enable him to perform all his various personal, social, and professional roles (EPM 9.2). Thus, how “narrow,” or expansive, one’s circle is may depend upon the extent to which that person’s attachments and position make her conduct matter to others. For example, when evaluating the character traits of an elected public official we would consider a wider sphere of influence than we would when considering the same traits in most private citizens.
Benevolence is not only praised for its utility to others. Hume also discusses how it is immediately agreeable to the benevolent individual herself. Immediate agreeability is a feature of all emotions associated with love, just as immediate disagreeability is a feature of all emotions associated with hatred (T 3.3.3.4). Mirroring his discussion of military heroism, Hume points out that we cannot help but praise benevolence, generosity, and humanity even when they are excessive or counter-productive (T 3.3.3.6). We say that someone is “too good” as a way of laying “kind” blame upon her for a harmful act done with good-hearted intentions (EPM 7.22). Thus, the virtue of benevolence is praised, at least to some extent, in all its forms (T 3.3.3.6; EPM 2.5). However, Hume notes that we react much more harshly to excesses of anger. While not all forms of anger should be criticized (T 3.3.3.7), excessive anger or cruelty is the worst vice (T 3.3.3.8). Whereas cruelty is both immediately disagreeable and harmful, the harms of excessive benevolence can at least be compensated for by its inherent agreeability.
c. Natural Abilities
Hume’s ethics is based upon the idea that virtues are mental traits of persons that garner praise. The resulting “catalogue of virtues” (T 3.3.4.2), then, paints a portrait of what human beings believe to be the ideal member of their species. One might argue that this approach to ethics is fundamentally flawed because a mental trait can garner praise without being a moral quality. For example, the rare ability to learn and understand complex concepts is often seen as a natural talent. Such talent is admirable, but is it a moral virtue? Does it not make more sense to feel pity for someone who lacks some natural ability instead of blaming her for failing her moral duty?
Hume’s position is that there is not a significant difference between the supposed categories of moral virtue and natural ability. To understand his view, we need to answer the following question: why must a virtuous trait be a mental quality or disposition? It is not because other types of traits fail to garner the approval of spectators. Hume discusses our approval of sex appeal (T 3.3.5.2), physical fitness (T 3.3.5.3), and health (T 3.3.5.4). He also recognizes that the same principle of sympathy that produces approval of virtue also produces our approval of these physical attributes and our admiration for the wealthy (T 3.3.5.6). Instead, the reason virtue is limited to mental qualities is that virtue is supposed to constitute personal merit, or the set of qualities, dispositions, and characteristics that we specifically admire in persons (EPM 1.10). The implication, then, is that the qualities of the mind constitute who we are as persons. So, while Hume does not deny that there is such a thing as bodily merit, he does not see bodily merit as falling within the proper scope of moral philosophy.
If the “catalogue of virtues” is a list of the mental traits we admire in persons, then the catalogue must include certain qualities not normally placed in the category of moral virtue and vice. Common usage of the terms “virtue” and “vice” is narrower than the set of those qualities that we find admirable in persons (EPM App. 4.1). For example, it is common to regard an extraordinary genius as someone with an exceptional talent (rather than a virtue), or a person especially lacking in common sense as having some type of defect (rather than a vice). Despite this linguistic convention, Hume emphasizes that intelligence and common sense are still mental qualities that we admire in persons. Consequently, Hume states that he will leave it to the “grammarians” to decide where to draw the line between virtue, talent, and natural ability (T 3.3.4.4; EPM App. 4.1). It is not a distinction Hume believes is philosophically important since, regardless of precisely where the line is drawn, natural abilities like understanding and intelligence are undoubtedly characteristics we praise in persons. Hume quips that nobody, no matter how “good-natured” and “honest,” could be considered virtuous if he is an “egregious blockhead” (EPM App. 4.2).
Hume faced criticism from contemporaries on this point. For example, James Beattie (1735-1803) argued that, while it is entirely appropriate to blame someone for failing to act with generosity or justice, it would be entirely inappropriate to blame someone for lacking beauty or intelligence (Beattie 1773: 294). Beattie holds that a quality can only be considered a moral virtue if it is within our control to develop it or, at least, to act in ways that are consistent with it. Hume anticipates this objection. He agrees that it would be inappropriate to blame someone for a natural lack of intelligence. Yet, he denies that this shows that natural abilities such as intelligence should not be considered part of personal merit. The reason we do not blame someone for their natural defects is that doing so would be pointless. We blame the person who is unjust or unkind because these behavior patterns and dispositions can be changed through social pressure. However, we cannot shame someone into being more intelligent (T 3.3.4.4). Yet we still think a penetrating mind is a quality possessed by the ideal person. So, while those who lack some natural ability are not to blame, this lack still influences our evaluation of their personal merit.
This issue is important for the overall plausibility of Hume’s account of the natural virtues. Specifically, the question of natural abilities has an important connection with the role greatness should play in the catalogue of virtue. Beattie claims that he wants nothing to do with the term “great man,” because the person who possesses the natural abilities of Hume’s “great man” is better able to cause destruction and harm. Here we should recall Hume’s description of the military hero. For this reason, Beattie holds that virtue is concerned with the qualities of the “good man,” which can be acquired by anyone and tend to the good of society (Beattie 1773: 296). If Beattie is correct that the qualities of greatness are natural abilities, then Hume’s attempt to include both goodness and greatness within the catalogue of virtue requires him to provide a satisfactory defense of this point.
7. References and Further Reading
a. Hume’s Works
Hume, David (2007 [1739-1740]) A Treatise of Human Nature: A Critical Edition, ed. David Fate Norton and Mary J. Norton. Oxford: Clarendon Press.
Cited in text as “T” followed by book, part, section, and paragraph numbers.
Hume, David (2000 [1748]) An Enquiry concerning Human Understanding: A Critical Edition, ed. Tom L. Beauchamp. Oxford: Clarendon Press.
Cited in text as “EHU” followed by section and paragraph.
Hume, David (1998 [1751]) An Enquiry concerning the Principles of Morals: A Critical Edition, ed. Tom L. Beauchamp. Oxford: Clarendon Press.
Cited in text as “EPM” followed by section and paragraph.
Hume, David (1987) Essays Moral, Political, and Literary, ed. Eugene F. Miller, revised edition. Indianapolis: Liberty Fund.
Cited in text as “EMPL” followed by the page number.
b. Further Reading
Baier, Annette (1991) A Progress of Sentiments. Cambridge: Harvard University Press.
An account of the Treatise that emphasizes the continuity between Hume’s ethics and his epistemology, metaphysics, and skepticism.
Botros, Sophie (2006) Hume, Reason, and Morality: A Legacy of Contradiction. London: Routledge.
Focuses on Hume’s theory of motivation and his arguments against the moral rationalists, and develops an account of why these arguments are still relevant to contemporary metaethical debates.
Bricke, John (1996) Mind and Morality: An Examination of Hume’s Moral Psychology. New York: Oxford University Press.
Discusses Hume’s theory of agency and the will, and defends a noncognitivist interpretation of Hume on moral evaluation.
Cohon, Rachel (2008) Hume’s Morality: Feeling and Fabrication. New York: Oxford University Press.
Argues against “standard” views of Hume’s moral philosophy, maintaining that Hume’s position is both non-realist and cognitivist. Also includes novel and influential interpretations of the artificial virtues.
Darwall, Stephen (1995) The British Moralists and the Internal ‘Ought.’ Cambridge: Cambridge University Press.
Places Hume’s theory in its historical context and situates Hume as a member of an empirical, naturalist tradition in ethics alongside thinkers such as Hobbes, Locke, and Hutcheson.
Gill, Michael (2006) The British Moralists on Human Nature and the Birth of Secular Ethics. Cambridge: Cambridge University Press.
Provides further historical context for Hume’s place within seventeenth and eighteenth-century moral philosophy with a particular focus on the way in which the British moralists founded morality on human nature and disentangled morality from divine and religious sources.
Harrison, Jonathan (1976) Hume’s Moral Epistemology. Oxford: Clarendon Press.
Harrison, Jonathan (1981) Hume’s Theory of Justice. Oxford: Clarendon Press.
Each of these works provides a detailed, textual, and critical commentary on the major arguments Hume puts forward in service of his metaethical views and his conception of justice.
Herdt, Jennifer (1997) Religion and Faction in Hume’s Moral Philosophy. Cambridge: Cambridge University Press.
An account of sympathy that focuses on its connection to human sociability and on sympathy’s tendency to allow human beings to overcome faction and division.
Mackie, J.L. (1980) Hume’s Moral Theory. London: Routledge.
Situates Hume’s moral theory within the context of his predecessors and successors and provides critical discussion of the main doctrines of Hume’s ethical thought: his anti-rationalism and sentimentalism, along with a detailed discussion and critique of Hume’s distinction between artificial and natural virtues.
Mercer, Philip. (1972) Sympathy and Ethics: A Study of the Relationship between Sympathy and Morality with Special Reference to Hume’s Treatise. Oxford: Clarendon Press.
Provides critical, detailed commentary on Hume’s account of sympathy and its relationship to his moral philosophy.
Norton, David Fate (1982) David Hume: Common-Sense Moralist, Sceptical Metaphysician. Princeton: Princeton University Press.
Discusses the relation between Hume’s epistemology and ethics. Puts forward the view that Hume was only skeptical regarding the former, but was a realist about morality.
Reed, Philip and Rico Vitz (eds.) (2018) Hume’s Moral Philosophy and Contemporary Psychology. New York: Routledge.
A collection of essays that discusses the relevance of Hume’s moral philosophy for a wide array of topics in psychology, including mental illness, the situationist critique of virtue ethics, character development, sympathy, and the methodology of Hume’s science of human nature.
Swanton, Christine (2015) The Virtue Ethics of Hume and Nietzsche. Malden, MA: Wiley Blackwell.
Argues that Hume should be placed within the tradition of virtue ethics. Includes discussion of how a virtue theoretic interpretation can be reconciled with his rejection of rationalism and his sentimentalism, as well as the problem of why justice is a virtue.
c. Other Works Cited
Abramson, Kate (2008) “Sympathy and Hume’s Spectator-centered Theory of Virtue.” In Elizabeth Radcliffe (ed.), A Companion to Hume. Malden, MA: Blackwell Publishing.
Beattie, James (1773) An Essay on the Nature and Immutability of Truth, in Opposition to Sophistry and Scepticism, 3rd edition. Dublin. Eighteenth Century Collections Online, Gale.
Clarke, Samuel (1991 [1706]) A Discourse of Natural Religion. Indianapolis: Hackett Publishing Company.
Rawls, John (1971) A Theory of Justice. Cambridge: Harvard University Press.
Reed, Philip (2012) “What’s Wrong with Monkish Virtues? Hume on the Standard of Virtue.” History of Philosophy Quarterly 29.1: 39-53.
Sayre-McCord, Geoffrey. (1994) “On Why Hume’s ‘General Point of View’ Isn’t Ideal–and Shouldn’t Be.” Social Philosophy and Policy 11.1: 202-228.
Taylor, Jacqueline (2002) “Hume on the Standard of Virtue.” The Journal of Ethics 6: 43-62.
Author Information
Ryan Pollock
Email: pollocrc@gmail.com
Grand Valley State University
U. S. A.